Hey guys, hope you are all doing great.

I have always found regular expressions (regex) quite amazing. You can solve many string-related problems with a little pattern magic.
Our brains are very good at spotting patterns in day-to-day life. Because these are visual clues, we tend to catch and analyse them well, and we make very few mistakes when dealing with them.
Say you have a pattern of concentric circles:


You can see the pattern here: the smallest circle grows by some radius, say r, to give the next circle, and so on.
In geometry, this can be generalized as a family of circles:
x^2 + y^2 + 2gx + 2fy + c' = 0
Every circle in the family has the same centre (-g, -f); only the constant c' changes from one circle to the next.
This same thing applies to string patterns or character arrays.
For Python, we have a regex module called re.

Below are a few functions that you will find useful in many applications.

Syntax Definition Examples
re.search(regex,str) returns a match object if the pattern is found anywhere in the string, else None.

import re

pika = re.search(r"\w+@\w+\.com", "from pikachu@pokemon.com address")

if pika:
    print("yes")
    print(pika.group())  # → pikachu@pokemon.com
else:
    print("no")


re.match(regex,str) is similar to re.search(), but the match must start at the beginning of the string.

goku = re.match('ha', 'kamehameha')  # None: the string starts with 'ka', not 'ha'

if goku is None:
    print("no match")   # this branch runs
else:
    print("yes match")
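
To see the difference side by side, here is a small sketch that runs the same pattern through re.match and re.search. The variable names are made up for illustration, and re.fullmatch is thrown in on the assumption that you are on Python 3.4 or later.

```python
import re

text = "kamehameha"

# re.match only tries at position 0, where the text is 'ka', so it fails.
at_start = re.match(r"ha", text)

# re.search scans forward and stops at the first 'ha' it finds.
anywhere = re.search(r"ha", text)

# re.fullmatch succeeds only if the whole string fits the pattern.
whole = re.fullmatch(r"[a-z]+", text)

print(at_start)            # None
print(anywhere.span())     # (4, 6)
print(whole is not None)   # True
```

If you only care whether the pattern appears somewhere, re.search is usually the one you want.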


re.split(regex,str) splits the string at every match and returns a list. If the pattern contains capturing groups, the captured text is included in the list as well.

print(re.split(r' +', 'Clark   Kent  is Superman'))
# output: ['Clark', 'Kent', 'is', 'Superman']

print(re.split(r'( +)(@+)', 'what   @@do  @@you @@think'))
# output: ['what', '   ', '@@', 'do', '  ', '@@', 'you', ' ', '@@', 'think']

print(re.split(r' ', 'a b c d e', maxsplit=2))
# output: ['a', 'b', 'c d e']


re.findall(regex,str) returns a list of all non-overlapping matches. If the pattern has more than one group, each item is a tuple of the groups.

print(re.findall(r'( +)(@+)', 'what   @@@do  @@you @think'))
# output: [('   ', '@@@'), ('  ', '@@'), (' ', '@')]
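
What findall returns depends on how many groups the pattern has. The sketch below reuses the same string with no group and with one group; the variable names are only for illustration.

```python
import re

text = "what   @@@do  @@you @think"

# No groups: each item is the whole matched text.
no_groups = re.findall(r" +@+", text)

# Exactly one group: each item is just that group's text.
one_group = re.findall(r" +(@+)", text)

print(no_groups)   # ['   @@@', '  @@', ' @']
print(one_group)   # ['@@@', '@@', '@']
```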


re.finditer(…) is similar to re.findall(), but returns an iterator of match objects.

for matched in re.finditer(r'(\w+)', 'where   are  the avengers'):
    print(matched.group())       # prints each word on its own line
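
Because finditer hands you match objects rather than plain strings, you can also ask each match where it was found. A minimal sketch, with the list name chosen just for this example:

```python
import re

text = "where   are  the avengers"

# Collect each word together with its start and end index in the text.
words = [(m.group(), m.start(), m.end()) for m in re.finditer(r"\w+", text)]

for word, start, end in words:
    print(word, start, end)
# where 0 5
# are 8 11
# the 13 16
# avengers 17 25
```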


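re.sub(regex,replacement,str) does replacement and returns the new string. The replacement can be a string or a function that receives the match object and returns the replacement text.

def ff(pika):
    if pika.group(0) == "bulba":
        return "vena"
    elif pika.group(0) == "mender":
        return "izard"
    else:
        return pika.group(0)  # leave any other match unchanged

# The pattern must match the whole piece that ff() checks for.
print(re.sub(r"bulba|mender", ff, "bulbasaur")) # venasaur
print(re.sub(r"bulba|mender", ff, "charmender")) # charizard
print(re.sub(r"bulba|mender", ff, "geek")) # geek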


re.subn(…) is similar to re.sub(), but returns a tuple: the first element is the new string, the second is the number of replacements made.

re.subn(r'\w+', 'Try', 'Cry Cry, Cry and Cry Again!', count=3)
# ('Try Try, Try and Cry Again!', 3)   count=3 stops after the first three words

re.escape(str) adds backslashes to the special characters in a string so that it can be fed to regex as a literal pattern. Returns the new string.

re.escape('Lets meet spider man?')
# output: 'Lets\\ meet\\ spider\\ man\\?'   (printed, it reads Lets\ meet\ spider\ man\?)
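
Here is a rough sketch of why escaping matters when the text you are searching for contains regex symbols. The strings and variable names are hypothetical, picked only for the demo.

```python
import re

# Hypothetical literal text that happens to end with a regex metacharacter.
literal = "spider man?"
text = "Lets meet spider man? Or spider ma"

# Unescaped, '?' makes the preceding 'n' optional, so 'spider ma' matches too.
unescaped = re.findall(literal, text)

# Escaped, only the exact literal text matches.
escaped = re.findall(re.escape(literal), text)

print(unescaped)   # ['spider man', 'spider ma']
print(escaped)     # ['spider man?']
```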

To form a regular expression, you need a bit of creativity and some foresight about the outcome.
You also need to know the regex symbols and rules, and have them at your fingertips, to create an efficient and effective pattern. To help you out, here is a quick reference:

Python Regex Jump-starter:
Regular Expression Basics
. Any character except newline
a The character a
ab The string ab
a|b a or b
a* 0 or more a's
\ Escapes a special character
Regular Expression Quantifiers
* 0 or more
+ 1 or more
? 0 or 1
{2} Exactly 2
{2,5} Between 2 and 5
{2,} 2 or more
{,5} Up to 5
Default is greedy. Append ? for reluctant.
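
Greedy versus reluctant is easiest to see on a small example. This is only a sketch: the tag-like sample text is made up, and it is not meant as a way to parse real HTML.

```python
import re

text = "<b>bold</b> and <i>italic</i>"

# Greedy: .+ grabs as much as it can, so one match spans the whole string.
greedy = re.findall(r"<.+>", text)

# Reluctant: .+? stops at the first '>' it can, giving one match per tag.
reluctant = re.findall(r"<.+?>", text)

# Counted quantifier: between 2 and 3 consecutive a's.
counted = re.findall(r"a{2,3}", "a aa aaaa")

print(greedy)      # ['<b>bold</b> and <i>italic</i>']
print(reluctant)   # ['<b>', '</b>', '<i>', '</i>']
print(counted)     # ['aa', 'aaa']
```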
Regular Expression Groups
(...) Capturing group
(?P<Y>...) Capturing group named Y
(?:...) Non-capturing group
\Y Match the Y'th captured group
(?P=Y) Match the named group Y
(?#...) Comment
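
A quick sketch of the group syntax in action. The group names (user, host, word) and the sample strings are my own picks for illustration.

```python
import re

# Named groups: (?P<name>...) captures a part of the match and labels it.
m = re.search(r"(?P<user>\w+)@(?P<host>\w+)\.com", "from pikachu@pokemon.com address")
user, host = m.group("user"), m.group("host")

# Backreferences: \1 (numbered) and (?P=word) (named) re-match the captured text.
doubled_numbered = re.search(r"\b(\w+) \1\b", "it is is fine").group(1)
doubled_named = re.search(r"\b(?P<word>\w+) (?P=word)\b", "Cry Cry again").group("word")

# Non-capturing group: groups 'ha' for repetition without creating a capture.
non_capturing = re.findall(r"(?:ha)+", "kamehameha hahaha")

print(user, host)         # pikachu pokemon
print(doubled_numbered)   # is
print(doubled_named)      # Cry
print(non_capturing)      # ['ha', 'ha', 'hahaha']
```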
Regular Expression Character Classes
[ab-d] One character of: a, b, c, d
[^ab-d] One character except: a, b, c, d
[\b] Backspace character
\d One digit
\D One non-digit
\s One whitespace
\S One non-whitespace
\w One word character
\W One non-word character
Regular Expression Assertions
^ Start of string
\A Start of string, ignores m flag
$ End of string
\Z End of string, ignores m flag
\b Word boundary
\B Non-word boundary
(?=...) Positive lookahead
(?!...) Negative lookahead
(?<=...) Positive lookbehind
(?<!...) Negative lookbehind
(?(Y)yes|no) Conditional: match yes if group Y matched, else no
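
Lookaheads and lookbehinds check what is around a match without including it in the result. A small sketch on a made-up price list:

```python
import re

text = "apple $5, pear $12, plum 7"

# Positive lookbehind: digits that come right after a '$' (the '$' is not returned).
priced = re.findall(r"(?<=\$)\d+", text)

# Negative lookbehind: digits not preceded by '$' or by another digit.
unpriced = re.findall(r"(?<![\$\d])\d+", text)

# Positive lookahead: words that are followed by ' $'.
with_price = re.findall(r"\w+(?= \$)", text)

# Negative lookahead: whole lowercase words that are not followed by ' $'.
without_price = re.findall(r"\b[a-z]+\b(?! \$)", text)

print(priced)          # ['5', '12']
print(unpriced)        # ['7']
print(with_price)      # ['apple', 'pear']
print(without_price)   # ['plum']
```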
Regular Expression Flags
i Ignore case
m ^ and $ match start and end of line
s . matches newline as well
x Allow spaces and comments
L Locale character classes
u Unicode character classes
(?iLmsux) Set flags within regex
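
The flags are passed as constants from the re module (re.I, re.M, re.S, re.X) or set inline at the start of the pattern. Below is a minimal sketch assuming Python 3; the sample text and variable names are made up.

```python
import re

text = "Goku\ngohan\nGOTEN"

# i + m: ignore case, and let ^ and $ match at every line break.
names = re.findall(r"^go\w+$", text, flags=re.I | re.M)

# Without m, ^ only matches at the very start of the string.
first_only = re.findall(r"^go\w+", text, flags=re.I)

# s: '.' also matches the newline between the first two names.
dot_all = re.search(r"Goku.gohan", text, flags=re.S) is not None
dot_plain = re.search(r"Goku.gohan", text) is not None

# The same i + m flags, set inside the pattern instead.
inline = re.findall(r"(?im)^go\w+$", text)

# x: spaces and comments inside the pattern are ignored.
email = re.compile(r"""
    \w+      # user name
    @
    \w+      # host
    \.com
""", re.X)
found = email.search("from pikachu@pokemon.com address").group()

print(names)                # ['Goku', 'gohan', 'GOTEN']
print(first_only)           # ['Goku']
print(dot_all, dot_plain)   # True False
print(inline)               # ['Goku', 'gohan', 'GOTEN']
print(found)                # pikachu@pokemon.com
```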
Regular Expression Special Characters
\n Newline
\r Carriage return
\t Tab
\YYY Octal character YYY
\xYY Hexadecimal character YY
Regular Expression Replacement
\g<0> Insert entire match
\g<Y> Insert match Y (name or number)
\Y Insert group numbered Y
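
These replacement forms go in the second argument of re.sub. A short sketch, with the sample address and variable names chosen for illustration:

```python
import re

text = "pikachu@pokemon.com"

# \1 and \2 insert the numbered groups, here in swapped order.
swapped = re.sub(r"(\w+)@(\w+)", r"\2@\1", text)

# \g<name> inserts a named group.
tagged = re.sub(r"(?P<user>\w+)@", r"[\g<user>]@", text)

# \g<0> inserts the entire match.
wrapped = re.sub(r"\w+@\w+\.com", r"<\g<0>>", text)

# \g<1> keeps the group number separate from the digits that follow it.
suffixed = re.sub(r"(\w+)@", r"\g<1>25@", text)

print(swapped)    # pokemon@pikachu.com
print(tagged)     # [pikachu]@pokemon.com
print(wrapped)    # <pikachu@pokemon.com>
print(suffixed)   # pikachu25@pokemon.com
```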

Please let me know your feedback on how I can improve this and make it better. Till then, cheers!!