Regular Expressions Basics

The search-and-replace example later in this post runs against the following sample string:

    searchString = "Sachin tendulkar is the master blaster. Sachin lives in Mumbai and likes to play cricket."

__Definition__

A regular expression is a pattern made up of ordinary (literal) characters and special characters called meta characters. The pattern describes the text you want to find in a string.

__General Applications of Regular Expressions__

Regular expressions are commonly used for:

- Checking whether a string matches a pattern.
- Finding the occurrences of a string or pattern in a given string.
- Replacing the matched patterns with another string in a given text.

Most modern programming and scripting languages support regular expressions, either natively or through a library.

__Examples on Regular Expressions__

As mentioned above, regular expressions are used to check whether a given string matches a specified pattern.

For example, consider a scenario where you have to check that a given string is a valid email address.

Valid email addresses include reply2sagar@gmail.com and ayx@jjjj.in.

Invalid email addresses include kjkjj@fdff and @dfdf.in.

With a regular expression, you can validate the email address easily.

__Syntax of Regular Expression in VBScript:__

To write a VBScript program that uses regular expressions, follow these steps:

- Create a regular expression object (RegExp).
- Define the pattern using the RegExp object's Pattern property.
- Use the Test method to check whether the given string matches that pattern.

    'Create the regular expression object
    Set myRegEx = New RegExp

    'Specify the pattern (regular expression)
    myRegEx.Pattern = "[a-z0-9]+@[a-z]+\.[a-z]+"

    'Ignore case while matching
    myRegEx.IgnoreCase = True

    'Use the Test method to see whether the given string matches the pattern
    isMatched = myRegEx.Test("reply2sagar@gmail.com")

The variable isMatched will be True if the string "reply2sagar@gmail.com" matches the pattern "[a-z0-9]+@[a-z]+\.[a-z]+". Note that Test returns True when the pattern is found anywhere in the string, so add ^ at the start and $ at the end of the pattern when the whole string must match.
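
The validation idea can be sketched as a small function that checks the four sample addresses mentioned earlier. IsValidEmail is a hypothetical helper name chosen for this illustration, and the pattern is deliberately simple: it assumes addresses contain only letters and digits around a single @ and a single dot, which real email addresses often do not.

```vbscript
' Hypothetical helper (not part of VBScript): returns True only when
' the whole string has the simple form name@domain.suffix.
Function IsValidEmail(address)
    Dim re
    Set re = New RegExp
    ' ^ and $ make the pattern cover the entire string, not just a part of it.
    re.Pattern = "^[a-z0-9]+@[a-z]+\.[a-z]+$"
    re.IgnoreCase = True
    IsValidEmail = re.Test(address)
End Function

Dim validOne, validTwo, invalidOne, invalidTwo
validOne = IsValidEmail("reply2sagar@gmail.com")   ' True
validTwo = IsValidEmail("ayx@jjjj.in")             ' True
invalidOne = IsValidEmail("kjkjj@fdff")            ' False: no dot and suffix after the domain
invalidTwo = IsValidEmail("@dfdf.in")              ' False: nothing before the @
```

Wrapping the check in a function keeps the RegExp setup in one place, so the same rule can be reused for every address that needs checking.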

__Another example on Regular Expression.__

    ' searchString is the sample string defined at the top of this post.
    searchPattern = "Sachin"
    Set reObject = New RegExp                      ' Create a regular expression.
    reObject.Pattern = searchPattern               ' Set pattern.
    reObject.IgnoreCase = True                     ' Set case insensitivity.
    reObject.Global = True                         ' Set global applicability.
    Set Matches = reObject.Execute(searchString)   ' Execute search.

    ' Build one line per match: position -> matched text.
    For Each M In Matches
        Str = Str & M.FirstIndex & " -> " & M.Value & vbCrLf
    Next
    MsgBox Str

    MsgBox "String after replacing -> " & vbCrLf & reObject.Replace(searchString, "Arjun")
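
To see what the Global property changes, the following sketch counts the matches with the flag switched on and off. CountMatches is a hypothetical helper written only for this illustration; it uses the same sample string as the example above.

```vbscript
' Hypothetical helper (not part of VBScript): counts how many matches
' Execute returns for a pattern, with the Global flag on or off.
Function CountMatches(text, pattern, isGlobal)
    Dim re
    Set re = New RegExp
    re.Pattern = pattern
    re.IgnoreCase = True
    re.Global = isGlobal
    CountMatches = re.Execute(text).Count
End Function

Dim sampleText, allCount, firstOnlyCount
sampleText = "Sachin tendulkar is the master blaster. Sachin lives in Mumbai and likes to play cricket."

allCount = CountMatches(sampleText, "Sachin", True)         ' 2: both occurrences are returned
firstOnlyCount = CountMatches(sampleText, "Sachin", False)  ' 1: the search stops at the first match
```

The same flag affects Replace: with Global set to False only the first occurrence is replaced.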

__Below is the list of commonly used meta characters in regular expressions in VBScript__

\ - Marks the next character as a special character, a literal, or a backreference. For example, \n matches a newline and \\ matches a literal backslash.

^ - Matches the position at the beginning of the input string.

$ - Matches the position at the end of the input string.

* - Matches the preceding character zero or more times. It is the same as {0,}.

+ - Matches the preceding character one or more times. It is the same as {1,}.

? - Matches the preceding character zero or one time. It is the same as {0,1}.

{i} - Matches the preceding character exactly i times.

{i,} - Matches the preceding character at least i times, with no upper limit.

{i,j} - Matches the preceding character at least i times and at most j times.

. - Matches any single character except "\n".

(pattern) - Matches pattern and captures the match that can be used in backreferences.

p|q - Matches either p or q. Note that p and q can themselves be more complex regular expressions.

[pqr] - A character set. Matches any one of the characters inside the brackets.

[^pqr] - A negative character set. Matches any single character that is not inside the brackets.

[p-z] - A range of characters. Matches any one character in the specified range, i.e. p, q, r, ... x, y, z.

[^p-z] - A negative range of characters. Matches any single character that is not in the specified range, for example a, b, c ... m, n, o, and also digits, uppercase letters, spaces, and punctuation.

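\b - Matches a word boundary, that is, the position between a word character and a non-word character. For example, er\b matches the "er" in "never" but not the "er" in "verb".

\B - Matches a non-word boundary, that is, any position where \b does not match. For example, er\B matches the "er" in "verb" but not the "er" in "never".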

\d - Matches a digit character. Same as [0-9].

\D - Matches a non-digit character. Same as [^0-9].

\f, \n and \r - Match a form-feed character, a newline character, and a carriage return character, respectively.

\s - Matches any white space character, including space, tab, and form-feed. Equivalent to [ \f\n\r\t\v].

\S - Matches any non-white space character. Equivalent to [^ \f\n\r\t\v].

\t, \v - Match a horizontal tab character and a vertical tab character, respectively.

\w - Matches any alphanumeric character or an underscore. Equivalent to '[A-Za-z0-9_]'.

\W - Matches any character that is not alphanumeric or an underscore. Equivalent to '[^A-Za-z0-9_]'.

\number - A backreference to a captured match, where number is a positive integer. For example, (.)\1 matches two consecutive identical characters.
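
As a rough illustration of several entries in the list above, the sketch below runs a few patterns through the Test method. IsMatch is a hypothetical helper written for this example, not part of VBScript, and the sample words are made up.

```vbscript
' Hypothetical helper: True when the pattern is found anywhere in the text.
' IgnoreCase is left at its default (False), so matching is case-sensitive.
Function IsMatch(pattern, text)
    Dim re
    Set re = New RegExp
    re.Pattern = pattern
    IsMatch = re.Test(text)
End Function

Dim startsWithCat, endsWithCat, optionalU, tooManyDigits
Dim inRange, outsideRange, wholeWord, insideWord, doubledLetter

startsWithCat = IsMatch("^cat", "catalog")           ' True: "cat" is at the beginning
endsWithCat = IsMatch("cat$", "catalog")             ' False: the string does not end with "cat"
optionalU = IsMatch("^colou?r$", "color")            ' True: ? makes the "u" optional
tooManyDigits = IsMatch("^\d{2,4}$", "12345")        ' False: five digits, but at most four are allowed
inRange = IsMatch("^[p-z]+$", "syrup")               ' True: every letter lies between p and z
outsideRange = IsMatch("^[^p-z]+$", "ABC123")        ' True: no character is a lowercase p to z
wholeWord = IsMatch("\bcat\b", "concatenate")        ' False: "cat" is not a separate word here
insideWord = IsMatch("\Bcat\B", "concatenate")       ' True: "cat" sits in the middle of a word
doubledLetter = IsMatch("(\w)\1", "letter")          ' True: \1 repeats the captured "t"
```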

*********************************************************************************


**Some examples on regular expressions:**

- To match a 10-digit mobile number -> \d{10}
- To match an email address -> \w+@\w+\.\w+
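
The mobile number pattern can be tried out with a short sketch like the one below. FindMobileNumbers is a hypothetical helper, the phone numbers are made up for illustration, and the pattern is wrapped in \b on both sides (an addition to the pattern above) so that a run of 10 digits inside a longer number is not reported.

```vbscript
' Hypothetical helper (not part of VBScript): returns every standalone
' 10-digit number found in the text, joined with commas.
Function FindMobileNumbers(text)
    Dim re, foundMatches, m, found
    Set re = New RegExp
    ' \b on both sides rejects digit runs that are part of a longer number.
    re.Pattern = "\b\d{10}\b"
    re.Global = True
    Set foundMatches = re.Execute(text)
    found = ""
    For Each m In foundMatches
        If found <> "" Then found = found & ","
        found = found & m.Value
    Next
    FindMobileNumbers = found
End Function

Dim numbers
numbers = FindMobileNumbers("Call 9876543210 or 9123456780, not 12345 or 123456789012.")
' numbers now holds "9876543210,9123456780"
```

Without the \b anchors, \d{10} alone would also match the first ten digits of the 12-digit number in the sample text.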
