Regular Expression Basics
Definition:
A regular expression is a pattern built from ordinary characters and metacharacters (characters with a special meaning) that describes the set of strings to search for.
General Applications of Regular Expressions:
Regular expressions are typically used for the following tasks.
- Checking whether a string matches a pattern
- Finding the occurrences of a string or pattern in a given string
- Replacing the matches of a pattern with another string in a given text
Most modern programming and scripting languages support regular expressions, either natively or through a library.
Examples of Regular Expressions:
We use regular expressions to check whether a given string matches a specified pattern.
For example, consider a scenario where you have to validate that a given string is a valid email address.
Some valid email addresses are reply2sagar@gmail.com and ayx@jjjj.in.
Some invalid email addresses are kjkjj@fdff and @dfdf.in.
With the help of a regular expression, you can easily validate the email address.
Syntax of Regular Expressions in VBScript:
To write a VBScript program that uses regular expressions, follow the steps below.
- Create a regular expression object (RegExp).
- Define the pattern using the RegExp object's Pattern property.
- Use the Test method to check whether the given string matches the pattern defined in the previous step.
'Create the regular expression object
Set myRegEx = New RegExp
'Specify the pattern (regular expression). The anchors ^ and $ make the pattern
'cover the whole string; without them, Test also succeeds on a partial match.
myRegEx.Pattern = "^[a-z0-9]+@[a-z]+\.[a-z]+$"
'Specify whether matching ignores case (True) or is case sensitive (False)
myRegEx.IgnoreCase = True
'Use the Test method to see whether the given string matches the pattern
isMatched = myRegEx.Test("reply2sagar@gmail.com")
The variable isMatched will be True if the string "reply2sagar@gmail.com" matches the pattern
"^[a-z0-9]+@[a-z]+\.[a-z]+$"
Another example: finding and replacing every occurrence of a pattern.
searchString = "Sachin Tendulkar is the master blaster. Sachin lives in Mumbai and likes to play cricket."
searchPattern = "Sachin"
Set reObject = New RegExp ' Create a regular expression object.
reObject.Pattern = searchPattern ' Set the pattern.
reObject.IgnoreCase = True ' Ignore case while matching.
reObject.Global = True ' Find all matches, not just the first one.
Set Matches = reObject.Execute(searchString) ' Run the search.
' Build one "position -> matched text" line per match (positions start at 0).
For Each M In Matches
    Str = Str & M.FirstIndex & " -> " & M.Value & vbCrLf
Next
MsgBox Str
' Replace every match with "Arjun" and show the result.
MsgBox "String after replacing -> " & vbCrLf & reObject.Replace(searchString, "Arjun")
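The role of the Global property in the example above can be seen by running the same search with the property switched off and then on. This is a minimal sketch, assuming a shorter sample sentence; the variable names demoText, demoRegEx and report are made up for illustration.

```vbscript
Option Explicit

Dim demoText, demoRegEx, report

' Hypothetical sample sentence, shortened for this sketch.
demoText = "Sachin plays cricket. Sachin lives in Mumbai."

Set demoRegEx = New RegExp
demoRegEx.Pattern = "Sachin"

' Global = False: only the first occurrence is found and replaced.
demoRegEx.Global = False
report = "Global = False" & vbCrLf
report = report & "Matches found: " & demoRegEx.Execute(demoText).Count & vbCrLf
report = report & demoRegEx.Replace(demoText, "Arjun") & vbCrLf & vbCrLf

' Global = True: every occurrence is found and replaced.
demoRegEx.Global = True
report = report & "Global = True" & vbCrLf
report = report & "Matches found: " & demoRegEx.Execute(demoText).Count & vbCrLf
report = report & demoRegEx.Replace(demoText, "Arjun")

MsgBox report
```

With Global set to False the sketch should report one match and replace only the first "Sachin"; with Global set to True it should report two matches and replace both.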
Below is the list of commonly used metacharacters in VBScript regular expressions.
\ - Marks the next character as a special character, a literal, or a backreference. For example, \n matches a newline and \\ matches a literal backslash.
^ - Matches the position at the beginning of the input string.
$ - Matches the position at the end of the input string.
* - Matches the preceding character or subexpression zero or more times. Same as {0,}.
+ - Matches the preceding character or subexpression one or more times. Same as {1,}.
? - Matches the preceding character or subexpression zero or one time. Same as {0,1}.
{i} - Matches the preceding character or subexpression exactly i times.
{i,} - Matches the preceding character or subexpression at least i times, with no upper limit.
{i,j} - Matches the preceding character or subexpression at least i times and at most j times.
. - Matches any single character except the newline character "\n".
(pattern) - Matches pattern and captures the match so that it can be used in backreferences.
p|q - Matches either p or q. Note that p and q can themselves be more complex regular expressions.
[pqr] - A character set. Matches any one of the characters inside the brackets.
[^pqr] - A negative character set. Matches any one character that is not inside the brackets.
[p-z] - A range of characters. Matches any one character in the specified range, i.e. p, q, r, ..., x, y, z.
[^p-z] - A negative range of characters. Matches any one character that is not in the specified range, for example a, b, c, ..., m, n, o, but also digits, spaces, and punctuation.
\b - Matches a word boundary, that is, the position between a word character and a non-word character.
\B - Matches a position that is not a word boundary.
\d - Matches a digit character. Same as [0-9].
\D - Matches a non-digit character. Same as [^0-9].
\f, \n and \r - Match a form-feed, a newline, and a carriage-return character, respectively.
\s - Matches any whitespace character, including space, tab, and form-feed. Equivalent to [ \f\n\r\t\v].
\S - Matches any non-whitespace character. Equivalent to [^ \f\n\r\t\v].
\t, \v - Match a horizontal tab and a vertical tab character, respectively.
\w - Matches any alphanumeric character or an underscore. Equivalent to [A-Za-z0-9_].
\W - Matches any character that is neither alphanumeric nor an underscore. Equivalent to [^A-Za-z0-9_].
\number - A backreference to the text captured by the group with that number. For example, (.)\1 matches two consecutive identical characters.
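Capturing groups and backreferences from the list above are easier to follow with a small script. The sketch below is illustrative only: the sample sentences, the date format and all variable names are assumptions. It reads captured groups through the SubMatches collection, reuses them in a replacement string with $1, $2 and $3, and uses \1 inside a pattern to detect a repeated word.

```vbscript
Option Explicit

Dim dateRegEx, dateMatches, firstMatch, repeatRegEx, sampleText

' Hypothetical sample text containing a date in dd-mm-yyyy form.
sampleText = "Joined on 15-08-2012"

Set dateRegEx = New RegExp
' Three capturing groups: day, month, year.
dateRegEx.Pattern = "(\d{2})-(\d{2})-(\d{4})"
dateRegEx.Global = True

Set dateMatches = dateRegEx.Execute(sampleText)
If dateMatches.Count > 0 Then
    Set firstMatch = dateMatches(0)
    ' SubMatches is zero-based: index 0 holds the text of the first group.
    MsgBox "Day: " & firstMatch.SubMatches(0) & _
           ", Month: " & firstMatch.SubMatches(1) & _
           ", Year: " & firstMatch.SubMatches(2)
End If

' In the replacement string, $1, $2 and $3 stand for the captured groups.
MsgBox dateRegEx.Replace(sampleText, "$3/$2/$1")

' Inside a pattern, \1 must match the same text that group 1 captured.
Set repeatRegEx = New RegExp
repeatRegEx.Pattern = "\b(\w+) \1\b"
repeatRegEx.IgnoreCase = True
MsgBox repeatRegEx.Test("This is is a typo")
```

The first message box should show the day, month and year separately, the second should show the date rearranged as 2012/08/15, and the third should show True because the word "is" is repeated.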
*********************************************************************************
Some examples of regular expressions:
- To match a 10-digit mobile number -> \d{10}
- To match an email address -> \w+@\w+\.\w+
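When these two patterns are used for validation rather than searching, it usually helps to anchor them with ^ and $ so that the whole string has to match. The sketch below tries both anchored patterns on a few sample values. The mobile numbers and all variable names are made up for illustration; the email addresses are the valid and invalid samples from earlier in this article.

```vbscript
Option Explicit

Dim checker, samples, sample, report

Set checker = New RegExp

' 10-digit mobile number, anchored so that nothing may surround the digits.
checker.Pattern = "^\d{10}$"
report = "Mobile numbers:" & vbCrLf
samples = Array("9876543210", "98765", "98765432101")
For Each sample In samples
    report = report & sample & " -> " & checker.Test(sample) & vbCrLf
Next

' Email address, anchored in the same way.
checker.Pattern = "^\w+@\w+\.\w+$"
report = report & vbCrLf & "Email addresses:" & vbCrLf
samples = Array("reply2sagar@gmail.com", "ayx@jjjj.in", "kjkjj@fdff", "@dfdf.in")
For Each sample In samples
    report = report & sample & " -> " & checker.Test(sample) & vbCrLf
Next

MsgBox report
```

Only the first mobile number and the first two email addresses should be reported as True. Without the anchors, \d{10} would also accept the 11-digit value, because its first ten digits match. This simple email pattern also rejects real addresses that contain dots or hyphens before the @ sign, so treat it as a starting point rather than a complete validator.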