- Regular expressions are used for pattern matching and search-and-replace operations on text, through methods of the String and RegExp classes.
- To use the RegExp API, you first need to know how to describe patterns of text using regular expression grammar.
- A regular expression literal such as /s$/ creates a new RegExp object, which can be assigned to a variable such as pattern.
- This RegExp object matches any string that ends with the letter “s”.
- Alternatively, the RegExp() constructor can create an equivalent object from a string: new RegExp("s$").
- In this pattern, s is a literal character that matches itself, and $ is a meta-character that matches the end of a string.
Characters and meta-characters are discussed in more detail in the next section.
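The two ways of creating a RegExp object described above can be sketched as follows. This is a minimal illustration; the variable names and sample strings are chosen only for this example.

```javascript
// Two equivalent ways to create a pattern that matches strings ending in "s".
const literalPattern = /s$/;                 // regular expression literal
const constructedPattern = new RegExp("s$"); // RegExp() constructor

// test() returns true when the pattern matches somewhere in the string.
const endsWithS = literalPattern.test("regular expressions");
const alsoEndsWithS = constructedPattern.test("regular expressions");
const noMatch = literalPattern.test("regular expression");

console.log(endsWithS, alsoEndsWithS, noMatch); // true true false
```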
A regular expression is built from characters that describe the strings it should match.
Literal characters include all alphabetic characters and digits, which match themselves literally in a regular expression.
Some other characters are written as escape sequences; the table below lists these sequences and what they match in a string.
| Character | Matches |
|---|---|
| `\xnn` | Latin character specified by the hexadecimal number nn |
| `\uxxxx` | Unicode character specified by the hexadecimal number xxxx |
| `\u{n}` | Unicode character specified by the code point n, where n is one to six hexadecimal digits (only used with the u flag) |
| `\cX` | Control character ^X |
These characters can be included in a regular expression by writing their escape sequences.
However, when the pattern is passed as a string to the RegExp() constructor, each backslash must be doubled, because string literals also use the backslash as an escape character.
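As a small sketch of the escape sequences and the doubled-backslash rule, consider the following. The pattern names and sample inputs are illustrative only.

```javascript
// \x41 and \u0041 both denote the Latin capital letter "A".
const hexEscape = /\x41/;
const unicodeEscape = /\u0041/;

// The \u{n} form is only recognized when the u flag is set.
const codePointEscape = /\u{1F600}/u;

// In a string passed to RegExp(), each backslash must itself be escaped.
const digitsLiteral = /\d+/;
const digitsConstructed = new RegExp("\\d+");

// With a single backslash, the string "\d+" is just "d+",
// so the resulting pattern matches the letter d, not digits.
const accidentalPattern = new RegExp("\d+");

console.log(hexEscape.test("A"), unicodeEscape.test("A")); // true true
console.log(codePointEscape.test("\u{1F600}"));            // true
console.log(digitsLiteral.source === digitsConstructed.source); // true
console.log(accidentalPattern.test("123"));                // false
```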
A character class is formed by placing individual literal characters within square brackets; it matches any one of the characters it contains.
The table below lists the character classes and what they match in a string.
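| Character | Matches |
|---|---|
| `[...]` | Any one character within the brackets |
| `[^...]` | Any one character not within the brackets |
| `.` | Any character except newline or another Unicode line terminator |
| `\w` | Any ASCII word character |
| `\W` | Any character that is not an ASCII word character |
| `\s` | Any Unicode whitespace character |
| `\S` | Any character that is not Unicode whitespace |
| `\d` | Any ASCII digit |
| `\D` | Any character other than an ASCII digit |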
The special character-class escapes, such as \s and \d, can also be used within square brackets; for example, [\s\d] matches any one whitespace character or digit.
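The character classes above can be tried out with a short sketch like the one below. The patterns and test strings are assumptions made for illustration.

```javascript
const vowel = /[aeiou]/;        // one character within the brackets
const nonVowel = /[^aeiou]/;    // one character not within the brackets
const hexDigit = /[\da-fA-F]/;  // a class escape used inside square brackets

console.log(vowel.test("sky"));    // false: "sky" contains no vowel
console.log(nonVowel.test("aei")); // false: every character is a vowel
console.log(hexDigit.test("7"), hexDigit.test("g")); // true false

// \w and \d are ASCII-only, while \s covers Unicode whitespace.
const accentedE = "\u00e9";        // e with acute accent
const noBreakSpace = "\u00a0";     // non-breaking space
const arabicIndicThree = "\u0663"; // a non-ASCII digit

console.log(/\W/.test(accentedE));        // true: not an ASCII word character
console.log(/\s/.test(noBreakSpace));     // true: Unicode whitespace
console.log(/\d/.test(arabicIndicThree)); // false: not an ASCII digit
```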
Repetition specifies how many times an element of a regular expression may be repeated.
It allows more complex patterns to be described using regular expression syntax.
For example, it can describe a number with any number of digits, or a string of three letters followed by an optional digit.
The table below lists the repetition characters and what they match in a string.
| Character | Matches |
|---|---|
| `{n,m}` | Match the previous item at least n times but no more than m times |
| `{n,}` | Match the previous item n or more times |
| `{n}` | Match the previous item exactly n times |
| `?` | Match zero or one occurrences of the previous item |
| `+` | Match one or more occurrences of the previous item |
| `*` | Match zero or more occurrences of the previous item |
Keep in mind that the * and ? repetition characters can match zero instances of whatever precedes them, so a pattern that uses them is allowed to match nothing at all.
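The repetition examples mentioned above, and the zero-match caveat, might look like this in practice. The patterns are one possible way to write them, not the only one; the ^ and $ anchors are added here so that the whole string must fit the pattern.

```javascript
// A number having one or more digits.
const anyDigits = /^\d+$/;
// Three letters followed by an optional digit.
const threeLettersOptionalDigit = /^[a-zA-Z]{3}\d?$/;
// Between two and four digits.
const twoToFourDigits = /^\d{2,4}$/;

console.log(anyDigits.test("2024"));                   // true
console.log(threeLettersOptionalDigit.test("abc"));    // true
console.log(threeLettersOptionalDigit.test("abc7"));   // true
console.log(threeLettersOptionalDigit.test("abc77"));  // false
console.log(twoToFourDigits.test("12345"));            // false

// Because * may match zero occurrences, /a*/ "matches" a string with no "a"
// by matching the empty string at the very beginning.
const emptyMatch = "bbbb".match(/a*/);
console.log(emptyMatch[0] === "", emptyMatch.index);   // true 0
```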
Flags alter the matching behavior of a regular expression. A regular expression can have one or more flags.
- This section covers six flags, each represented by a single letter.
- In a regular expression literal, flags are specified after the second / character.
- With the RegExp() constructor, flags are specified as a string passed as the second argument.
The following sections describe the supported flags and their meanings.
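As a quick sketch of the two ways to specify flags, the example below sets the g and i flags on a literal and on a constructed RegExp. The pattern and variable names are illustrative; the flags, global, and ignoreCase properties are standard RegExp properties.

```javascript
// Flags after the second slash of a literal.
const literalWithFlags = /hello/gi;
// Flags as the second argument of the constructor.
const constructedWithFlags = new RegExp("hello", "gi");
// The order in which flags are written does not matter.
const reordered = /hello/ig;

console.log(literalWithFlags.flags);     // "gi"
console.log(constructedWithFlags.flags); // "gi"
console.log(reordered.flags);            // "gi"
console.log(literalWithFlags.global, literalWithFlags.ignoreCase); // true true
```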
The g flag
- This flag indicates that the regular expression is global, meaning it is used to find all matches within a string rather than only the first.
- The g flag does not alter the way pattern matching is done, but it does alter the behavior of the String match() method and the RegExp exec() method.
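The effect of the g flag on match() and exec() can be sketched as follows. The sample text and variable names are assumptions for this example.

```javascript
const text = "cats, dogs and birds";

// Without g, match() returns only the first match (with extra details).
const firstOnly = text.match(/\w+s\b/);
// With g, match() returns an array of every match.
const allMatches = text.match(/\w+s\b/g);

console.log(firstOnly[0], firstOnly.index); // cats 0
console.log(allMatches);                    // [ 'cats', 'dogs', 'birds' ]

// With g, exec() resumes from lastIndex, so it can be called in a loop.
const numberPattern = /\d+/g;
const numbers = "10 20 30";
const found = [];
let result;
while ((result = numberPattern.exec(numbers)) !== null) {
  found.push({ value: result[0], index: result.index });
}
console.log(found.map((f) => `${f.value}@${f.index}`).join(" ")); // 10@0 20@3 30@6
```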
The i flag
- This flag makes pattern matching case-insensitive.
The m flag
- This flag specifies that matching is done in multiline mode, meaning the RegExp is intended for use with multiline strings.
- In this mode, the ^ and $ anchors match the beginning and end of each line as well as the beginning and end of the whole string.
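A brief sketch of how the m flag changes the ^ and $ anchors is shown below, using a made-up three-line string.

```javascript
const lines = "first line\nsecond line\nthird line";

// Without m, ^ matches only at the start of the whole string.
const withoutM = lines.match(/^\w+/g);
// With m, ^ also matches at the start of every line.
const withM = lines.match(/^\w+/gm);
// With m, $ also matches at the end of every line.
const lastWords = lines.match(/\w+$/gm);

console.log(withoutM);  // [ 'first' ]
console.log(withM);     // [ 'first', 'second', 'third' ]
console.log(lastWords); // [ 'line', 'line', 'line' ]
```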
The s flag
- Like the m flag, the s flag is useful when working with text that includes newlines.
- Normally, the “.” in a regular expression matches any character except a line terminator; with the s flag, “.” matches any character, including line terminators.
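The difference the s flag makes to “.” can be seen in a small sketch like this one; the sample string is illustrative.

```javascript
const twoLines = "line one\nline two";

// Without s, "." refuses to match the newline between "one" and "line".
const withoutS = /one.line/.test(twoLines);
// With s, "." matches any character, including the newline.
const withS = /one.line/s.test(twoLines);
// A common alternative without the s flag: a class that matches everything.
const workaround = /one[\s\S]line/.test(twoLines);

console.log(withoutS, withS, workaround); // false true true
```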
The u flag
- The u flag stands for Unicode; it makes the regular expression match full Unicode code points instead of 16-bit values.
- This flag makes the RegExp work seamlessly with text containing emoji and other characters, such as some Chinese characters, that require more than 16 bits.
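As an illustration of the u flag, the sketch below matches a single emoji, written with a code point escape so the example does not depend on file encoding.

```javascript
// U+1F600 is an emoji that needs two 16-bit values in a JavaScript string.
const smiley = "\u{1F600}";
console.log(smiley.length); // 2

// Without u, "." matches one 16-bit value, so it cannot span the whole emoji.
const withoutU = /^.$/.test(smiley);
// With u, "." matches one full code point.
const withU = /^.$/u.test(smiley);

console.log(withoutU, withU); // false true
```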
The y flag
- The y flag indicates that the regular expression is sticky: it must match at the beginning of a string or at the first character following the previous match.
- This flag is useful with regular expressions that are used repeatedly to find all matches within a string.
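The sticky behavior can be sketched with exec() and the lastIndex property, which records where the next match attempt starts. The input string and names are assumptions for this example.

```javascript
const sticky = /\d+/y;
const input = "12 abc 34";

// The first attempt starts at index 0, where "12" is found.
const first = sticky.exec(input);
console.log(first[0], sticky.lastIndex); // 12 2

// The next attempt must match exactly at index 2, which holds a space,
// so it fails and lastIndex is reset to 0.
const second = sticky.exec(input);
console.log(second, sticky.lastIndex);   // null 0

// Setting lastIndex by hand lets the sticky pattern match at index 7.
sticky.lastIndex = 7;
const third = sticky.exec(input);
console.log(third[0]);                   // 34

// For comparison, a g pattern searches forward from lastIndex instead.
const globalPattern = /\d+/g;
globalPattern.lastIndex = 2;
const searched = globalPattern.exec(input);
console.log(searched[0], searched.index); // 34 7
```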
All of these flags can be used with regular expressions in any combination and in any order.