XRegExp(pattern, [flags])
Accepts a pattern and flags; returns a new, extended RegExp object. Differs from a native regular expression in that additional syntax and flags are supported and cross-browser regex syntax inconsistencies are ameliorated.
var regex = XRegExp("(?<month> [0-9]+ ) [-/.\\s] # month\n\
(?<day> [0-9]+ ) [-/.\\s] # day \n\
(?<year> [0-9]+ ) # year ", "x");
var input = "04/20/2009";
input.replace(regex, "${year}-${month}-${day}"); // "2009-04-20"
var match = regex.exec(input);
match.month; // "04"
regex instanceof RegExp; // true
regex.constructor == RegExp; // true
For more information about named capture, see New syntax: Named capture; and see New flags: Free-spacing and line comments for details about the x flag.
JavaScript string literals (as opposed to, e.g., user input or text extracted from the DOM) use a backslash as an escape character. The string literal "\\" therefore contains a single backslash, and its length property value is 1. However, a backslash is also an escape character in regular expression syntax, where the pattern \\ matches a single backslash. When providing string literals to the RegExp or XRegExp constructor functions, four backslashes are therefore needed to match a single backslash, e.g., XRegExp("\\\\"). Only two of those backslashes are actually passed into the constructor function. The other two are used to escape the backslashes in the string before the function ever sees the string.
The same issue is at play with the \\s sequences in the example code just shown. XRegExp is provided with the two characters \s, which it in turn recognizes as the metasequence used to match a whitespace character. \n (used at the end of the first two lines) is another metasequence in JavaScript string literals and inserts actual line feed characters into the string, which terminate the free-spacing mode line comments that start with #. The backslashes at the very end of the first two lines allow the string to continue to the next line, which avoids the need to concatenate multiple strings when extending a string beyond one line of code.
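The two layers of escaping can be checked directly in a console. The following is a minimal sketch that uses only the native RegExp constructor, which is subject to the same string-literal rules described above; the variable names are illustrative.

```javascript
// The string literal "\\" holds a single backslash character.
var oneBackslash = "\\";
oneBackslash.length; // 1

// Four backslashes in the literal become two in the string, which the
// regex engine then reads as "match one literal backslash".
var backslashRegex = new RegExp("\\\\");
var matchesWindowsPath = backslashRegex.test("C:\\temp"); // true
var matchesUnixPath = backslashRegex.test("C:/temp"); // false

// "\\s" yields the two characters \s (the whitespace metasequence), whereas
// "\s" is an unrecognized string escape that collapses to a plain "s".
var whitespaceEscaped = new RegExp("\\s").test("a b"); // true
var whitespaceUnescaped = new RegExp("\s").test("a b"); // false
```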
XRegExp.addToken(regex, handler, [scope], [trigger])
Provides a means to create custom flags and extend or change the regular expression language accepted by XRegExp. This function is used internally for XRegExp's own syntax extensions and can be used to create XRegExp plugins.
This function is intended for users with advanced knowledge of JavaScript's regular expression syntax and behavior. Beginning regexers may want to stick to plugins or provided examples that take advantage of this method. To disable further changes to XRegExp syntax, run XRegExp.freezeTokens() after loading XRegExp and any plugins.
The handler and trigger functions have access to special properties (accessed through this) that apply to the regular expression being compiled. Any data stored on the this object persists during the XRegExp construction process. The this.hasFlag(flag) method is always available, and returns a boolean indicating whether the regex has the provided, single-character flag. It can be used, e.g., within the trigger function to add support for new flags that change the interpretation of regex syntax.
Added tokens do not cascade. If more than one token can match the same string, the one added last wins. Thus, you can add a generic token (e.g., \\p{[^}]+}), then follow it with more specific tokens that sometimes override it (e.g., \\p{L}).
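The "last added wins" rule can be pictured with a toy token list. This is only an illustration of the precedence rule, not XRegExp's actual internals; tokens, addToken, and translate are hypothetical names, and the sketch relies on the native sticky (y) flag.

```javascript
// Hypothetical token registry: tokens are stored in the order they are added.
var tokens = [];

function addToken(regex, handler) {
  tokens.push({ regex: regex, handler: handler });
}

// Try the tokens from newest to oldest at one position in the pattern and
// return the output of the first one that matches there.
function translate(pattern, pos) {
  for (var i = tokens.length - 1; i >= 0; i--) {
    var sticky = new RegExp(tokens[i].regex.source, "y");
    sticky.lastIndex = pos;
    var match = sticky.exec(pattern);
    if (match) return tokens[i].handler(match);
  }
  return null; // no token applies at this position
}

addToken(/\\p\{[^}]+\}/, function (m) { return "generic:" + m[0]; });
addToken(/\\p\{L\}/, function () { return "letter"; });

var specific = translate("\\p{L}", 0); // "letter" (the later, more specific token wins)
var generic = translate("\\p{Nd}", 0); // "generic:\\p{Nd}"
```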
// Many regex flavors support \a for matching the bell control character.
// JavaScript does not, so let's add it.
XRegExp.addToken(
    /\\a/,
    function () {return "\\x07"},
    XRegExp.INSIDE_CLASS | XRegExp.OUTSIDE_CLASS
);
XRegExp("\\a[\\a-\\n]+").test("\x07\x0A\x07"); // true
// Add support for escape sequences: \Q⋯\E and \Q⋯
XRegExp.addToken(
    /\\Q([\s\S]*?)(?:\\E|$)/,
    function (match) {return XRegExp.escape(match[1])},
    XRegExp.INSIDE_CLASS | XRegExp.OUTSIDE_CLASS
);
XRegExp("^\\Q({?*+})").test("({?*+})"); // true
// Add the U (ungreedy) flag from PCRE and RE2, which reverses greedy and lazy quantifiers
XRegExp.addToken(
    /([?*+]|{\d+(?:,\d*)?})(\??)/,
    function (match) {return match[1] + (match[2] ? "" : "?")},
    XRegExp.OUTSIDE_CLASS,
    function () {return this.hasFlag("U")}
);
XRegExp("a+").exec("aaa")[0]; // "aaa"
XRegExp("a+?").exec("aaa")[0]; // "a"
XRegExp("a+", "U").exec("aaa")[0]; // "a"
XRegExp("a+?", "U").exec("aaa")[0]; // "aaa"
// Add support for POSIX character classes (e.g., [[:alpha:]])
XRegExp.addToken(
    /\[:([a-z\d]+):]/i,
    function () {
        var posix = {
            alnum: "A-Za-z0-9",
            alpha: "A-Za-z",
            ascii: "\\0-\\x7F",
            blank: " \\t",
            cntrl: "\\0-\\x1F\\x7F",
            digit: "0-9",
            graph: "\\x21-\\x7E",
            lower: "a-z",
            print: "\\x20-\\x7E",
            punct: "!\"#$%&'()*+,\\-./:;<=>?@[\\\\\\]^_`{|}~",
            space: " \\t\\r\\n\\v\\f",
            upper: "A-Z",
            word: "A-Za-z0-9_",
            xdigit: "A-Fa-f0-9"
        };
        return function (match) {
            if (!posix[match[1]])
                throw SyntaxError(match[1] + " is not a valid POSIX character class");
            return posix[match[1]];
        };
    }(),
    XRegExp.INSIDE_CLASS
);
XRegExp("^[[:xdigit:][:space:]]+$").test("00A9 1B7F"); // true
Tokens can be more complex than the examples just shown. See custom token examples for more ideas.
XRegExp.cache(pattern, [flags])
Accepts a pattern and flags; returns an extended RegExp object. If the pattern and flag combination has previously been cached, the cached copy is returned; otherwise, the new object is cached and returned.
while (XRegExp.cache(".", "gs").test(subject)) {
    // The regex is only compiled once
    ...
}
var regex1 = XRegExp.cache("\\b ex", "gix");
var regex2 = XRegExp.cache("\\b ex", "gix");
// regex1 and regex2 are now references to the same RegExp object
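The caching behavior can be approximated with a plain lookup table. The following is a minimal sketch using native RegExp objects, not XRegExp's own implementation; regexCache and cachedRegex are hypothetical names.

```javascript
// Hypothetical cache keyed on the pattern/flags pair.
var regexCache = {};

function cachedRegex(pattern, flags) {
  var key = "/" + pattern + "/" + (flags || "");
  if (!Object.prototype.hasOwnProperty.call(regexCache, key)) {
    // Compile only on the first request for this combination.
    regexCache[key] = new RegExp(pattern, flags || "");
  }
  return regexCache[key];
}

var first = cachedRegex("\\bex", "gi");
var second = cachedRegex("\\bex", "gi");
var sameObject = first === second; // true
var otherFlags = cachedRegex("\\bex", "i") === first; // false
```

Because every caller receives the same object, state such as lastIndex on a global regex is shared between callers. A loop that calls test repeatedly on the cached regex depends on exactly that.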
XRegExp.copyAsGlobal(regex)
Accepts a RegExp instance; returns a copy with the g (global) flag set. This allows the regex to be used in while loops (for match iteration), etc. The copy has a fresh lastIndex (set to zero).
Note that if you want to copy a regex without forcing the global property, you can use XRegExp(regex).
function parse (str, regex) {
    var match;
    regex = XRegExp.copyAsGlobal(regex);
    while (match = regex.exec(str)) {
        ...
    }
}
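For native regexes, a similar copy can be made from the source and flags properties. This is a rough sketch under that assumption, not the library's code: it does not carry over XRegExp-specific data such as named capture, and it assumes an engine that exposes regex.flags. The function name is hypothetical.

```javascript
// Hypothetical native-only equivalent of copying a regex with g added.
function copyWithGlobalFlag(regex) {
  var flags = regex.flags.indexOf("g") === -1 ? regex.flags + "g" : regex.flags;
  return new RegExp(regex.source, flags); // a new RegExp starts with lastIndex 0
}

var original = /\d/gi;
original.lastIndex = 3; // simulate a regex that has already been used

var globalCopy = copyWithGlobalFlag(original);
var copyIsGlobal = globalCopy.global; // true
var copyStartsFresh = globalCopy.lastIndex; // 0
var originalUntouched = original.lastIndex; // 3
```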
XRegExp.escape(string)
Accepts a string; returns the string with regex metacharacters escaped. The returned string can safely be used at any point within a regex to match the provided literal string. The escaped characters are [, ], {, }, (, ), -, *, +, ?, ., \, ^, $, |, ,, #, and whitespace (see free-spacing for the list of whitespace characters).
var str = XRegExp.escape("escaped? [x]");
var regex = XRegExp(str); // RegExp would work identically here
regex.test("escaped? [x]"); // true
regex.source == "escaped\\?\\ \\[x\\]"; // true
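The escaping described above can be sketched with a single native replace call. This is an approximation for illustration: escapeRegex is a hypothetical name, and \s stands in for the library's whitespace list, which may not match it exactly in every engine.

```javascript
// Hypothetical helper: prefix each listed metacharacter (and whitespace) with a backslash.
function escapeRegex(str) {
  return str.replace(/[-[\]{}()*+?.,\\^$|#\s]/g, "\\$&");
}

var escaped = escapeRegex("escaped? [x]"); // "escaped\\?\\ \\[x\\]"
var literalMatch = new RegExp(escaped).test("escaped? [x]"); // true

// Without escaping, the same text is read as a pattern and does not match itself:
// "d?" makes the d optional and "[x]" becomes a character class.
var rawMatch = new RegExp("escaped? [x]").test("escaped? [x]"); // false
```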
XRegExp.execAt(string, regex, [pos], [anchored])
Accepts a string to search, a regex to search with, an optional position at which to start the search, and an optional boolean indicating whether a match must start exactly at the specified position (true) or may start at or after it (the default). Returns a match array, or null if no match is found. This function ignores the lastIndex property of the provided regex.
var str = "Result: 25.";
XRegExp.execAt(str, /\d+/); // returns ["25"] with the index property set to 8
XRegExp.execAt(str, /\d+/, 0); // returns ["25"] with the index property set to 8
XRegExp.execAt(str, /\d+/, 10); // returns null
XRegExp.execAt(str, /\d+/, 0, true); // returns null
XRegExp.execAt(str, /\d+/, 8, true); // returns ["25"] with the index property set to 8
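In engines that support the sticky (y) flag, the at-or-after and anchored behaviors map onto native lastIndex handling. The following is a sketch under that assumption, not the library's implementation; execAtSketch is a hypothetical name.

```javascript
// Hypothetical stand-in built on native lastIndex handling.
function execAtSketch(str, regex, pos, anchored) {
  // Drop any existing g/y flags, then add y (match only at pos) or g (match at or after pos).
  var flags = regex.flags.replace(/[gy]/g, "") + (anchored ? "y" : "g");
  var copy = new RegExp(regex.source, flags); // the caller's lastIndex is never read or changed
  copy.lastIndex = pos || 0;
  return copy.exec(str);
}

var subject = "Result: 25.";
var found = execAtSketch(subject, /\d+/, 0); // ["25"], found.index is 8
var tooLate = execAtSketch(subject, /\d+/, 10); // null
var notAtStart = execAtSketch(subject, /\d+/, 0, true); // null
var atDigit = execAtSketch(subject, /\d+/, 8, true); // ["25"], atDigit.index is 8
```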
XRegExp.freezeTokens()
Irreversibly breaks the link to XRegExp's private list of tokens, and thereby prevents the addition of new syntax and flags. After freezeTokens runs, attempts to call XRegExp.addToken() will throw an error.
This function should be run after XRegExp and any plugins are loaded.
<script src="xregexp-min.js"></script>
<script>XRegExp.freezeTokens();</script>
XRegExp.isRegExp(object)
Accepts any value; returns a boolean indicating whether the argument is a RegExp object (note that this is also true for regex literals and regexes created using the XRegExp constructor). This function works correctly with variables created in another frame, when instanceof and regex.constructor checks would fail to work as intended.
XRegExp.isRegExp("string"); // false
XRegExp.isRegExp(/regex/i); // true
XRegExp.isRegExp(RegExp("^", "gm")); // true
XRegExp.isRegExp(XRegExp(".", "s")); // true
function search (string, searchValue) {
    // Convert plain strings into regexes that match them literally
    if (!XRegExp.isRegExp(searchValue)) {
        searchValue = XRegExp(XRegExp.escape(searchValue));
    }
    ...
}
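A common way to get a frame-independent check is to inspect the value's internal class string instead of its constructor. The sketch below shows that technique; it is an assumption about how such a test can be written, not a statement of how XRegExp.isRegExp is implemented, and isRegExpSketch is a hypothetical name.

```javascript
// Hypothetical frame-safe check: every realm reports "[object RegExp]" for its
// regexes, even though each realm has its own RegExp constructor.
function isRegExpSketch(value) {
  return Object.prototype.toString.call(value) === "[object RegExp]";
}

var fromString = isRegExpSketch("string"); // false
var fromLiteral = isRegExpSketch(/regex/i); // true
var fromConstructor = isRegExpSketch(new RegExp("^", "gm")); // true
var fromLookalike = isRegExpSketch({ source: "x", exec: function () {} }); // false
```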
XRegExp.iterate(string, regex, callback, [context])
Executes the provided callback function once per match of regex within string. This provides a simpler and cleaner way to iterate over regex matches compared to the traditional approaches of subverting String.prototype.replace or repeatedly calling RegExp.prototype.exec within a while loop.
Searches always start at the beginning of the string and continue until the end, regardless of the state of the regex's global property and initial lastIndex.
// populate an array with match objects
var matches = [];
XRegExp.iterate(str, regex, function (match) {
    matches.push(match);
});
// extract every other digit from a string
matches = [];
XRegExp.iterate("1a2345", /\d/, function (match, i) {
    if (i % 2) matches.push(+match[0]);
});
// matches: [2, 4]
// count the occurrences of each word in a string, providing an object as the context argument
var words = {};
XRegExp.iterate("Run Forrest, run.", XRegExp("(?<word>\\w+)"), function (match) {
    var word = match.word.toLowerCase();
    if (!this[word]) this[word] = 0;
    this[word]++;
}, words);
// words: {run: 2, forrest: 1}
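For comparison, the while-loop approach that iterate replaces looks roughly like the following when wrapped in a helper. This is a native-only sketch, not the library's code; iterateSketch is a hypothetical name, and the arguments passed to the callback beyond the match and its index are an assumption.

```javascript
// Hypothetical helper: call callback once per match, starting from the beginning of str.
function iterateSketch(str, regex, callback, context) {
  // Work on a global copy so the caller's flags and lastIndex are irrelevant.
  var copy = new RegExp(regex.source, regex.flags.replace(/[gy]/g, "") + "g");
  var match;
  var i = 0;
  while ((match = copy.exec(str)) !== null) {
    callback.call(context, match, i++, str, copy);
    if (match[0] === "") copy.lastIndex++; // step past empty matches to avoid an endless loop
  }
}

var digits = [];
iterateSketch("1a2345", /\d/, function (match, i) {
  if (i % 2) digits.push(+match[0]);
});
// digits: [2, 4]

var counts = {};
iterateSketch("Run Forrest, run.", /\w+/, function (match) {
  var word = match[0].toLowerCase();
  this[word] = (this[word] || 0) + 1;
}, counts);
// counts: {run: 2, forrest: 1}
```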
XRegExp.matchChain(string, regexes)
Accepts a string to search and an array of regexes; returns the result of using each successive regex to search within the matches of the previous regex. The array of regexes can also contain objects with regex and backref properties, in which case the named or numbered backreferences specified are passed forward to the next regex or returned.
var str = "1 <b>2</b> 3 <b>4 5</b>";
var result = XRegExp.matchChain(str, [/<b>.*?<\/b>/i, /\d+/]);
// result: ["2", "4", "5"]
str = '<img src="http://x.com/img.png">' +
'<script src="http://google.com/path/file.ext">' +
'<img src="http://google.com/path/to/img.jpg?x">' +
'<img src="http://google.com/img2.gif"/>';
result = XRegExp.matchChain(str, [
    // match <img> tags; pass forward attributes in backref 1
    {regex: /<img\b([^>]+)>/i, backref: 1},
    // match src attributes; pass forward attribute values in backref "src"
    {regex: XRegExp('(?ix) \\s src=" (?<src> [^"]+ )'), backref: "src"},
    // match URLs with host "google.com"; pass forward URL paths in backref 1
    {regex: XRegExp("^https?://google\\.com(/[^#?]+)", "i"), backref: 1},
    // match and pass forward/return filenames (strip directory paths)
    /[^\/]+$/
]);
// result: ["img.jpg", "img2.gif"]
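The chaining idea amounts to a fold over the regex list: each step searches every string produced by the previous step. Below is a native-only sketch of that idea, not the library's implementation. matchChainSketch is a hypothetical name, named backreferences are read from the native match.groups object rather than from XRegExp's match properties, and skipping groups that did not participate is this sketch's own choice.

```javascript
// Hypothetical stand-in that chains native regexes.
function matchChainSketch(str, chain) {
  return chain.reduce(function (inputs, item) {
    var isRegex = item instanceof RegExp;
    var regex = isRegex ? item : item.regex;
    var backref = isRegex || item.backref === undefined ? 0 : item.backref;
    var flags = regex.flags.replace(/[gy]/g, "") + "g";
    var outputs = [];
    inputs.forEach(function (input) {
      var copy = new RegExp(regex.source, flags);
      var match;
      while ((match = copy.exec(input)) !== null) {
        // Numbered backrefs live on the match array; native named groups on match.groups.
        var value = typeof backref === "number" ? match[backref] : (match.groups || {})[backref];
        if (value !== undefined) outputs.push(value);
        if (match[0] === "") copy.lastIndex++; // avoid looping forever on empty matches
      }
    });
    return outputs;
  }, [str]);
}

var numbers = matchChainSketch("1 <b>2</b> 3 <b>4 5</b>", [/<b>.*?<\/b>/i, /\d+/]);
// numbers: ["2", "4", "5"]

var sources = matchChainSketch('<img src="a.png"> <img src="b.gif">', [
  { regex: /<img\b([^>]+)>/i, backref: 1 },
  { regex: /src="(?<src>[^"]+)"/, backref: "src" }
]);
// sources: ["a.png", "b.gif"]
```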
XRegExp.version
Holds the version number of XRegExp as a string containing three integers separated by dots, e.g., "1.0.0".
RegExp.prototype.apply(context, args)
Returns the result of calling RegExp.prototype.exec with the first value in the args array. This is intended to allow working generically with both functions and regular expressions.
// Return an array with the elements of the existing array for which
// the provided filtering function returns true
Array.prototype.filter = function (fn, context) {
    var results = [];
    for (var i = 0; i < this.length; i++) {
        if (fn.apply(context, [this[i], i, this])) {
            results.push(this[i]);
        }
    }
    return results;
};
var output = ["a", "ba", "ab", "b"].filter(/^a/); // ["a", "ab"]
RegExp.prototype.call(context, string)
Returns the result of calling RegExp.prototype.exec with the provided string. This is intended to allow working generically with both functions and regular expressions.
function validate (str, validators) {
    for (var i = 0; i < validators.length; i++) {
        if (!validators[i].call(null, str)) {
            return false;
        }
    }
    return true;
}
// Validate that the string contains at least 1 special character and has 8 or more characters.
// The length-checking function could be replaced with the regex /^[\s\S]{8}/.
validate("password!",
[ function (str) {return str.length >= 8},
/[\W_]/ ]); // true
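Going by the descriptions of call and apply above, both methods can be pictured as thin wrappers around exec that ignore the context argument. The definitions below are an assumed sketch for environments where the library is not loaded, not the library's source; keep is a hypothetical helper.

```javascript
// Assumed definitions: both methods ignore context and delegate to exec.
RegExp.prototype.call = function (context, str) {
  return this.exec(str);
};
RegExp.prototype.apply = function (context, args) {
  return this.exec(args[0]);
};

// A caller that accepts either a function or a regex as its test.
function keep(items, test) {
  var kept = [];
  for (var i = 0; i < items.length; i++) {
    if (test.call(null, items[i])) kept.push(items[i]);
  }
  return kept;
}

var byRegex = keep(["a", "ba", "ab", "b"], /^a/); // ["a", "ab"]
var byFunction = keep(["a", "ba", "ab", "b"], function (s) {
  return s.length === 2;
}); // ["ba", "ab"]
```

Because exec advances lastIndex on regexes that have the g flag, a non-global regex is the safer choice when the same regex is tested against many unrelated strings this way.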