API

XRegExp(pattern, [flags])

Creates an extended regular expression object for matching text with a pattern. Differs from a native regular expression in that additional syntax and flags are supported. The returned object is in fact a native RegExp and works with all native methods.

Parameters:
  • pattern {String|RegExp}
    Regex pattern string, or an existing regex object to copy.
  • [flags] {String}
    Any combination of flags.
    Native flags:
    • g - global
    • i - ignore case
    • m - multiline anchors
    • y - sticky (Firefox 3+)
    Additional XRegExp flags:
    • n - explicit capture
    • s - dot matches all (aka singleline)
    • x - free-spacing and line comments (aka extended)
    Flags cannot be provided when constructing one RegExp from another.
Returns: {RegExp}
Extended regular expression object.

Example

// With named capture and flag x
XRegExp('(?<year>  [0-9]{4} ) [-\\s]?  # year  \n\
         (?<month> [0-9]{2} ) [-\\s]?  # month \n\
         (?<day>   [0-9]{2} )          # day   ', 'x');

// Providing a regex object copies it. Native regexes are recompiled using native (not
// XRegExp) syntax. Copies maintain special properties for named capture, are augmented
// with `XRegExp.prototype` methods, and have fresh `lastIndex` properties (set to zero).
XRegExp(/regex/);

For details about the regular expression just shown, see Syntax: Named capture and Flags: Free-spacing.

Regexes, strings, and backslashes

JavaScript string literals (as opposed to, e.g., user input or text extracted from the DOM) use a backslash as an escape character. The string literal '\\' therefore contains a single backslash, and its length property's value is 1. However, a backslash is also an escape character in regular expression syntax, where the pattern \\ matches a single backslash. When providing string literals to the RegExp or XRegExp constructor functions, four backslashes are therefore needed to match a single backslash—e.g., XRegExp('\\\\'). Only two of those backslashes are actually passed into the constructor function. The other two are used to escape the backslashes in the string before the function ever sees the string.

The same issue is at play with the \\s sequences in the example code just shown. XRegExp is provided with the two characters \s, which it in turn recognizes as the metasequence used to match a whitespace character. \n (used at the end of the first two lines) is another metasequence in JavaScript string literals and inserts actual line feed characters into the string, which terminate the free-spacing mode line comments that start with #. The backslashes at the very end of the first two lines allow the string to continue to the next line, which avoids the need to concatenate multiple strings when extending a string beyond one line of code.
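
The two layers of escaping described above can be checked without XRegExp. The following is a small sketch that uses the native RegExp constructor, which receives pattern strings in the same way; the variable names are illustrative only.

```javascript
// Two backslashes in the literal produce one backslash in the string value.
var oneBackslash = '\\';
console.log(oneBackslash.length); // -> 1

// Four backslashes in the literal become the two-character pattern \\,
// which matches a single literal backslash.
var backslashRegex = new RegExp('\\\\');
console.log(backslashRegex.source); // -> \\
console.log(backslashRegex.test('C:\\temp')); // -> true

// '\\s' reaches the constructor as the two characters \s (whitespace).
var whitespaceRegex = new RegExp('\\s');
console.log(whitespaceRegex.test('a b')); // -> true
console.log(whitespaceRegex.test('ab')); // -> false
```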

XRegExp.addToken(regex, handler, [options])

Extends or changes XRegExp syntax and allows custom flags. This is used internally and can be used to create XRegExp addons. XRegExp.install('extensibility') must be run before calling this function, or an error is thrown. If more than one token can match the same string, the last added wins.

Parameters:
  • regex {RegExp}
    Regex object that matches the new token.
  • handler {Function}
    Function that returns a new pattern string (using native regex syntax) to replace the matched token within all future XRegExp regexes. Has access to persistent properties of the regex being built, through this. Invoked with two arguments:
    1. The match array, with named backreference properties.
    2. The regex scope where the match was found.
  • [options] {Object}
    Options object with optional properties:
    • scope {String} Scopes where the token applies: 'default', 'class', or 'all'.
    • trigger {Function} Function that returns true when the token should be applied; e.g., if a flag is set. If false is returned, the matched string can be matched by other tokens. Has access to persistent properties of the regex being built, through this (including function this.hasFlag).
    • customFlags {String} Nonnative flags used by the token's handler or trigger functions. Prevents XRegExp from throwing an 'unknown flag' error when the specified flags are used.
Returns: {undefined}
Does not return a value.

Example

// Basic usage: Add \a for the ALERT control code
XRegExp.addToken(
  /\\a/,
  function () {return '\\x07';},
  {scope: 'all'}
);
XRegExp('\\a[\\a-\\n]+').test('\x07\n\x07'); // -> true


Addon: XRegExp.build(pattern, subs, [flags])

Requires the XRegExp.build addon.

Builds regexes using named subpatterns, for readability and pattern reuse. Backreferences in the outer pattern and provided subpatterns are automatically renumbered to work correctly. Returns a regex with interpolated subpatterns.
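
The idea of interpolating named subpatterns can be sketched with native regexes. This is not the addon's implementation: buildPattern is a hypothetical helper, the {{name}} placeholder syntax is an assumption of this sketch, and backreferences are not renumbered here.

```javascript
// Hypothetical helper: replaces each {{name}} placeholder (assumed syntax)
// with the matching subpattern, wrapped in a non-capturing group.
function buildPattern(pattern, subs, flags) {
  var source = pattern.replace(/\{\{(\w+)\}\}/g, function (token, name) {
    if (!Object.prototype.hasOwnProperty.call(subs, name)) {
      throw new Error('Unknown subpattern: ' + name);
    }
    var sub = subs[name];
    return '(?:' + (typeof sub === 'string' ? sub : sub.source) + ')';
  });
  return new RegExp(source, flags);
}

var time = buildPattern('^{{hours}}:{{minutes}}$', {
  hours: /1[0-2]|0?[1-9]/,
  minutes: /[0-5][0-9]/
});
console.log(time.test('10:59')); // -> true
console.log(time.test('13:00')); // -> false
```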

XRegExp.cache(pattern, [flags])

Caches and returns the result of calling XRegExp(pattern, flags). On any subsequent call with the same pattern and flag combination, the cached copy is returned.

Parameters:
  • pattern {String}
    Regex pattern string.
  • [flags] {String}
    Any combination of XRegExp flags.
Returns: {RegExp}
Cached XRegExp object.

Example

while (match = XRegExp.cache('.', 'gs').exec(str)) {
  // The regex is compiled once only
}

var regex1 = XRegExp.cache('.', 's'),
    regex2 = XRegExp.cache('.', 's');
// regex1 and regex2 are references to the same regex object
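
The caching behavior described above amounts to memoizing on the pattern and flag combination. A minimal sketch with the native RegExp constructor follows; cachedRegex, regexCache, and the key format are hypothetical and are not taken from XRegExp's internals.

```javascript
// Hypothetical memoizing wrapper around the native RegExp constructor.
var regexCache = {};

function cachedRegex(pattern, flags) {
  // Assumed key format: pattern and flags joined by a slash. Flags contain
  // letters only, so the last slash always separates the two parts.
  var key = pattern + '/' + (flags || '');
  if (!Object.prototype.hasOwnProperty.call(regexCache, key)) {
    regexCache[key] = new RegExp(pattern, flags);
  }
  return regexCache[key];
}

var first = cachedRegex('\\d+', 'g');
var second = cachedRegex('\\d+', 'g');
console.log(first === second); // -> true
console.log(first === cachedRegex('\\d+', 'i')); // -> false
```

Because the same object is returned each time, any state on it (such as lastIndex when flag g is used) is shared by every caller.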

XRegExp.escape(str)

Escapes any regular expression metacharacters, for use when matching literal strings. The result can safely be used at any point within a regex that uses any flags.

The escaped characters are [, ], {, }, (, ), -, *, +, ?, ., \, ^, $, |, ,, #, and whitespace (see free-spacing for the list of whitespace characters).

Parameters:
  • str {String}
    String to escape.
Returns: {String}
String with regex metacharacters escaped.

Example

XRegExp.escape('Escaped? <.>');
// -> 'Escaped\?\ <\.>'
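
For illustration, escaping the characters listed above can be sketched with a single native replace call. escapeRegex is a hypothetical stand-in written for this example, and the native \s class is assumed to be an adequate approximation of the whitespace list.

```javascript
// Hypothetical helper: prefixes each listed metacharacter and each
// whitespace character with a backslash.
function escapeRegex(str) {
  return str.replace(/[-[\]{}()*+?.,\\^$|#\s]/g, '\\$&');
}

console.log(escapeRegex('Escaped? <.>')); // -> Escaped\?\ <\.>

// The escaped text can then be embedded in a larger pattern.
var literal = new RegExp('^' + escapeRegex('1+1=2') + '$');
console.log(literal.test('1+1=2')); // -> true
console.log(literal.test('11=2')); // -> false
```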

XRegExp.exec(str, regex, [pos], [sticky])

Executes a regex search in a specified string. Returns a match array or null. If the provided regex uses named capture, named backreference properties are included on the match array. Optional pos and sticky arguments specify the search start position, and whether the match must start at the specified position only. The lastIndex property of the provided regex is not used, but is updated for compatibility. Also fixes browser bugs compared to the native RegExp.prototype.exec and can be used reliably cross-browser.

Parameters:
  • str {String}
    String to search.
  • regex {RegExp}
    Regex to search with.
  • [pos=0] {Number}
    Zero-based index at which to start the search.
  • [sticky=false] {Boolean|String}
    Whether the match must start at the specified position only. The string 'sticky' is accepted as an alternative to true.
Returns: {Array}
Match array with named backreference properties, or null.

Example

// Basic use, with named backreference
var match = XRegExp.exec('U+2620', XRegExp('U\\+(?<hex>[0-9A-F]{4})'));
match.hex; // -> '2620'

// With pos and sticky, in a loop
var pos = 3, result = [], match;
while (match = XRegExp.exec('<1><2><3><4>5<6>', /<(\d)>/, pos, 'sticky')) {
  result.push(match[1]);
  pos = match.index + match[0].length;
}
// result -> ['2', '3', '4']
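
The pos and sticky arguments roughly correspond to setting lastIndex on a regex that has the native y flag. The sketch below assumes an engine that supports flag y and the flags property. execAt is a hypothetical helper, and unlike XRegExp.exec it does not update lastIndex on the regex it is given.

```javascript
// Hypothetical helper: searches from `pos` using a fresh copy of `regex`.
// With `sticky`, the match must begin exactly at `pos`.
function execAt(str, regex, pos, sticky) {
  var flags = regex.flags.replace(/[gy]/g, '') + (sticky ? 'y' : 'g');
  var copy = new RegExp(regex.source, flags);
  copy.lastIndex = pos || 0;
  return copy.exec(str);
}

var pos = 3, result = [], match;
while ((match = execAt('<1><2><3><4>5<6>', /<(\d)>/, pos, true))) {
  result.push(match[1]);
  pos = match.index + match[0].length;
}
console.log(result); // -> ['2', '3', '4']
```

The loop stops at the bare 5 because a sticky search cannot skip ahead to the later <6>.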

XRegExp.forEach(str, regex, callback, [context])

Executes a provided function once per regex match.

Provides a simpler and cleaner way to iterate over regex matches compared to the traditional approaches of subverting String.prototype.replace or repeatedly calling RegExp.prototype.exec within a while loop. Searches always start at the beginning of the string and continue until the end, regardless of the state of the regex's global property and initial lastIndex.

Parameters:
  • str {String}
    String to search.
  • regex {RegExp}
    Regex to search with.
  • callback {Function}
    Function to execute for each match. Invoked with four arguments:
    1. The match array, with named backreference properties.
    2. The zero-based match index.
    3. The string being traversed.
    4. The regex object being used to traverse the string.
  • [context] {*}
    Object to use as this when executing callback.
Returns: {*}
Provided context object.

Example

// Extracts every other digit from a string
XRegExp.forEach('1a2345', /\d/, function (match, i) {
  if (i % 2) this.push(+match[0]);
}, []);
// -> [2, 4]
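
For comparison, the traditional exec loop that this method replaces might look like the sketch below. forEachMatch is a hypothetical helper that approximates the described behavior with native regexes only; it is not XRegExp's implementation.

```javascript
// Hypothetical helper: calls `callback` once per match, always scanning
// the whole string from index 0 with a global copy of `regex`.
function forEachMatch(str, regex, callback, context) {
  var flags = regex.flags.replace(/[gy]/g, '') + 'g';
  var copy = new RegExp(regex.source, flags);
  var i = 0, match;
  while ((match = copy.exec(str))) {
    callback.call(context, match, i++, str, regex);
    // Step past zero-length matches to avoid an infinite loop.
    if (match[0] === '') copy.lastIndex++;
  }
  return context;
}

var evens = forEachMatch('1a2345', /\d/, function (match, i) {
  if (i % 2) this.push(+match[0]);
}, []);
console.log(evens); // -> [2, 4]
```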

XRegExp.globalize(regex)

Copies a regex object and adds flag g. The copy maintains special properties for named capture, is augmented with XRegExp.prototype methods, and has a fresh lastIndex property (set to zero). Native regexes are not recompiled using XRegExp syntax.

Parameters:
  • regex {RegExp}
    Regex to globalize.
Returns: {RegExp}
Copy of the provided regex with flag g added.

Example

var globalCopy = XRegExp.globalize(/regex/);
globalCopy.global; // -> true

function parse(str, regex) {
  var match;
  regex = XRegExp.globalize(regex);
  while (match = regex.exec(str)) {
    // ...
  }
}
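
A native approximation of the copy-and-add-g step is sketched below. globalCopyOf is a hypothetical helper, and a copy made this way does not carry over XRegExp's special properties for named capture.

```javascript
// Hypothetical helper: returns a copy of `regex` with flag g added.
function globalCopyOf(regex) {
  var flags = regex.flags.indexOf('g') === -1 ? regex.flags + 'g' : regex.flags;
  return new RegExp(regex.source, flags);
}

var original = /\d/i;
original.lastIndex = 5;
var copy = globalCopyOf(original);
console.log(copy.global); // -> true
console.log(copy.lastIndex); // -> 0 (a fresh lastIndex)
console.log(original.global); // -> false (the original is unchanged)
```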

XRegExp.install(options)

Installs optional features according to the specified options. Can be undone using XRegExp.uninstall.

Parameters:
  • options {Object|String}
    Options object or string.
Returns: {undefined}
Does not return a value.

Example

// With an options object
XRegExp.install({
  // Overrides native regex methods with fixed/extended versions that support named
  // backreferences and fix numerous cross-browser bugs
  natives: true,

  // Enables extensibility of XRegExp syntax and flags
  extensibility: true
});

// With an options string
XRegExp.install('natives extensibility');

XRegExp.isInstalled(feature)

Checks whether an individual optional feature is installed.

Parameters:
  • feature {String}
    Name of the feature to check. One of:
    • natives
    • extensibility
Returns: {Boolean}
Whether the feature is installed.

Example

XRegExp.isInstalled('natives');

XRegExp.isRegExp(value)

Returns true if an object is a regex; false if it isn't. This works correctly for regexes created in another frame, when instanceof and constructor checks would fail.

Parameters:
  • value {*}
    Object to check.
Returns: {Boolean}
Whether the object is a RegExp object.

Example

XRegExp.isRegExp('string'); // -> false
XRegExp.isRegExp(/regex/i); // -> true
XRegExp.isRegExp(RegExp('^', 'm')); // -> true
XRegExp.isRegExp(XRegExp('(?s).')); // -> true
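
One common way to get a frame-independent check is to inspect the object's internal class tag instead of its constructor. The sketch below shows that technique; isRegExpValue is a hypothetical name, and whether XRegExp uses exactly this test internally is an assumption.

```javascript
// Hypothetical helper: the class tag is the same for regexes from any
// frame, whereas `instanceof RegExp` depends on the frame's own RegExp.
function isRegExpValue(value) {
  return Object.prototype.toString.call(value) === '[object RegExp]';
}

console.log(isRegExpValue('string')); // -> false
console.log(isRegExpValue(/regex/i)); // -> true
console.log(isRegExpValue(new RegExp('^', 'm'))); // -> true
console.log(isRegExpValue({source: 'x', global: false})); // -> false
```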

XRegExp.matchChain(str, chain)

Retrieves the matches from searching a string using a chain of regexes that successively search within previous matches. The provided chain array can contain regexes and objects with regex and backref properties. When a backreference is specified, the named or numbered backreference is passed forward to the next regex or returned.

Parameters:
  • str {String}
    String to search.
  • chain {Array}
    Regexes that each search for matches within preceding results.
Returns: {Array}
Matches by the last regex in the chain, or an empty array.

Example

// Basic usage; matches numbers within <b> tags
XRegExp.matchChain('1 <b>2</b> 3 <b>4 a 56</b>', [
  XRegExp('(?is)<b>.*?</b>'),
  /\d+/
]);
// -> ['2', '4', '56']

// Passing forward and returning specific backreferences
var html = '<a href="http://xregexp.com/api/">XRegExp</a>\
        <a href="http://www.google.com/">Google</a>';
XRegExp.matchChain(html, [
  {regex: /<a href="([^"]+)">/i, backref: 1},
  {regex: XRegExp('(?i)^https?://(?<domain>[^/?#]+)'), backref: 'domain'}
]);
// -> ['xregexp.com', 'www.google.com']
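
The chaining logic can be sketched with native regexes: every regex in the chain searches within each result of the previous one. chainMatches is a hypothetical helper, and named backreferences here rely on the engine's native (?<name>...) support rather than on XRegExp.

```javascript
// Hypothetical helper: each chain item is a regex or a {regex, backref} object.
function chainMatches(str, chain) {
  var inputs = [str];
  chain.forEach(function (item) {
    var regex = item.regex || item;
    var backref = item.regex ? item.backref : 0;
    var flags = regex.flags.replace(/[gy]/g, '') + 'g';
    var search = new RegExp(regex.source, flags);
    var outputs = [];
    inputs.forEach(function (input) {
      var match;
      while ((match = search.exec(input))) {
        var value = typeof backref === 'number' ?
          match[backref] : (match.groups || {})[backref];
        outputs.push(value || '');
        // Step past zero-length matches to avoid an infinite loop.
        if (match[0] === '') search.lastIndex++;
      }
    });
    inputs = outputs;
  });
  return inputs;
}

var numbers = chainMatches('1 <b>2</b> 3 <b>4 a 56</b>', [/<b>.*?<\/b>/i, /\d+/]);
console.log(numbers); // -> ['2', '4', '56']

var links = '<a href="http://xregexp.com/api/">XRegExp</a> ' +
            '<a href="http://www.google.com/">Google</a>';
var domains = chainMatches(links, [
  {regex: /<a href="([^"]+)">/i, backref: 1},
  {regex: /^https?:\/\/(?<domain>[^\/?#]+)/i, backref: 'domain'}
]);
console.log(domains); // -> ['xregexp.com', 'www.google.com']
```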

Addon: XRegExp.matchRecursive(str, left, right, [flags], [options])

Requires the XRegExp.matchRecursive addon.

Returns an array of match strings between outermost left and right delimiters, or an array of objects with detailed match parts and position data. An error is thrown if delimiters are unbalanced within the data.
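
To illustrate matching between outermost delimiters, the following sketch tracks nesting depth with one native regex. matchBalanced is a hypothetical helper that covers only the simplest case (an array of match strings, no flags or options) and is not the addon's implementation.

```javascript
// Hypothetical helper: `left` and `right` are pattern strings. Returns the
// text between each outermost pair of delimiters.
function matchBalanced(str, left, right) {
  var delimiter = new RegExp('(' + left + ')|' + right, 'g');
  var results = [], depth = 0, start = 0, match;
  while ((match = delimiter.exec(str))) {
    if (match[1] !== undefined) {
      // Left delimiter: remember where the outermost content starts.
      if (depth === 0) start = delimiter.lastIndex;
      depth++;
    } else if (depth > 0) {
      // Right delimiter: closing the outermost pair yields a result.
      depth--;
      if (depth === 0) results.push(str.slice(start, match.index));
    } else {
      throw new Error('Unbalanced right delimiter at index ' + match.index);
    }
    if (match[0] === '') delimiter.lastIndex++;
  }
  if (depth > 0) throw new Error('Unbalanced left delimiter');
  return results;
}

var parts = matchBalanced('(t((e))s)t()(ing)', '\\(', '\\)');
console.log(parts); // -> ['t((e))s', '', 'ing']
```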

XRegExp.replace(str, search, replacement, [scope])

Returns a new string with one or all matches of a pattern replaced. The pattern can be a string or regex, and the replacement can be a string or a function to be called for each match. To perform a global search and replace, use the optional scope argument or include flag g if using a regex. Replacement strings can use ${n} for named and numbered backreferences. Replacement functions can use named backreferences via arguments[0].name. Also fixes browser bugs compared to the native String.prototype.replace and can be used reliably cross-browser.

For the full details of XRegExp's replacement text syntax, see Syntax: Replacement text.

Parameters:
  • str {String}
    String to search.
  • search {RegExp|String}
    Search pattern to be replaced.
  • replacement {String|Function}
    Replacement string or a function invoked to create it.
    Replacement strings can include special replacement syntax:
    • $$ - Inserts a literal $ character.
    • $&, $0 - Inserts the matched substring.
    • $` - Inserts the string that precedes the matched substring (left context).
    • $' - Inserts the string that follows the matched substring (right context).
    • $n, $nn - Where n/nn are digits referencing an existent capturing group, inserts backreference n/nn.
    • ${n} - Where n is a name or any number of digits that reference an existent capturing group, inserts backreference n.
    Replacement functions are invoked with three or more arguments:
    • The matched substring (corresponds to $& above). Named backreferences are accessible as properties of this first argument.
    • 0..n arguments, one for each backreference (corresponding to $1, $2, etc. above).
    • The zero-based index of the match within the total search string.
    • The total string being searched.
  • [scope='one'] {String}
    Use 'one' to replace the first match only, or 'all'. If not explicitly specified and using a regex with flag g, scope is 'all'.
Returns: {String}
New string with one or all matches replaced.

Example

// Regex search, using named backreferences in replacement string
var name = XRegExp('(?<first>\\w+) (?<last>\\w+)');
XRegExp.replace('John Smith', name, '${last}, ${first}');
// -> 'Smith, John'

// Regex search, using named backreferences in replacement function
XRegExp.replace('John Smith', name, function (match) {
  return match.last + ', ' + match.first;
});
// -> 'Smith, John'

// String search, with replace-all
XRegExp.replace('RegExp builds RegExps', 'RegExp', 'XRegExp', 'all');
// -> 'XRegExp builds XRegExps'
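
The positional arguments of a replacement function and several of the special sequences listed above are shared with the native String.prototype.replace. The sketch below demonstrates them with native calls only. Native replace does not accept the $0 or ${n} forms, which per the list above are XRegExp syntax.

```javascript
// A replacement function receives the match, one argument per capturing
// group, the match index, and the whole string being searched.
var swapped = 'John Smith'.replace(/(\w+) (\w+)/, function (match, first, last, index, input) {
  return last + ', ' + first + ' (at ' + index + ' of "' + input + '")';
});
console.log(swapped); // -> Smith, John (at 0 of "John Smith")

// Left context, matched substring, and right context.
var contexts = 'a-b'.replace(/-/, '[$`|$&|$\']');
console.log(contexts); // -> a[a|-|b]b

// $$ inserts a literal dollar sign.
var price = 'cost: 5'.replace(/\d/, '$$$&');
console.log(price); // -> cost: $5
```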

XRegExp.split(str, separator, [limit])

Splits a string into an array of strings using a regex or string separator. Matches of the separator are not included in the result array. However, if separator is a regex that contains capturing groups, backreferences are spliced into the result each time separator is matched. Fixes browser bugs compared to the native String.prototype.split and can be used reliably cross-browser.

Parameters:
  • str {String}
    String to split.
  • separator {RegExp|String}
    Regex or string to use for separating the string.
  • [limit] {Number}
    Maximum number of items to include in the result array.
Returns: {Array}
Array of substrings.

Example

// Basic use
XRegExp.split('a b c', ' ');
// -> ['a', 'b', 'c']

// With limit
XRegExp.split('a b c', ' ', 2);
// -> ['a', 'b']

// Backreferences in result array
XRegExp.split('..word1..', /([a-z]+)(\d+)/i);
// -> ['..', 'word', '1', '..']

XRegExp.test(str, regex, [pos], [sticky])

Executes a regex search in a specified string. Returns true or false. Optional pos and sticky arguments specify the search start position, and whether the match must start at the specified position only. The lastIndex property of the provided regex is not used, but is updated for compatibility. Also fixes browser bugs compared to the native RegExp.prototype.test and can be used reliably cross-browser.

Parameters:
  • str {String}
    String to search.
  • regex {RegExp}
    Regex to search with.
  • [pos=0] {Number}
    Zero-based index at which to start the search.
  • [sticky=false] {Boolean|String}
    Whether the match must start at the specified position only. The string 'sticky' is accepted as an alternative to true.
Returns: {Boolean}
Whether the regex matched the provided value.

Example

// Basic use
XRegExp.test('abc', /c/); // -> true

// With pos and sticky
XRegExp.test('abc', /c/, 0, 'sticky'); // -> false

XRegExp.uninstall(options)

Uninstalls optional features according to the specified options. All optional features start out uninstalled, so this is used to undo the actions of XRegExp.install.

Parameters:
  • options {Object|String}
    Options object or string.
Returns: {undefined}
Does not return a value.

Example

// With an options object
XRegExp.uninstall({
  // Restores native regex methods
  natives: true,

  // Disables additional syntax and flag extensions
  extensibility: true
});

// With an options string
XRegExp.uninstall('natives extensibility');

XRegExp.union(patterns, [flags])

Returns an XRegExp object that is the union of the given patterns. Patterns can be provided as regex objects or strings. Metacharacters are escaped in patterns provided as strings. Backreferences in provided regex objects are automatically renumbered to work correctly. Native flags used by provided regexes are ignored in favor of the flags argument.

Parameters:
  • patterns {Array}
    Regexes and strings to combine.
  • [flags] {String}
    Any combination of XRegExp flags.
Returns: {RegExp}
Union of the provided regexes and strings.

Example

XRegExp.union(['a+b*c', /(dogs)\1/, /(cats)\1/], 'i');
// -> /a\+b\*c|(dogs)\1|(cats)\2/i
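
The renumbering step can be sketched as follows: each regex's backreferences are shifted by the number of capturing groups that precede it in the combined pattern. unionOf is a hypothetical helper under simplifying assumptions (only unnamed groups are counted, and only native syntax is handled); it is not XRegExp's implementation.

```javascript
// Hypothetical helper: strings are escaped, regex sources are rewritten so
// that \N backreferences still point at their own groups after joining.
function unionOf(patterns, flags) {
  var groupOffset = 0;
  var parts = patterns.map(function (pattern) {
    if (typeof pattern === 'string') {
      return pattern.replace(/[-[\]{}()*+?.,\\^$|#\s]/g, '\\$&');
    }
    var offset = groupOffset, groupsInPart = 0;
    // Tokens: backreference, other escape, character class, group opener.
    var rewritten = pattern.source.replace(
      /\\([1-9]\d*)|\\[\s\S]|\[(?:[^\\\]]|\\[\s\S])*\]|\((?!\?)/g,
      function (token, backrefNumber) {
        if (backrefNumber) return '\\' + (+backrefNumber + offset);
        if (token === '(') groupsInPart++;
        return token;
      }
    );
    groupOffset += groupsInPart;
    return rewritten;
  });
  return new RegExp(parts.join('|'), flags);
}

var combined = unionOf(['a+b*c', /(dogs)\1/, /(cats)\1/], 'i');
console.log(combined.source); // -> a\+b\*c|(dogs)\1|(cats)\2
console.log(combined.test('CATScats')); // -> true
console.log(combined.test('catsdogs')); // -> false
```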

XRegExp.version

The XRegExp version number as a string containing three dot-separated parts. For instance, '2.0.0-beta-3'.