If you want, you can download XRegExp bundled with all addons as “XRegExp-All”: minified, or with comments. Alternatively, you can download the individual addon scripts below. XRegExp's npm package uses xregexp-all.js, which means that the addons are always available when installed via npm.


Adds base support for Unicode matching via the \p{…} syntax. Addon packages for Unicode categories, scripts, blocks, and other properties can then be loaded as needed. All Unicode tokens can be inverted using \P{…} or \p{^…}. Token names are case-insensitive, and any spaces, hyphens, and underscores in them are ignored. You can omit the braces for token names that are a single letter (e.g. \pL).


// Categories
XRegExp('\\p{Sc}\\pN+'); // Sc: currency symbol, N: number

// Scripts
XRegExp('\\p{Katakana}');

// Blocks (use 'In' prefix)
XRegExp('\\P{InPrivateUseArea}'); // Uppercase \P for negation
XRegExp('\\p{^InMongolian}'); // Alternate negation syntax

// Properties
XRegExp('\\p{ASCII}');
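The naming rule described above (case-insensitive, with spaces, hyphens, and underscores ignored) can be pictured as a normalization step applied to each token name before lookup. The following is a minimal sketch in plain JavaScript; `normalizeTokenName` is a hypothetical helper written for illustration and is not claimed to be XRegExp's internal implementation.

```javascript
// Hypothetical helper: illustrates the naming rule only,
// not XRegExp's actual internals.
function normalizeTokenName(name) {
  // Drop spaces, hyphens, and underscores, then fold case.
  return name.replace(/[ _-]/g, '').toLowerCase();
}

// Different spellings of the same block name from the example above.
const spellings = [
  'InPrivateUseArea',
  'in_private_use_area',
  'In Private-Use Area'
];

const keys = spellings.map(normalizeTokenName);
console.log(keys); // every entry is 'inprivateusearea'
```

Under this reading, \p{InPrivateUseArea} and \p{in_private_use_area} would refer to the same token.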

To activate this addon, simply load it after loading XRegExp:

<script src="xregexp.js"></script>
<script src="addons/unicode-base.js"></script>
<script src="addons/unicode-categories.js"></script>
<script>
  var unicodeWord = XRegExp("^\\pL+$");

  unicodeWord.test("Русский"); // true
  unicodeWord.test("日本語"); // true
  unicodeWord.test("العربية"); // true
</script>

<!-- Categories, scripts, blocks, and properties are packaged separately so you
can load just the data you need -->
<script src="addons/unicode-scripts.js"></script>
<script>
  XRegExp("^\\p{Katakana}+$").test("カタカナ"); // true
</script>

By default, \p{…} and \P{…} support the Basic Multilingual Plane (i.e. code points up to U+FFFF). You can opt in to full 21-bit Unicode support (with code points up to U+10FFFF) on a per-regex basis by using flag A. In XRegExp, this is called astral mode. You can automatically add flag A for all new regexes by running XRegExp.install('astral'). When in astral mode, \p{…} and \P{…} always match a full code point rather than a code unit, using surrogate pairs for code points above U+FFFF.

// Using flag A to match astral code points
XRegExp('^\\pS$').test('💩'); // -> false
XRegExp('^\\pS$', 'A').test('💩'); // -> true
XRegExp('(?A)^\\pS$').test('💩'); // -> true
// Using surrogate pair U+D83D U+DCA9 to represent U+1F4A9 (pile of poo)
XRegExp('(?A)^\\pS$').test('\uD83D\uDCA9'); // -> true

// Implicit flag A
XRegExp.install('astral');
XRegExp('^\\pS$').test('💩'); // -> true
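The difference between a code unit and a code point can be checked without XRegExp. The sketch below uses only native JavaScript; `toSurrogatePair` is a hypothetical helper that applies the standard UTF-16 encoding arithmetic, and the native `u` flag shown at the end is a language feature, not XRegExp's flag A.

```javascript
// U+1F4A9 written with an ES2015 code point escape.
const pileOfPoo = '\u{1F4A9}';

console.log(pileOfPoo.length);                      // 2 (two UTF-16 code units)
console.log(pileOfPoo.codePointAt(0).toString(16)); // '1f4a9'
console.log(pileOfPoo === '\uD83D\uDCA9');          // true

// Hypothetical helper: split an astral code point into its surrogate pair.
function toSurrogatePair(codePoint) {
  const offset = codePoint - 0x10000;
  const high = 0xD800 + (offset >> 10);   // top 10 bits
  const low = 0xDC00 + (offset & 0x3FF);  // bottom 10 bits
  return [high, low];
}

const pair = toSurrogatePair(0x1F4A9);
console.log(pair.map(function (n) { return n.toString(16); })); // [ 'd83d', 'dca9' ]

// Native regexes show the same split: without `u`, the dot matches one code unit.
console.log(/^.$/.test(pileOfPoo));  // false
console.log(/^.$/u.test(pileOfPoo)); // true
```

This mirrors the behavior shown above: a pattern that consumes a single code unit cannot span both halves of the pair, while a code-point-aware pattern can.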

Opting in to astral mode disables the use of \p{…} and \P{…} within character classes. In astral mode, use e.g. (\pL|[0-9_])+ instead of [\pL0-9_]+.
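One plausible way to see why a character class cannot hold astral tokens is that, without code-point awareness, a class matches exactly one code unit, while an astral character needs two. The sketch below illustrates this with native JavaScript regexes only (no XRegExp, no native `u` flag); it is an illustration of the constraint under that assumption, not a description of how XRegExp compiles its patterns.

```javascript
const astralChar = '\uD83D\uDCA9'; // U+1F4A9 as a surrogate pair

// Inside a class, the two surrogates are separate alternatives,
// each matching a single code unit, so the pair as a whole does not match.
const asClass = /^[\uD83D\uDCA9]$/.test(astralChar);
console.log(asClass); // false

// A group can match the two-unit sequence.
const asGroup = /^(?:\uD83D\uDCA9)$/.test(astralChar);
console.log(asGroup); // true

// Same shape as the rewrite suggested above: alternation instead of a class.
const mixed = /^(?:\uD83D\uDCA9|[0-9_])+$/.test(astralChar + '42_');
console.log(mixed); // true
```

The last pattern has the same structure as (\pL|[0-9_])+: the part that may span two code units sits in an alternation, and only single-unit characters stay inside the class.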

XRegExp 3.0.0 uses Unicode 8.0.0.


Download XRegExp.matchRecursive.

See API: XRegExp.matchRecursive.


See API: