Inline comments

Inline comments use the syntax (?#comment). They are an alternative to the line comments allowed in free-spacing mode.

Example code

var regex = XRegExp("(?#month)\\d\\d?[-/. ](?#day)\\d\\d?[-/. ](?#year)(?:\\d\\d){1,2}");
var isDate = regex.test("04/20/2008"); // -> true
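
One way to picture what an inline comment does is to remove it by hand before compiling the pattern. The following is a minimal sketch under that assumption, not XRegExp's actual implementation; `stripInlineComments` is a hypothetical helper, and it simply drops each comment without guarding edge cases such as a comment placed directly before a quantifier.

```javascript
// Hypothetical helper, not part of XRegExp: removes (?#comment) groups from a
// pattern string so the remainder can be compiled by the native RegExp constructor.
function stripInlineComments(pattern) {
	var output = "";
	var inClass = false; // true while inside a [...] character class
	var i = 0;
	while (i < pattern.length) {
		var ch = pattern.charAt(i);
		if (ch === "\\") {
			// Copy an escaped character verbatim, e.g. \( or \d
			output += pattern.substr(i, 2);
			i += 2;
		} else if (inClass) {
			// Inside a character class, "(?#" has no special meaning
			if (ch === "]") inClass = false;
			output += ch;
			i += 1;
		} else if (ch === "[") {
			inClass = true;
			output += ch;
			i += 1;
		} else if (pattern.substr(i, 3) === "(?#") {
			// Skip everything up to and including the closing parenthesis
			var end = pattern.indexOf(")", i);
			if (end === -1) throw new SyntaxError("Unterminated inline comment");
			i = end + 1;
		} else {
			output += ch;
			i += 1;
		}
	}
	return output;
}

var source = stripInlineComments("(?#month)\\d\\d?[-/. ](?#day)\\d\\d?[-/. ](?#year)(?:\\d\\d){1,2}");
// source -> "\\d\\d?[-/. ]\\d\\d?[-/. ](?:\\d\\d){1,2}"
var isDate = new RegExp(source).test("04/20/2008"); // -> true
```

The stripped pattern matches the same dates as the commented one, which is the point of the syntax: the comments document the regex without changing what it matches.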

Named capture

XRegExp includes comprehensive support for named capture. Capture names can use the characters A–Z, a–z, 0–9, _, and $ only.
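
The character rule above can be expressed as a simple check. This is a sketch only; `isValidCaptureName` is a hypothetical helper rather than an XRegExp function, and rejecting the empty string is an assumption.

```javascript
// Hypothetical helper: checks a proposed capture name against the allowed
// characters A-Z, a-z, 0-9, _, and $
function isValidCaptureName(name) {
	return /^[A-Za-z0-9_$]+$/.test(name);
}

isValidCaptureName("year");      // -> true
isValidCaptureName("$scheme_2"); // -> true
isValidCaptureName("top-level"); // -> false (hyphen is not allowed)
isValidCaptureName("");          // -> false (assumption: a name needs at least one character)
```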

There are several different syntaxes in the wild for named capture. Although Python was the first to implement the feature, most libraries have adopted .NET's alternative syntax. The following comparison chart includes every regex library with named capture support that I am aware of, and highlights how named capture in XRegExp compares with libraries you might already be familiar with. XRegExp's syntax is listed first.

| Library | Capture | Backref in regex | Backref in replacement | Stored at | Backref numbering |
|---|---|---|---|---|---|
| XRegExp | (?<name>⋯) | \k<name> | ${name} | result.name | Sequential |
| .NET | (?<name>⋯), (?'name'⋯) | \k<name>, \k'name' | ${name} | matcher.Groups('name') | Unnamed first, then named |
| Perl 5.10 | (?<name>⋯), (?'name'⋯), (?P<name>⋯) | \k<name>, \k'name', \k{name}, \g{name}, (?P=name) | $+{name} | $+{name} | Sequential |
| PCRE 7 | Perl styles | Perl styles | N/A | N/A | Sequential |
| Python | (?P<name>⋯) | (?P=name) | \g<name> | result.group('name') | Sequential |
| Oniguruma | .NET styles | \k<name>, \k'name' | N/A | N/A | Unnamed groups default to noncapturing when mixed with named groups |
| Java 7 beta | (?<name>⋯) | \k<name> | $<name> | matcher.group('name') | Sequential |
| JGsoft | .NET and Python styles | .NET and Python styles | .NET and Python styles | N/A | .NET and Python styles, depending on capture syntax |
| JRegex | ({name}⋯) | {\name} | ${name} | matcher.group('name') | Unknown |
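
The "Backref numbering" column is easiest to see with a pattern that mixes named and unnamed groups. The sketch below is an illustration only: `listGroups`, `numberSequentially`, and `numberUnnamedFirst` are hypothetical helpers, and the scanner assumes the (?<name>⋯) capture syntax and does not attempt to cover every regex construct.

```javascript
// Hypothetical helper: lists the capturing groups of a pattern in source order.
// Named groups get their name; unnamed capturing groups get name: null.
function listGroups(pattern) {
	var groups = [];
	var inClass = false; // true while inside a [...] character class
	for (var i = 0; i < pattern.length; i++) {
		var ch = pattern.charAt(i);
		if (ch === "\\") { i++; continue; } // skip the escaped character
		if (inClass) { if (ch === "]") inClass = false; continue; }
		if (ch === "[") { inClass = true; continue; }
		if (ch !== "(") continue;
		var named = /^\(\?<([A-Za-z0-9_$]+)>/.exec(pattern.slice(i));
		if (named) {
			groups.push({ name: named[1] });
		} else if (pattern.charAt(i + 1) !== "?") {
			groups.push({ name: null }); // plain unnamed capturing group
		}
		// Anything else starting with (? is noncapturing or a lookaround
	}
	return groups;
}

// "Sequential": every capturing group is numbered left to right
function numberSequentially(groups) {
	return groups.map(function (g, index) {
		return { name: g.name, number: index + 1 };
	});
}

// "Unnamed first, then named": unnamed groups take the low numbers,
// and named groups are numbered after them
function numberUnnamedFirst(groups) {
	var unnamedTotal = groups.filter(function (g) { return g.name === null; }).length;
	var unnamedSeen = 0;
	var namedSeen = 0;
	return groups.map(function (g) {
		var number = g.name === null ? ++unnamedSeen : unnamedTotal + (++namedSeen);
		return { name: g.name, number: number };
	});
}

var groups = listGroups("(a)(?<x>b)(c)(?:d)");
var sequential = numberSequentially(groups);
// -> [{name: null, number: 1}, {name: "x", number: 2}, {name: null, number: 3}]
var unnamedFirst = numberUnnamedFirst(groups);
// -> [{name: null, number: 1}, {name: "x", number: 3}, {name: null, number: 2}]
```

Under sequential numbering the named group x is backreference 2, while under the unnamed-first scheme it becomes backreference 3, so numbered backreferences in a mixed pattern are not portable between the two schemes.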

Example code

var repeatedWords = XRegExp("\\b (?<word>[a-z]+) \\s+ \\k<word> \\b", "gix");
var input = "The the test data.";

// Check if input contains repeated words
var hasRepeatedWords = repeatedWords.test(input); // -> true

// Use the regex to remove repeated words
var output = input.replace(repeatedWords, "${word}"); // -> "The test data."

var url = "http://yahoo.com/path/to/file?q=1";
var parser = XRegExp("^ (?<scheme> [^:/?]+ ) ://   # aka protocol   \n\
                        (?<host>   [^/?]+  )       # domain name/IP \n\
                        (?<path>   [^?]*   ) \\??  # optional path  \n\
                        (?<query>  .*      )       # optional query   ", "x");

var parts = parser.exec(url);
/* ->
parts: ["http://yahoo.com/path/to/file?q=1", "http", "yahoo.com", "/path/to/file", "q=1"]
parts.scheme: "http"
parts.host: "yahoo.com"
parts.path: "/path/to/file"
parts.query: "q=1"
*/

// Named backreferences are available in replacement functions as properties of the first argument
url = url.replace(parser, function (match) {
	return match.replace(match.host, "microsoft.com");
});
// -> "http://microsoft.com/path/to/file?q=1"

Annotations