rex

🦖 Basic regex engine written in Go.

Supported syntax:

| Feature | Syntax | Description | Example |
| --- | --- | --- | --- |
| Dot | `.` (dot) | Matches any single character except line break characters. | `.` matches `x` or any other character. |
| Alternation | `\|` (pipe) | Matches either the part on the left side or the part on the right side. Can be strung together into a series of alternatives. | `(abc)\|(def)\|(xyz)` matches `abc`, `def` or `xyz`. |
| Question mark | `?` | Matches the previous item zero or one time, prefer one. | `a?` matches `a` and the empty string but not `aa`. |
| Plus | `+` | Matches the previous item one or more times, prefer more. | `a+` matches `a` and `aa` but not `b`. |
| Star | `*` | Matches the previous item zero or more times, prefer more. | `a*` matches `a`, `aa` and the empty string. |
| Repetition | `{n,m}` | Matches the previous item at least n and at most m times, prefer more. | `a{1,2}` matches `a` and `aa` but not `aaa`. |
| Repetition | `{n,}` | Matches the previous item n or more times, prefer more. | `a{2,}` matches `aa` and `aaa` but not `a`. |
| Repetition | `{n}` | Matches the previous item exactly n times. | `a{2}` matches only `aa`. |
| Character class | `[]` | All characters except some special characters are literals that add themselves to the character class. | `[abc]` matches `a`, `b` or `c`. |
| Negated character class | `^` (caret) immediately after the opening `[` | Negates the character class, causing it to match a single character not listed in the character class. | `[^abc]` matches any character except `a`, `b` and `c`. |
| Character range | `-` (hyphen) between two tokens | Adds a range of characters to the character class. | `[a-zA-Z0-9]` matches any ASCII letter or digit. |
| Shorthand character class | `\s` | Matches a single whitespace character. Adds all whitespace characters to the character class if used inside a character class. | `[\s]` and `\s` both match any single whitespace character. |
| Negated shorthand character class | `\S` | Matches any single character that is not whitespace. Adds all non-whitespace characters to the character class if used inside a character class. | `[\S]` and `\S` both match any single character that is not whitespace. |
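
The quantifier rows above can be checked with a short program. This is only a sketch: it uses Go's standard `regexp` package as a stand-in, since rex's own API is not shown here, and the helpers `fullMatch` and `firstMatch` are hypothetical names introduced for illustration. It assumes the table's examples describe whole-string matches.

```go
package main

import (
	"fmt"
	"regexp"
)

// fullMatch reports whether pattern matches the entire input.
// Stand-in built on the standard library, not rex's API.
func fullMatch(pattern, input string) bool {
	re := regexp.MustCompile(`^(?:` + pattern + `)$`)
	return re.MatchString(input)
}

// firstMatch returns the text of the leftmost match of pattern in input.
func firstMatch(pattern, input string) string {
	return regexp.MustCompile(pattern).FindString(input)
}

func main() {
	// Whole-string checks mirroring the examples in the table.
	fmt.Println(fullMatch(`a?`, "a"), fullMatch(`a?`, "aa"))           // true false
	fmt.Println(fullMatch(`a{1,2}`, "aa"), fullMatch(`a{1,2}`, "aaa")) // true false
	fmt.Println(fullMatch(`a{2,}`, "aaa"), fullMatch(`a{2,}`, "a"))    // true false
	fmt.Println(fullMatch(`(abc)|(def)|(xyz)`, "def"))                 // true
	fmt.Println(fullMatch(`[^abc]`, "d"), fullMatch(`[^abc]`, "a"))    // true false

	// "Prefer more" and "prefer one": the quantifier consumes as much as it may.
	fmt.Printf("%q\n", firstMatch(`a+`, "aaa"))     // "aaa"
	fmt.Printf("%q\n", firstMatch(`a{1,2}`, "aaa")) // "aa"
	fmt.Printf("%q\n", firstMatch(`a?`, "aa"))      // "a"
}
```

The second group shows what "prefer more" means in practice: given the choice, a greedy quantifier takes the longest repetition its bounds allow.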

Shorthand character classes

| Shorthand | Description | ASCII equivalent |
| --- | --- | --- |
| `\d` | Digit | `[0-9]` |
| `\D` | Non-digit | `[^0-9]` |
| `\w` | Word character | `[0-9A-Za-z_]` |
| `\W` | Non-word character | `[^0-9A-Za-z_]` |
| `\s` | Whitespace character | `[\t\n\v\f\r ]` |
| `\S` | Non-whitespace character | `[^\t\n\v\f\r ]` |
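
The ASCII equivalents above translate directly into membership tests. The following is a minimal sketch of that mapping, not rex's actual implementation; `matchesShorthand` is a hypothetical helper named here for illustration only.

```go
package main

import "fmt"

// matchesShorthand reports whether r belongs to the shorthand class \<name>,
// using the ASCII equivalents listed in the table.
// Hypothetical helper for illustration; not part of rex's API.
func matchesShorthand(name byte, r rune) bool {
	switch name {
	case 'd':
		return r >= '0' && r <= '9'
	case 'w':
		return r >= '0' && r <= '9' ||
			r >= 'A' && r <= 'Z' ||
			r >= 'a' && r <= 'z' ||
			r == '_'
	case 's':
		return r == '\t' || r == '\n' || r == '\v' ||
			r == '\f' || r == '\r' || r == ' '
	case 'D', 'W', 'S':
		// An uppercase shorthand negates its lowercase counterpart.
		return !matchesShorthand(name+('a'-'A'), r)
	}
	return false // unknown shorthand name
}

func main() {
	fmt.Println(matchesShorthand('d', '7'))  // true
	fmt.Println(matchesShorthand('w', '_'))  // true
	fmt.Println(matchesShorthand('s', '\v')) // true
	fmt.Println(matchesShorthand('S', 'x'))  // true
	fmt.Println(matchesShorthand('D', '7'))  // false
}
```

Treating the uppercase forms as negations keeps the three positive definitions as the single source of truth.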

Note that [\D\S] is not the same as [^\d\s]. The class [^\d\s] matches any character that is neither a digit nor whitespace, whereas [\D\S] matches any character that is not a digit or is not whitespace. The only character [\D\S] could fail to match is one that is both a digit and whitespace at the same time. Since no digit is whitespace and no whitespace character is a digit, [\D\S] matches every character: digit, whitespace, or otherwise.
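
The difference between the two classes is easy to observe. The sketch below uses Go's standard `regexp` package, which accepts the same two class expressions, as a stand-in for rex; the variable names `union` and `negated` are illustrative only.

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	// Union of "not a digit" and "not whitespace": covers every character.
	union = regexp.MustCompile(`^[\D\S]$`)
	// Negation of "digit or whitespace": excludes both groups.
	negated = regexp.MustCompile(`^[^\d\s]$`)
)

func main() {
	for _, s := range []string{"5", " ", "x"} {
		fmt.Printf("%q union=%v negated=%v\n",
			s, union.MatchString(s), negated.MatchString(s))
	}
	// "5" union=true negated=false
	// " " union=true negated=false
	// "x" union=true negated=true
}
```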

Install

go get github.com/add1609/rex
