Regular Expressions For Regular Folk

Escapes

In regex, some characters have special meanings, which we will explore across the chapters:

  • |
  • {, }
  • (, )
  • [, ]
  • ^, $
  • +, *, ?
  • \
  • . : literal only within character classes. [1]
  • - : special only within character classes, and only in some positions.

When we wish to match these characters literally, we need to “escape” them.

This is done by prefixing the character with a \.
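When the text to match is not known in advance, the escaping can be automated. The following is a minimal sketch, assuming JavaScript; `escapeRegExp` is a hypothetical helper name, not something this book or the language defines. It leaves `-` alone, since the hyphen is only special within character classes.

```javascript
// Hypothetical helper: prefix every special character with a backslash
// so that the resulting pattern matches the input text literally.
function escapeRegExp(text) {
  // The character class lists the special characters; $& in the
  // replacement string stands for whichever character was matched.
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

const escaped = escapeRegExp("worth $5 (approx.)");
console.log(escaped); // worth \$5 \(approx\.\)

// The escaped text can be handed to the RegExp constructor.
const pattern = new RegExp(escaped);
console.log(pattern.test("not worth $5 (approx.)")); // true
```

Without the escaping step, the `$`, `(`, `)` and `.` in the input would keep their special meanings and the pattern would not match the literal text.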


/\(paren\)/g
  • 0 matches: paren
  • 0 matches: parents
  • 1 match: (paren)
  • 1 match: a (paren)
/(paren)/g
  • 1 match: paren
  • 1 match: parents
  • 1 match: (paren)
  • 1 match: a (paren)

/example\.com/g
  • 1 match: example.com
  • 1 match: a.example.com/foo
  • 0 matches: example_com
  • 0 matches: example@com
  • 0 matches: example_com/foo
/example.com/g
  • 1 match: example.com
  • 1 match: a.example.com/foo
  • 1 match: example_com
  • 1 match: example@com
  • 1 match: example_com/foo

/A\+/g
  • 1 match: A+
  • 1 match: A+B
  • 1 match: 5A+
  • 0 matches: AAA
/A+/g
  • 1 match: A+
  • 1 match: A+B
  • 1 match: 5A+
  • 1 match: AAA

/worth \$5/g
  • 1 match: worth $5
  • 1 match: worth $54
  • 1 match: not worth $5
/worth $5/g
  • 0 matches: worth $5
  • 0 matches: worth $54
  • 0 matches: not worth $5
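
The match counts above can be reproduced in code. This is a small sketch, assuming JavaScript; `countMatches` is a hypothetical helper written for illustration. It compares a few escaped patterns with their unescaped counterparts.

```javascript
// Hypothetical helper: count the matches of a global regex in a string.
// matchAll requires the g flag, which every pattern below carries.
function countMatches(regex, text) {
  return [...text.matchAll(regex)].length;
}

// Escaped parentheses are literal; unescaped ones only group.
console.log(countMatches(/\(paren\)/g, "parents")); // 0
console.log(countMatches(/(paren)/g, "parents"));   // 1

// An escaped dot is a literal full stop; a bare dot matches any character.
console.log(countMatches(/example\.com/g, "example_com")); // 0
console.log(countMatches(/example.com/g, "example_com"));  // 1

// An escaped plus is a literal +; a bare plus repeats the A.
console.log(countMatches(/A\+/g, "AAA")); // 0
console.log(countMatches(/A+/g, "AAA"));  // 1

// An escaped dollar is a literal $; a bare dollar asserts the end of
// the input, so nothing can follow it and the pattern never matches.
console.log(countMatches(/worth \$5/g, "worth $54")); // 1
console.log(countMatches(/worth $5/g, "worth $54"));  // 0
```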

Examples

JavaScript in-line comments

/\/\/.*/g
  • 1 match: console.log(); // comment
  • 1 match: console.log(); // // comment
  • 0 matches: console.log();
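
As a rough illustration, the same pattern can pull the comments out of a block of source text. This sketch assumes JavaScript, and the `source` string is made up from the test lines above. It is not a real parser: it would also pick up a `//` that sits inside a string literal.

```javascript
// Sample input assembled from the three test lines above.
const source = [
  "console.log(); // comment",
  "console.log(); // // comment",
  "console.log();",
].join("\n");

// Both slashes are escaped because, in a JavaScript regex literal,
// an unescaped / would end the pattern. The dot does not match line
// breaks, so each match stops at the end of its line.
const comments = source.match(/\/\/.*/g);

console.log(comments); // [ '// comment', '// // comment' ]
```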

Asterisk-surrounded substrings

/\*[^\*]*\*/g
  • 1 match: here be *italics*
  • 1 match: permitted**
  • 1 match: a*b*c*d
  • 2 matches: a*b*c*d*e
  • 0 matches: abcd

The first and last asterisks are literal since they are escaped: \*.

The asterisk inside the character class does not necessarily need to be escaped [1], but I’ve escaped it anyway for clarity.

The asterisk immediately following the character class indicates repetition of the character class, which we’ll explore in chapters that follow.
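
To see that escaping the asterisk inside the character class is optional, the two spellings can be compared side by side. This is a sketch, assuming JavaScript; the variable names are chosen only for illustration.

```javascript
// The same pattern, with and without the backslash inside the class.
const escapedInClass = /\*[^\*]*\*/g;
const bareInClass = /\*[^*]*\*/g;

const samples = ["here be *italics*", "permitted**", "a*b*c*d", "a*b*c*d*e"];

// For each sample, collect the matches of both spellings.
const results = samples.map((sample) => ({
  sample,
  escaped: sample.match(escapedInClass) || [],
  bare: sample.match(bareInClass) || [],
}));

for (const { sample, escaped, bare } of results) {
  // Both spellings should report identical matches.
  const same = JSON.stringify(escaped) === JSON.stringify(bare);
  console.log(sample, escaped, same);
}
// here be *italics* [ '*italics*' ] true
// permitted** [ '**' ] true
// a*b*c*d [ '*b*' ] true
// a*b*c*d*e [ '*b*', '*d*' ] true
```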


  1. Many characters that would otherwise have special meanings are treated literally by default inside character classes.