Regular expressions in C#

Characters

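| Identifier | Definition |
| --- | --- |
| `\a` | Alert (bell), x07. |
| `\b` | Backspace, x08. In .NET this meaning applies only inside a character class `[...]`; elsewhere `\b` is a word boundary. |
| `\e` | Escape (ESC), x1B. |
| `\n` | Newline, x0A. |
| `\r` | Carriage return, x0D. |
| `\f` | Form feed, x0C. |
| `\t` | Tab, x09. |
| `\v` | Vertical tab, x0B. |
| `\octal` | Octal character code of two or three digits, for example `\040` for a space. |
| `\xhex` | Two-digit hexadecimal character code. |
| `\uhex` | Four-digit hexadecimal character code. |
| `\cchar` | Named control character, for example `\cI` for Ctrl-I (tab). |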
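As a rough illustration of the escapes above, the following sketch checks a few of them against a small test string. The variable names and the sample input are made up for this example; verbatim strings (`@"..."`) are used so that the backslashes reach the regex engine rather than being interpreted by the C# compiler.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical sample input: a space, a tab, CR+LF, 'A', 'é' and ESC.
string input = "col 1\tcol2\r\nA\u00E9\u001B";

bool hasTab   = Regex.IsMatch(input, @"\t");      // tab, x09
bool hasCrLf  = Regex.IsMatch(input, @"\r\n");    // carriage return followed by newline
bool hasEsc   = Regex.IsMatch(input, @"\e");      // ESC, x1B
bool octalSp  = Regex.IsMatch(input, @"\040");    // octal 40 = decimal 32 = space
bool hexA     = Regex.IsMatch(input, @"\x41");    // two-digit hex code for 'A'
bool unicodeE = Regex.IsMatch(input, @"\u00E9");  // four-digit hex code for 'é'
bool ctrlI    = Regex.IsMatch(input, @"\cI");     // Ctrl-I is the tab character

Console.WriteLine(hasTab && hasCrLf && hasEsc && octalSp && hexA && unicodeE && ctrlI); // True
```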

Classes

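| Identifier | Definition |
| --- | --- |
| `[...]` | Single character from the listed set or range. |
| `[^...]` | Single character not from the listed set or range. |
| `.` | Any single character except a newline (`\n`), unless Singleline mode (`s`) is on. |
| `\w` | Word character. Equivalent to `[a-zA-Z_0-9]` in ECMAScript mode; otherwise Unicode letters, digits and connector punctuation also match. |
| `\W` | Non-word character, the negation of `\w`. |
| `\d` | Digit. Equivalent to `[0-9]` in ECMAScript mode; otherwise any Unicode decimal digit. |
| `\D` | Non-digit, the negation of `\d`. |
| `\s` | Whitespace character such as `[ \f\n\r\t\v]`. |
| `\S` | Non-whitespace character. |
| `\p{prop}` | Character that is contained in the specified Unicode block or general category. |
| `\P{prop}` | Character that is not contained in the specified Unicode block or general category. |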
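The difference between the default Unicode-aware classes and their ECMAScript equivalents is easy to miss, so here is a small sketch. The sample string is invented for the example; it contains the Greek letter π (U+03C0) and the symbol ≈ (U+2248), written as escapes to avoid source-encoding issues.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical sample: "id_7 = 42;<tab>π≈3"
string sample = "id_7 = 42;\t\u03C0\u22483";

// By default \w is Unicode-aware, so the Greek letter counts as a word character.
int wordsUnicode = Regex.Matches(sample, @"\w+").Count;                           // id_7, 42, π, 3
// In ECMAScript mode \w is limited to [a-zA-Z_0-9].
int wordsEcma    = Regex.Matches(sample, @"\w+", RegexOptions.ECMAScript).Count;  // id_7, 42, 3

int digitRuns    = Regex.Matches(sample, @"\d+").Count;       // 7, 42, 3
int punctuation  = Regex.Matches(sample, @"[^\w\s]").Count;   // =, ;, ≈

// \p{IsGreek} selects characters from the Greek Unicode block.
string greek = Regex.Match(sample, @"\p{IsGreek}").Value;

Console.WriteLine($"{wordsUnicode} {wordsEcma} {digitRuns} {punctuation}"); // 4 3 3 3
```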

Anchors and zero-width tests

| Identifier | Definition |
| --- | --- |
| `^` | Start of the string, or also after any newline in Multiline mode. |
| `\A` | Beginning of the string, in all match modes. |
| `$` | End of the string or before a final newline; in Multiline mode, also before any newline. |
| `\Z` | End of the string or before a final newline, in all match modes. |
| `\z` | End of the string, in all match modes. |
| `\b` | Word boundary: a position between a `\w` character and a `\W` character. |
| `\B` | Not a word boundary. |
| `\G` | Position where the previous match ended. |
| `(?=...)` | Positive lookahead. |
| `(?!...)` | Negative lookahead. |
| `(?<=...)` | Positive lookbehind. |
| `(?<!...)` | Negative lookbehind. |
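A short sketch of how these anchors and lookarounds behave in practice. All inputs and variable names are invented for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

string lines = "alpha\nbeta\n";

// ^ matches only at the start of the string unless Multiline is set.
int startsDefault   = Regex.Matches(lines, @"^\w+").Count;                          // alpha
int startsMultiline = Regex.Matches(lines, @"^\w+", RegexOptions.Multiline).Count;  // alpha, beta

// \Z tolerates one final newline, \z does not.
bool bigZ   = Regex.IsMatch(lines, @"beta\Z");
bool smallZ = Regex.IsMatch(lines, @"beta\z");

// \b keeps "cat" from matching inside "concat".
int wholeWords = Regex.Matches("cat concat cat", @"\bcat\b").Count;

// \G forces each match to start where the previous one ended.
int leadingDigits = Regex.Matches("123abc456", @"\G\d").Count;

// Lookarounds test the surrounding text without consuming it.
string euros   = Regex.Match("100USD, 200EUR", @"\d+(?=EUR)").Value;   // digits followed by EUR
string dollars = Regex.Match("cost $45 and 30", @"(?<=\$)\d+").Value;  // digits preceded by $

Console.WriteLine($"{startsDefault} {startsMultiline} {bigZ} {smallZ}");      // 1 2 True False
Console.WriteLine($"{wholeWords} {leadingDigits} {euros} {dollars}");         // 2 3 200 45
```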

Mode modifiers

| Mode | Identifier | Definition |
| --- | --- | --- |
| Singleline | `s` | Dot (`.`) matches any character, including a newline. |
| Multiline | `m` | `^` and `$` match next to embedded newlines. |
| IgnorePatternWhitespace | `x` | Ignore unescaped whitespace in the pattern and allow embedded comments starting with `#`. |
| IgnoreCase | `i` | Case-insensitive match based on characters in the current culture. |
| CultureInvariant | | Culture-insensitive match; affects how IgnoreCase compares characters. There is no inline identifier for this option. |
| ExplicitCapture | `n` | Allow named capture groups, but treat unnamed parentheses as non-capturing groups. |
| Compiled | | Compile the regular expression. |
| RightToLeft | | Search from right to left, starting to the left of the start position. |
| ECMAScript | | Enables ECMAScript-compliant behavior. Can be combined only with IgnoreCase, Multiline and Compiled. |
| `(?imnsx-imnsx)` | | Turn match flags on or off for the rest of the pattern. |
| `(?imnsx-imnsx:...)` | | Turn match flags on or off for the enclosed subexpression only. |
| `(?#...)` | | Treat the substring as a comment. |
| `#...` | | Treat the rest of the line as a comment in `x` (IgnorePatternWhitespace) mode. |
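The modes can be set either through `RegexOptions` or inline in the pattern. The sketch below shows both styles side by side; the inputs and variable names are made up for this example.

```csharp
using System;
using System.Text.RegularExpressions;

// Singleline: lets the dot match a newline.
bool dotDefault    = Regex.IsMatch("a\nb", @"a.b");
bool dotSingleline = Regex.IsMatch("a\nb", @"a.b", RegexOptions.Singleline);

// Options can be combined with the | operator.
int lineMatches = Regex.Matches("One\ntwo\nONE", @"^one$",
    RegexOptions.IgnoreCase | RegexOptions.Multiline).Count;

// Inline flags: (?i) applies to the rest of the pattern, (?i:...) only to the group.
bool inlineAll    = Regex.IsMatch("HELLO WORLD", @"(?i)hello world");
bool inlineScoped = Regex.IsMatch("HELLO WORLD", @"(?i:hello) world");

// IgnorePatternWhitespace: whitespace is ignored and # starts a comment.
var verbose = new Regex(@"
    (\d{4})   # year
    -         # separator
    (\d{2})   # month
    ", RegexOptions.IgnorePatternWhitespace);
string year = verbose.Match("2024-05").Groups[1].Value;

// ExplicitCapture: only the named group captures; group 0 is the whole match.
Match m = Regex.Match("2024-05", @"(\d{4})-(?<month>\d{2})", RegexOptions.ExplicitCapture);
int groupCount = m.Groups.Count;
string month = m.Groups["month"].Value;

Console.WriteLine($"{dotDefault} {dotSingleline} {lineMatches}");   // False True 2
Console.WriteLine($"{inlineAll} {inlineScoped}");                   // True False
Console.WriteLine($"{year} {groupCount} {month}");                  // 2024 2 05
```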

Groups and repetitions

| Identifier | Definition |
| --- | --- |
| `(...)` | Grouping with capture. Submatches fill `\1`, `\2`, ... and `$1`, `$2`, ... |
| `\n` | In a regular expression, match what was matched by the nth earlier submatch. |
| `$n` | In a replacement string, contains the nth earlier submatch. |
| `(?<name>...)` | Captures the matched substring into the group called name. |
| `(?:...)` | Grouping-only parentheses, no capturing. |
| `(?>...)` | Disallow backtracking into the subpattern (atomic group). |
| `...\|...` | Alternation; match one or the other. |
| `*` | Repeated 0 or more times. |
| `+` | Repeated 1 or more times. |
| `?` | Repeated 0 or 1 times. |
| `{n}` | Repeated exactly n times. |
| `{n,}` | Repeated at least n times. |
| `{x,y}` | Repeated at least x times, but no more than y times. |
| `*?` | Repeated 0 or more times, but as few times as possible. |
| `+?` | Repeated 1 or more times, but as few times as possible. |
| `??` | Repeated 0 or 1 times, but as few times as possible. |
| `{n,}?` | Repeated at least n times, but as few times as possible. |
| `{x,y}?` | Repeated at least x times, no more than y times, but as few times as possible. |
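To make the grouping and repetition constructs more concrete, here is a minimal sketch covering backreferences, named groups, greedy versus lazy quantifiers, and atomic groups. The sample strings and variable names are assumptions chosen for the example.

```csharp
using System;
using System.Text.RegularExpressions;

// Backreference: \1 must repeat whatever group 1 captured (a doubled word here).
bool doubledWord = Regex.IsMatch("this is is a test", @"\b(\w+)\s+\1\b");

// Named groups are read back by name from the Match object.
Match date = Regex.Match("2024-05-17", @"(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})");
string day = date.Groups["day"].Value;

// Greedy .* runs to the last </b>; lazy .*? stops at the first one.
string html   = "<b>one</b><b>two</b>";
string greedy = Regex.Match(html, @"<b>.*</b>").Value;
string lazy   = Regex.Match(html, @"<b>.*?</b>").Value;

// A normal group can give back characters; an atomic group (?>...) cannot.
bool backtracks = Regex.IsMatch("aaab", @"^(?:a+)ab");
bool atomic     = Regex.IsMatch("aaab", @"^(?>a+)ab");

// Bounded repetition: five digits exceed the {2,4} limit.
bool bounded = Regex.IsMatch("12345", @"^\d{2,4}$");

Console.WriteLine($"{doubledWord} {day}");               // True 17
Console.WriteLine(greedy);                               // <b>one</b><b>two</b>
Console.WriteLine(lazy);                                 // <b>one</b>
Console.WriteLine($"{backtracks} {atomic} {bounded}");   // True False False
```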

Replacements

| Identifier | Definition |
| --- | --- |
| `$1`, `$2`, ... | The captured submatches. |
| `${name}` | The matched text for a named capture group. |
| `` $` `` | Text before the match. |
| `$&` | Text of the match. |
| `$'` | Text after the match. |
| `$+` | Last captured group. |
| `$_` | Original input string. |
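The substitution tokens are only interpreted in the replacement argument of `Regex.Replace`. A small sketch, with invented inputs, showing what each one expands to:

```csharp
using System;
using System.Text.RegularExpressions;

// Numbered groups: swap first and last name.
string swapped = Regex.Replace("John Smith", @"(\w+) (\w+)", "$2, $1");

// Named groups: reorder a date.
string reordered = Regex.Replace("2024-05-17",
    @"(?<y>\d{4})-(?<m>\d{2})-(?<d>\d{2})", "${d}/${m}/${y}");

// $& is the whole match.
string wrapped = Regex.Replace("a cat sat", "cat", "[$&]");

// The remaining tokens, each replacing the "b" in "abc".
string before = Regex.Replace("abc", "b", "$`");   // text before the match: "a"
string after  = Regex.Replace("abc", "b", "$'");   // text after the match: "c"
string whole  = Regex.Replace("abc", "b", "$_");   // entire input: "abc"

// $+ is the last captured group, here group 2.
string last = Regex.Replace("abc", "(a)(b)", "$+");

Console.WriteLine(swapped);                      // Smith, John
Console.WriteLine(reordered);                    // 17/05/2024
Console.WriteLine(wrapped);                      // a [cat] sat
Console.WriteLine($"{before} {after} {whole}");  // aac acc aabcc
Console.WriteLine(last);                         // bc
```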

 
