Start of String Anchor
If you only wish to find matches at the start of the text, you can use the anchor, "\A". This is demonstrated in the code below. Note that the anchor is not affected by the Multiline option. Although the option is still in use and the second line also begins with "This", only the match at the start of the string is returned.
string input = @"This bit is at the start of the string.
This bit is at the start of a line.
However, this bit is in the middle of the string.
And this bit is at the end of the string.";
foreach (Match match in Regex.Matches(input, @"\A[Tt]his", RegexOptions.Multiline))
{
    Console.WriteLine("Matched '{0}' at index {1}", match.Value, match.Index);
}
/* OUTPUT
Matched 'This' at index 0
*/
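To see the difference that the anchor makes, the following sketch runs the start of line anchor and the start of string anchor against the same text with the Multiline option. The sample text and variable names are illustrative only, and the line break is written as "\n" so that the indexes do not depend on the line endings of the source file.

```csharp
using System;
using System.Text.RegularExpressions;

// Written as top-level statements (C# 9 or later); otherwise place in Main.
// Illustrative input, not taken from the sample above.
string sample = "This is line one.\nThis is line two.";

// With Multiline, "^" matches at the start of every line...
MatchCollection lineStarts = Regex.Matches(sample, @"^This", RegexOptions.Multiline);

// ...whereas "\A" matches at the start of the string only.
MatchCollection stringStarts = Regex.Matches(sample, @"\AThis", RegexOptions.Multiline);

foreach (Match match in lineStarts)
{
    Console.WriteLine("^ matched at index {0}", match.Index);
}
foreach (Match match in stringStarts)
{
    Console.WriteLine(@"\A matched at index {0}", match.Index);
}

/* OUTPUT
^ matched at index 0
^ matched at index 18
\A matched at index 0
*/
```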
End of String Anchor
To match text at the end of the source text only, you can use the anchor, "\Z". The match must be at the end of the string, or immediately before a final line feed character. As with the end of line anchor, line breaks in strings created with the .NET framework are often a carriage return followed by a line feed, so if the text might finish with a line break you should allow for an optional carriage return by adding "\r?" before the anchor. The sample input below has no trailing line break, so the anchor is used on its own:
string input = @"This bit is at the start of the string.
This bit is at the start of a line.
However, this bit is in the middle of the string.
And this bit is at the end of the string.";
// The full stop is escaped so that it matches a literal "." only.
foreach (Match match in Regex.Matches(input, @"string\.\Z", RegexOptions.Multiline))
{
    Console.WriteLine("Matched '{0}' at index {1}", match.Value, match.Index);
}
/* OUTPUT
Matched 'string.' at index 163
*/
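The effect of a trailing carriage return can be sketched as follows. The input here is an assumption made for illustration: a single line that ends with "\r\n", as you might receive from a text file created on Windows.

```csharp
using System;
using System.Text.RegularExpressions;

// Written as top-level statements (C# 9 or later); otherwise place in Main.
// Illustrative input that ends with a carriage return and a line feed.
string text = "The end of the string.\r\n";

// "\Z" can match just before the final line feed, but the carriage return
// sits between the full stop and that position, so this pattern fails.
bool withoutCr = Regex.IsMatch(text, @"string\.\Z");

// Allowing an optional carriage return lets the anchor be reached.
Match withCr = Regex.Match(text, @"string\.\r?\Z");

Console.WriteLine("Without \\r?: {0}", withoutCr);
Console.WriteLine("With \\r?: {0} at index {1}", withCr.Success, withCr.Index);

/* OUTPUT
Without \r?: False
With \r?: True at index 15
*/
```

Note that the matched value in the second case includes the carriage return, so you may wish to trim it or use a capturing group if you need the text alone.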
If you modify the anchor to use the lowercase version, "\z", the match must appear at the very end of the string. If the text is followed by a final line feed, the pattern will not be matched. The sample input has no trailing line feed, so the result is the same as before.
string input = @"This bit is at the start of the string.
This bit is at the start of a line.
However, this bit is in the middle of the string.
And this bit is at the end of the string.";
// The full stop is escaped so that it matches a literal "." only.
foreach (Match match in Regex.Matches(input, @"string\.\z", RegexOptions.Multiline))
{
    Console.WriteLine("Matched '{0}' at index {1}", match.Value, match.Index);
}
/* OUTPUT
Matched 'string.' at index 163
*/
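The two end of string anchors only behave differently when the text finishes with a line feed. A minimal sketch of that case, using an illustrative input that is not part of the sample above:

```csharp
using System;
using System.Text.RegularExpressions;

// Written as top-level statements (C# 9 or later); otherwise place in Main.
// Illustrative input that ends with a single line feed.
string text = "The end of the string.\n";

// "\Z" accepts the position immediately before a final line feed.
bool upperZ = Regex.IsMatch(text, @"string\.\Z");

// "\z" accepts only the very end of the string, after the line feed.
bool lowerZ = Regex.IsMatch(text, @"string\.\z");

Console.WriteLine(@"\Z matched: {0}", upperZ);
Console.WriteLine(@"\z matched: {0}", lowerZ);

/* OUTPUT
\Z matched: True
\z matched: False
*/
```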
Word Boundary Anchor
A commonly used item is the word boundary anchor (\b). This matches the position where a word starts or ends. Words are defined as groups of contiguous word characters, which include letters, numeric digits and underscores.
The following code matches the text, "Can" or "can" where it appears at the start of a word. This means that the can in "toucan" and the second can in "cancan" are not matched.
string input = "Can the cantankerous toucan dance the cancan?";
foreach (Match match in Regex.Matches(input, @"\b[Cc]an"))
{
    Console.WriteLine("Matched '{0}' at index {1}", match.Value, match.Index);
}
/* OUTPUT
Matched 'Can' at index 0
Matched 'can' at index 8
Matched 'can' at index 38
*/
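The same anchor can be placed after the text to match word endings, or on both sides to match whole words only. The following sketch reuses the input above; the variable names are illustrative.

```csharp
using System;
using System.Text.RegularExpressions;

// Written as top-level statements (C# 9 or later); otherwise place in Main.
string input = "Can the cantankerous toucan dance the cancan?";

// A trailing "\b" requires the match to finish at the end of a word.
MatchCollection wordEnds = Regex.Matches(input, @"[Cc]an\b");

// A boundary on each side matches "Can" or "can" as a whole word only.
MatchCollection wholeWords = Regex.Matches(input, @"\b[Cc]an\b");

foreach (Match match in wordEnds)
{
    Console.WriteLine("Word end '{0}' at index {1}", match.Value, match.Index);
}
foreach (Match match in wholeWords)
{
    Console.WriteLine("Whole word '{0}' at index {1}", match.Value, match.Index);
}

/* OUTPUT
Word end 'Can' at index 0
Word end 'can' at index 24
Word end 'can' at index 41
Whole word 'Can' at index 0
*/
```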
Non-Word Boundary Anchor
The non-word boundary anchor is the opposite of the word boundary anchor. It ensures that the matching position is either between two word characters or two non-word characters. The anchor is specified using "\B".
Try running the following code. This matches the text, "Can" or "can" where it is not found at the start of a word.
string input = "Can the cantankerous toucan dance the cancan?";
foreach (Match match in Regex.Matches(input, @"\B[Cc]an"))
{
    Console.WriteLine("Matched '{0}' at index {1}", match.Value, match.Index);
}
/* OUTPUT
Matched 'can' at index 24
Matched 'can' at index 41
*/
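The code above shows the anchor between two word characters. The case of two non-word characters can be sketched with a hyphen, assuming an illustrative input in which one hyphen joins two words and another is surrounded by spaces.

```csharp
using System;
using System.Text.RegularExpressions;

// Written as top-level statements (C# 9 or later); otherwise place in Main.
// Illustrative input: the first hyphen touches word characters,
// the second has a space on either side.
string text = "well-known - mostly";

// "\B-\B" requires that neither side of the hyphen is a word boundary,
// so only the hyphen surrounded by spaces is matched.
MatchCollection freeHyphens = Regex.Matches(text, @"\B-\B");

foreach (Match match in freeHyphens)
{
    Console.WriteLine("Matched '{0}' at index {1}", match.Value, match.Index);
}

/* OUTPUT
Matched '-' at index 11
*/
```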
19 September 2015