BlackWaspTM


Regular Expressions
.NET 1.1+

Regular Expressions in .NET

This is the first article in a series that describes the use of regular expressions with the .NET framework and C#. Regular expressions define the rules for pattern matching in strings. They are useful for finding and replacing text, and for validating the format of strings.

What are Regular Expressions?

Two common problems you may need to solve when processing strings are validating that an item matches a given pattern, and finding substrings that match a pattern and extracting them from a larger string or text document. For example, you might wish to validate that an email address, URL, IP address or telephone number is correctly formatted. You might also wish to extract such information from a large piece of text.

Regular expressions provide a standardised language for defining patterns. A pattern is written as a string made from alphanumeric characters and symbols. You then use a regular expression engine to compare the pattern with another piece of text. Depending upon the engine that you use, you can determine whether there is a match, extract one or more matches from the source text or even replace the matches. When replacing, the new text can include elements from the original matches.
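The three operations described above can be sketched with the static methods of the .NET Regex class. This is only an illustrative sketch: the date pattern, the class name and the method names are assumptions made for the example, not part of the original tutorial, and the pattern does not check that the day and month are valid.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RegexOperationsSketch
{
    // Hypothetical pattern for illustration: dates written as dd/mm/yyyy.
    // The parentheses capture the day, month and year as groups 1, 2 and 3.
    private const string DatePattern = @"(\d{2})/(\d{2})/(\d{4})";

    // Validation: does the text contain at least one date-like substring?
    public static bool ContainsDate(string text)
    {
        return Regex.IsMatch(text, DatePattern);
    }

    // Extraction: return every date-like substring found in the text.
    public static string[] ExtractDates(string text)
    {
        MatchCollection matches = Regex.Matches(text, DatePattern);
        string[] results = new string[matches.Count];
        for (int i = 0; i < matches.Count; i++)
        {
            results[i] = matches[i].Value;
        }
        return results;
    }

    // Replacement: $1, $2 and $3 reuse the captured groups of each match,
    // so the new text is built from elements of the original match.
    public static string ToYearFirst(string text)
    {
        return Regex.Replace(text, DatePattern, "$3-$2-$1");
    }

    public static void Main()
    {
        string text = "Ordered 29/08/2015, shipped 02/09/2015.";

        Console.WriteLine(ContainsDate(text));   // True

        foreach (string date in ExtractDates(text))
        {
            Console.WriteLine(date);             // 29/08/2015 then 02/09/2015
        }

        Console.WriteLine(ToYearFirst(text));    // Ordered 2015-08-29, shipped 2015-09-02.
    }
}
```

The replacement string shows how the new text can reuse parts of each match: the groups are reordered rather than retyped.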

Regular expressions engines are very powerful. However, the regular expressions themselves can quickly become very complex and, therefore, difficult to read. If you have never used regular expressions, finding one in another programmer's code can be confusing. Due to the complexity, they can also be slow to process. They should, therefore, be used only when simpler and faster options are unavailable.
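As a small illustration of a simpler option, the sketch below checks whether a line starts with a given prefix in two ways. The class and method names are hypothetical. Both methods return the same result for the inputs shown, but the plain string method states its intent directly and involves no pattern parsing.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SimplerAlternativesSketch
{
    // Regular expression version. Regex.Escape ensures that characters
    // such as '.' in the prefix are treated literally, and '^' anchors
    // the match to the start of the string.
    public static bool HasPrefixWithRegex(string line, string prefix)
    {
        return Regex.IsMatch(line, "^" + Regex.Escape(prefix));
    }

    // Plain string version: an ordinal, case-sensitive comparison.
    public static bool HasPrefixWithString(string line, string prefix)
    {
        return line.StartsWith(prefix, StringComparison.Ordinal);
    }

    public static void Main()
    {
        string line = "SQL Server:          192.168.0.20";

        Console.WriteLine(HasPrefixWithRegex(line, "SQL Server:"));    // True
        Console.WriteLine(HasPrefixWithString(line, "SQL Server:"));   // True
        Console.WriteLine(HasPrefixWithString(line, "Intranet"));      // False
    }
}
```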

The .NET framework includes a regular expressions engine based primarily around the Regex class but supported by other types that are defined in the System.Text.RegularExpressions namespace. In this tutorial we will explore both the syntax you can use to create regular expressions and the features of the .NET engine.

Before we delve into regular expressions, let's see a simple example. In the automatically generated class file within a new console application project, add a using directive for the namespace:

using System.Text.RegularExpressions;

Add the code below within the Main method of the program. The code begins by creating a source string containing the details of some servers in a network. The call to Regex.Matches uses a regular expression to find every piece of text in the source string that resembles an IP address.

string source = @"
    Intranet Servers:    192.168.0.10/192.168.0.11/192.168.0.12
    SQL Server:          192.168.0.20
    Integration Servers: 192.168.0.30 (MSMQ)
                         192.168.0.40 (Services)";

// Each dot is escaped (\.) so that it matches a literal full stop, not any character.
var ips = Regex.Matches(source, @"\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}");

foreach (var ip in ips)
{
    Console.WriteLine(ip);
}

/* Output

192.168.0.10
192.168.0.11
192.168.0.12
192.168.0.20
192.168.0.30
192.168.0.40

*/

In the example, the regular expression is somewhat naive. It looks for text made from four numbers, each between one and three digits in length, separated by full stops (periods). This means that it would also find invalid IP addresses, such as 999.999.999.999. It is possible to create a more complex regular expression that only matches valid IP addresses using the tools and syntax that we will see in future articles in this tutorial.
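One possible shape for such a stricter expression is sketched below. It is an assumption made for illustration, not the pattern developed later in the series, and it uses syntax (alternation, non-capturing groups and lookarounds) that has not yet been introduced. Each number is limited to the range 0 to 255, and the lookarounds prevent a match from starting or ending in the middle of a longer run of digits.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class StrictIpSketch
{
    // One octet in the range 0 to 255, tried in this order:
    // 250-255, 200-249, 100-199, then 0-99 without a leading zero.
    private const string Octet =
        @"(?:25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9]?[0-9])";

    // Four octets separated by literal dots. The lookbehind and lookahead
    // reject candidates that are immediately preceded or followed by
    // another digit.
    private static readonly string Pattern =
        @"(?<![0-9])" + Octet + @"(?:\." + Octet + @"){3}(?![0-9])";

    // Hypothetical helper: returns every valid-looking address in the text.
    public static List<string> FindValidIps(string text)
    {
        List<string> results = new List<string>();
        foreach (Match match in Regex.Matches(text, Pattern))
        {
            results.Add(match.Value);
        }
        return results;
    }

    public static void Main()
    {
        string text = "Bad: 999.999.999.999 Good: 192.168.0.10";

        foreach (string ip in FindValidIps(text))
        {
            Console.WriteLine(ip);   // 192.168.0.10
        }
    }
}
```

The sketch still accepts any four in-range numbers, so it checks only the format of an address, not whether the address is meaningful on a particular network.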

29 August 2015