BlackWasp™


LINQ
.NET 3.5+

LINQ Set Operations

The twelfth part of the LINQ to Objects tutorial considers the set operations that are provided by Language Integrated Query (LINQ). Four standard query operators are available that return distinct values from a sequence or combine data from two sets.

Set Operations

The Language Integrated Query libraries provide standard query operators for four set operations. Each produces an enumerable sequence of values, based upon filtering the information found in one collection or combining the data held in two separate collections into a single result set. In this article we will examine the four operators and show examples of their use.

Set Standard Query Operators

To demonstrate the four set operations we need some sample data. Although the four standard query operators can be used with any data type, including complex classes and structures, we will use two arrays of strings. This will allow the sample code to be kept simple and make the results easier to understand.

var set1 = new string[] { "A", "a", "B", "C", "D", "D", "E" };
var set2 = new string[] { "a", "B", "C", "D", "E", "e", "F" };

Distinct

The first demonstration uses the simplest of the set operators, the Distinct extension method. It generates a sequence of unique items from a single collection, filtering out any duplicates. As with all of the set operators, the method does not do this work immediately. Instead, it returns an IEnumerable<T> that holds everything needed to generate the desired sequence using deferred execution. The result can be combined with other standard query operators, or with queries created using query expression syntax, and is only evaluated when the contents of the sequence are first read.
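To make the deferred execution concrete, the following minimal sketch builds a Distinct query over a list and then changes the list before the results are read. The class name DeferredDistinctDemo and the sample letters are assumptions made purely for illustration. Because the query only runs when it is enumerated, the items added afterwards should still appear in the results.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical demo class; not part of the tutorial's sample data.
public static class DeferredDistinctDemo
{
    public static string[] Run()
    {
        var letters = new List<string> { "A", "B", "B" };

        // Only the query is created here; the list is not read yet.
        IEnumerable<string> distinct = letters.Distinct();

        // These items are added after the query was defined.
        letters.Add("C");
        letters.Add("C");

        // ToArray enumerates the query, so Distinct runs now
        // and sees the items added above.
        return distinct.ToArray();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(",", Run())); // A,B,C
    }
}
```

If you need a snapshot that later changes cannot affect, call ToArray or ToList as soon as the query is created.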

The sample code below finds all of the distinct values from the first example array. Note that the duplicated D has been removed but the two A's are still present because one is lower case and the other is capitalised.

var distinct1 = set1.Distinct(); // A,a,B,C,D,E

The above sample uses the default equality comparer for the data type being processed. You can use an alternative comparer by passing it to an overload of the method. The comparer must implement the IEqualityComparer<T> interface. In the sample below the comparison is case-insensitive, so the lower case letter A is treated as a duplicate and removed from the results.

var distinct1 = set1.Distinct(StringComparer.OrdinalIgnoreCase); // A,B,C,D,E
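The comparer does not have to be one of the built-in StringComparer instances. As a sketch of how the set operators might be applied to a complex class, the code below defines a hypothetical Product class and a ProductNameComparer that implements IEqualityComparer<T>. Both names, and the sample products, are assumptions for illustration only. Note that GetHashCode must agree with Equals, otherwise matching items may not be detected.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical type used only for illustration.
public class Product
{
    public string Name { get; set; }
    public decimal Price { get; set; }
}

// Treats two products as equal when their names match, ignoring case.
public class ProductNameComparer : IEqualityComparer<Product>
{
    public bool Equals(Product x, Product y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x == null || y == null) return false;
        return StringComparer.OrdinalIgnoreCase.Equals(x.Name, y.Name);
    }

    public int GetHashCode(Product obj)
    {
        // Items that are equal must produce the same hash code.
        if (obj == null || obj.Name == null) return 0;
        return StringComparer.OrdinalIgnoreCase.GetHashCode(obj.Name);
    }
}

public static class CustomComparerDemo
{
    public static string[] DistinctNames()
    {
        var products = new[]
        {
            new Product { Name = "Widget", Price = 1.50m },
            new Product { Name = "widget", Price = 1.75m },
            new Product { Name = "Gadget", Price = 4.00m }
        };

        // The second "widget" matches the first by name and is dropped.
        return products.Distinct(new ProductNameComparer())
                       .Select(p => p.Name)
                       .ToArray();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(",", DistinctNames())); // Widget,Gadget
    }
}
```

The same comparer instance could be passed to Union, Intersect or Except in exactly the same way.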

Union

The Union operator allows the contents of two collections to be combined into a single resultant list. If any duplicated items are identified, they are removed. This gives the same results as executing the Distinct method against the results of a LINQ concatenation operation.

To use the Union operator, call the method on one collection and pass the second collection as its argument:

var union = set1.Union(set2); // A,a,B,C,D,E,e,F

As with Distinct, an alternative comparer can be used. This should be provided as a second argument, as in the following sample:

var union = set1.Union(set2, StringComparer.OrdinalIgnoreCase); // A,B,C,D,E,F
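The equivalence between Union and a concatenation followed by Distinct can be checked with a short sketch. It reuses the two sample arrays from this article; the class and method names are assumptions for illustration.

```csharp
using System;
using System.Linq;

// Hypothetical demo class comparing Union with Concat + Distinct.
public static class UnionEquivalenceDemo
{
    static readonly string[] set1 = { "A", "a", "B", "C", "D", "D", "E" };
    static readonly string[] set2 = { "a", "B", "C", "D", "E", "e", "F" };

    public static string[] ViaUnion()
    {
        return set1.Union(set2).ToArray();
    }

    public static string[] ViaConcatDistinct()
    {
        // Concat appends set2 to set1; Distinct then removes duplicates.
        return set1.Concat(set2).Distinct().ToArray();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(",", ViaUnion()));          // A,a,B,C,D,E,e,F
        Console.WriteLine(string.Join(",", ViaConcatDistinct())); // A,a,B,C,D,E,e,F
    }
}
```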

Intersect

The third set operator is Intersect. This method is executed against two collections, one as the subject of the method and one provided using an argument. It returns all of the items that appear in both collections. If a value is present in only one of the two sequences then it will be omitted from the results.

var intersection = set1.Intersect(set2); // a,B,C,D,E

Again, we can provide an alternative comparer as a second argument. In this case using a case-insensitive comparer results in a capital A being added to the results. This is because it is matched to the lower case A in the second list. No lower case A is included as it is considered to be a duplicate.

var intersection = set1.Intersect(set2, StringComparer.OrdinalIgnoreCase); // A,B,C,D,E
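When a comparer treats different values as equal, the items returned are taken from the first sequence, that is, the one the method is called on. The following sketch runs the case-insensitive intersection in both directions using the article's sample arrays. The class and method names are assumptions for illustration.

```csharp
using System;
using System.Linq;

// Hypothetical demo class showing which sequence supplies the results.
public static class IntersectOrderDemo
{
    static readonly string[] set1 = { "A", "a", "B", "C", "D", "D", "E" };
    static readonly string[] set2 = { "a", "B", "C", "D", "E", "e", "F" };

    public static string[] FirstThenSecond()
    {
        // Items come from set1, so the capital A is returned.
        return set1.Intersect(set2, StringComparer.OrdinalIgnoreCase).ToArray();
    }

    public static string[] SecondThenFirst()
    {
        // Items come from set2, so the lower case a is returned.
        return set2.Intersect(set1, StringComparer.OrdinalIgnoreCase).ToArray();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(",", FirstThenSecond())); // A,B,C,D,E
        Console.WriteLine(string.Join(",", SecondThenFirst())); // a,B,C,D,E
    }
}
```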

Except

The final set operator is Except. This extension method compares the items in two collections. All items that appear in the first sequence but are not present in the second are returned in the resultant list. The following code returns all items that appear in set1 but not in set2. There is one such item.

var except = set1.Except(set2); // A

As you may expect, the Except method can also be used with an alternative comparer. In the code below, the case-insensitive comparer matches every item in the first sequence to an item in the second, so the result is an empty set.

var except = set1.Except(set2, StringComparer.OrdinalIgnoreCase); // empty
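
Unlike Union and Intersect, Except depends heavily on which collection the method is called on. The sketch below swaps the two sample arrays to show that the results differ. The class and method names are assumptions for illustration.

```csharp
using System;
using System.Linq;

// Hypothetical demo class showing that Except is not symmetric.
public static class ExceptDirectionDemo
{
    static readonly string[] set1 = { "A", "a", "B", "C", "D", "D", "E" };
    static readonly string[] set2 = { "a", "B", "C", "D", "E", "e", "F" };

    public static string[] InFirstOnly()
    {
        // Items in set1 that do not appear in set2.
        return set1.Except(set2).ToArray();
    }

    public static string[] InSecondOnly()
    {
        // Items in set2 that do not appear in set1.
        return set2.Except(set1).ToArray();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(",", InFirstOnly()));  // A
        Console.WriteLine(string.Join(",", InSecondOnly())); // e,F
    }
}
```
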
21 September 2010