BlackWasp™

LINQ
.NET 3.5+

Generating Running Totals with LINQ

A common task when working with sequences of numbers, or collections of objects that include a numeric property, is to generate a running total for those items. This can be achieved using Language-Integrated Query (LINQ) and a closure.

Running Total

When you are working with a sequence that contains numeric values, either as the items themselves or as a property of each item, you may wish to calculate a series of running totals. This is a common requirement and relatively simple to achieve using a foreach loop. However, it may come as a surprise that there is no LINQ standard query operator that creates running totals, or other rolling summaries, for a sequence.
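
For comparison, the foreach approach mentioned above might be sketched as follows. The LoopExample class and the RunningTotalsWithLoop method are hypothetical names used only for illustration.

```csharp
using System.Collections.Generic;

public static class LoopExample
{
    // Hypothetical helper: builds the running totals with a plain loop.
    public static List<int> RunningTotalsWithLoop(IEnumerable<int> source)
    {
        var totals = new List<int>();
        int total = 0;

        foreach (int x in source)
        {
            total += x;         // accumulate the current item
            totals.Add(total);  // record the total so far
        }

        return totals;
    }
}
```

Calling LoopExample.RunningTotalsWithLoop(Enumerable.Range(1, 10)) should produce 1, 3, 6, 10, 15, 21, 28, 36, 45, 55.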

Although there is no built-in method to generate running totals using LINQ, it is still possible to generate a sequence of summary values in a query by adding a closure. Before the query is defined, a running total variable is created and set to zero. Within the Select operator of the query, the running total is increased by each value in the source sequence, as shown below:

var source = Enumerable.Range(1, 10);

int total = 0;
var runningTotals = source.Select(x => total += x);

/* RESULTS

1, 3, 6, 10, 15, 21, 28, 36, 45, 55

*/

Two things make the above code work: the total variable holds zero when the query executes, and the lambda expression passed to the Select operator updates that variable. As the source collection is processed, each iteration increases the total variable and yields the new total as the next value in the resultant series. The lambda expression can return the projected value because a compound assignment such as += both changes the value of the variable and evaluates to the new value. The results also rely on the sequence being processed in its original order, one item at a time.

The same results can be achieved using query expression syntax. To change the example, simply replace the assignment of the runningTotals variable with the following query:

var runningTotals =
    from x in source
    select total += x;

Limitations

There are several limitations to the above style of coding. Firstly, there is the standard implication of using a closure: if you change the value of the closed-over variable before executing the query, the results will be affected. We can show this with the sample code below. Here the query is defined while the total variable is correctly set to zero. However, the Select extension method uses deferred execution, so the running totals are not calculated until the sequence is converted to an array. By then the total variable has been incremented, so every result is one higher than expected.

var source = Enumerable.Range(1, 10);

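int total = 0;
// Deferred execution: the query is only defined here, nothing is calculated yet.
var runningTotals = source.Select(x => total += x);

total++;                                // the captured variable changes before the query runs
var results = runningTotals.ToArray();  // the query executes here, starting from a total of 1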

/* RESULTS

2, 4, 7, 11, 16, 22, 29, 37, 46, 56

*/
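
One way to sidestep this problem is to keep the running total inside an iterator method, so that no outside code can modify it. The following is a minimal sketch of that idea, not a standard LINQ operator. The RunningTotalExtensions class and the RunningTotal method are hypothetical names chosen for this example.

```csharp
using System.Collections.Generic;

public static class RunningTotalExtensions
{
    // Hypothetical extension method; it is not part of LINQ.
    public static IEnumerable<int> RunningTotal(this IEnumerable<int> source)
    {
        int total = 0;  // local to the iterator, so it starts at zero for every enumeration

        foreach (int x in source)
        {
            total += x;
            yield return total;
        }
    }
}
```

With this in place, Enumerable.Range(1, 10).RunningTotal() still uses deferred execution, but it should yield 1, 3, 6, 10, 15, 21, 28, 36, 45, 55 each time it is enumerated, because the total is no longer shared with the calling code.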

The second key limitation with the approach is that the summaries must be generated in the correct order. In most situations this is not a problem. However, if the query is executed against a source sequence that supports parallel operations, such as one modified using .NET 4.0's AsParallel method, the results can be unpredictable.

The following code uses a parallelised source sequence when generating running totals. When executed on a computer with a single processor core, it is likely that the results will be correct. On a machine with multiple cores, intermittent bugs appear for two reasons: items may not be processed in the expected order, and there is a race condition when updating the running total variable. The comment in the sample shows one possible incorrect result.

var source = Enumerable.Range(1, 10);

int total = 0;
var runningTotals = source.AsParallel().Select(x => total += x).ToArray();

foreach (var t in runningTotals)
{
    Console.Write("{0} ", t);
}

/* OUTPUT

1 6 11 17 24 3 28 36 45 55

*/
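
If some parallel processing is still wanted, one option is to restrict it to work that holds no shared state and then accumulate the totals on a single thread. The sketch below assumes, purely for illustration, that each item is squared before being totalled. The ParallelExample class and the RunningTotalsOfSquares method are hypothetical names.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class ParallelExample
{
    // Hypothetical helper: squares each item in parallel, then totals sequentially.
    public static List<int> RunningTotalsOfSquares(IEnumerable<int> source)
    {
        // Stateless work is safe to parallelise; AsOrdered preserves the source order.
        var squares = source.AsParallel().AsOrdered().Select(x => x * x);

        var totals = new List<int>();
        int total = 0;

        // The foreach loop consumes the ordered results on the calling thread,
        // so the running total is only ever updated by one thread.
        foreach (int square in squares)
        {
            total += square;
            totals.Add(total);
        }

        return totals;
    }
}
```

For Enumerable.Range(1, 10) this should return 1, 5, 14, 30, 55, 91, 140, 204, 285, 385, regardless of the number of processor cores.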
11 September 2012