A LINQ Style Cumulative Moving Average Operator
When working with series of numeric data that include large fluctuations, it can be difficult to spot trends. One way to make trends in such a series easier to see is to apply a cumulative moving average. This article describes a LINQ style extension method to calculate such an average.
Cumulative Moving Average
In a previous article we saw an extension method that behaved like a LINQ operator and calculated a simple moving average. This is one calculation that allows you to see trends that are difficult to spot in raw data series. There are other moving averages that perform similar functions. They can be used alone, or compared with each other, to try to identify trends.
In this article we'll look at the cumulative moving average. Unlike with the simple moving average, each output value is calculated using all of the data points already considered. Each result contains the mean of every item in the raw data so far.
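The definition above can be written as a formula. The symbols are assumptions made for illustration: x_i is the i-th value in the raw series and CMA_n is the n-th output value. The three sample values in the second line are hypothetical and are not taken from the customer data.

```latex
\[
\mathrm{CMA}_n = \frac{x_1 + x_2 + \cdots + x_n}{n}
\]
% Hypothetical series 10, 20, 60:
\[
\mathrm{CMA}_1 = \frac{10}{1} = 10, \qquad
\mathrm{CMA}_2 = \frac{10 + 20}{2} = 15, \qquad
\mathrm{CMA}_3 = \frac{10 + 20 + 60}{3} = 30
\]
```

Note how the jump to 60 moves the third result to 30 rather than 60, which is the smoothing effect described above.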
For example, consider the following table, which contains the number of customers that entered a location in each month of the year.
If you were to represent this information in a line graph, you would see that there are two unusual events. In April there were many more customers than normal, whilst in October, customer numbers fell.
By applying a cumulative moving average, the line graph is smoothed to show a trend line, rather than the raw data. The trend is shown below:
Implementing a Cumulative Moving Average
Over the rest of the article we'll create a method that calculates the cumulative moving average for a sequence of double-precision floating-point numbers. You could easily amend the code to process other data types.
To begin, create a new console application. Once the project is ready, add a new class named "CumulativeMovingAverageExtensions" to hold the extension method. Amend the class declaration to make it public and static, as follows:
public static class CumulativeMovingAverageExtensions
{
}
We can now add the extension method that calculates the average. This requires a single parameter to receive the input sequence. We need to check that this is not null before calling a separate iterator method. Splitting the work in this way means that a null sequence is detected immediately, when the method is called, whilst the averages are still calculated using deferred execution.
public static IEnumerable<double> CumulativeMovingAverage(this IEnumerable<double> source)
{
    if (source == null) throw new ArgumentNullException("source");

    return CumulativeMovingAverageImpl(source);
}
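The difference that the separate iterator method makes can be seen in a small, self-contained sketch. The class and method names below (NullCheckTimingDemo, DeferredCheck, EagerCheck) are hypothetical and exist only for this illustration. The first method performs the null check inside the iterator, so nothing is validated until the sequence is enumerated. The second follows the pattern described above.

```csharp
using System;
using System.Collections.Generic;

public static class NullCheckTimingDemo
{
    // Hypothetical single-method version: the check is part of the iterator body,
    // so it does not run until the first item is requested.
    public static IEnumerable<double> DeferredCheck(IEnumerable<double> source)
    {
        if (source == null) throw new ArgumentNullException("source");
        foreach (double d in source) yield return d;
    }

    // Split version: validation runs at call time, iteration stays deferred.
    public static IEnumerable<double> EagerCheck(IEnumerable<double> source)
    {
        if (source == null) throw new ArgumentNullException("source");
        return EagerCheckImpl(source);
    }

    private static IEnumerable<double> EagerCheckImpl(IEnumerable<double> source)
    {
        foreach (double d in source) yield return d;
    }

    public static void Main()
    {
        // No exception here: the iterator body has not started running yet.
        IEnumerable<double> deferred = DeferredCheck(null);
        Console.WriteLine("DeferredCheck returned without throwing");

        try
        {
            EagerCheck(null);
        }
        catch (ArgumentNullException)
        {
            Console.WriteLine("EagerCheck threw immediately");
        }
    }
}
```

With the single-method version, the exception would only surface later, at whichever point the caller first enumerates the result, which can be far from the call that passed the null reference.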
To calculate the average for each datum, the value and all of its predecessors need to be summed, then divided by the number of items processed so far. Rather than remembering every item from the source, it is simpler and more efficient to keep track of the total and the count of items. During each iteration, the count can be incremented and the total increased by the new value. The result for the iteration is the total divided by the count.
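The running total and count described above can be expressed as a short recurrence. The symbols are assumptions made for illustration: T_n is the total after n items, x_n is the n-th input value and CMA_n is the n-th result.

```latex
\[
T_0 = 0, \qquad T_n = T_{n-1} + x_n, \qquad \mathrm{CMA}_n = \frac{T_n}{n} \quad (n \ge 1)
\]
```

Each step needs only the previous total and the current count, so the amount of state kept does not grow with the length of the sequence.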
Add the implementation method, as follows:
private static IEnumerable<double> CumulativeMovingAverageImpl(IEnumerable<double> source)
{
    double total = 0; // Sum of all values seen so far
    int count = 0;    // Number of values seen so far
    foreach (double d in source)
    {
        count++;
        total += d;
        yield return total / count;
    }
}
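As a quick check, the operator could be exercised from a small console program. The following is a minimal, self-contained sketch rather than a definitive test. It repeats the extension class so that it compiles on its own. The Program class and the sample values are hypothetical, with the values chosen so that every average is a whole number.

```csharp
using System;
using System.Collections.Generic;

public static class CumulativeMovingAverageExtensions
{
    public static IEnumerable<double> CumulativeMovingAverage(this IEnumerable<double> source)
    {
        if (source == null) throw new ArgumentNullException("source");
        return CumulativeMovingAverageImpl(source);
    }

    private static IEnumerable<double> CumulativeMovingAverageImpl(IEnumerable<double> source)
    {
        double total = 0;
        int count = 0;
        foreach (double d in source)
        {
            count++;
            total += d;
            yield return total / count;
        }
    }
}

public static class Program
{
    public static void Main()
    {
        // Hypothetical sample values, not taken from the article's customer data.
        double[] samples = { 2, 4, 6, 8 };

        // Running means: 2/1, 6/2, 12/3, 20/4.
        Console.WriteLine(string.Join(", ", samples.CumulativeMovingAverage()));
        // prints "2, 3, 4, 5"
    }
}
```

Because the method returns a lazily evaluated sequence, it can be chained with standard LINQ operators such as Take or Select in the usual way.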
9 August 2015