BlackWasp™

Parallel and Asynchronous
.NET 4.0+

Parallel For Loop

The second part of the Parallel Programming in .NET tutorial examines the parallel for loop. This allows the execution of a specific number of loop iterations in parallel, with data decomposition handled automatically by the Task Parallel Library.

Parallel Loops

The Task Parallel Library (TPL) includes two loop commands that are parallel versions of the for and foreach looping structures of C#. They each provide the code needed for the Parallel Loop Pattern, ensuring that the entire process is completed with all iterations executed before moving on to the statement following the loop. The individual iterations are decomposed into groups that may be divided between the available processors, increasing performance on machines with multiple cores.
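As a rough illustration of the "all iterations complete before moving on" guarantee, the sketch below runs both loop commands and checks a counter afterwards. The class name, the counter and the sample array are made up for this example; only Parallel.For, Parallel.ForEach and Interlocked.Increment come from the framework.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

class LoopCompletionDemo   // hypothetical name, for illustration only
{
    static void Main()
    {
        int completed = 0;

        // Parallel.For blocks the calling thread until every iteration has run.
        Parallel.For(0, 100, i =>
        {
            // Interlocked keeps the shared counter correct across threads.
            Interlocked.Increment(ref completed);
        });

        // All 100 iterations have finished by the time this line executes.
        Console.WriteLine("After Parallel.For: {0}", completed);      // 100

        string[] items = { "a", "b", "c" };   // sample data, assumed

        // Parallel.ForEach gives the same guarantee for a collection.
        Parallel.ForEach(items, item =>
        {
            Interlocked.Increment(ref completed);
        });

        Console.WriteLine("After Parallel.ForEach: {0}", completed);  // 103
    }
}
```

The counter is only there to make the completion guarantee visible. The order in which the individual iterations run is still not defined.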

Parallel.For

In this article we will consider the parallel for loop. This provides some of the functionality of the basic for loop, allowing you to create a loop with a fixed number of iterations. If multiple cores are available, the iterations can be decomposed into groups that are executed in parallel.

To demonstrate, create a new console application. The parallel loops are found in the System.Threading.Tasks namespace, so add the following using directive at the top of the generated code file:

using System.Threading.Tasks;

To begin, we can create a sequential loop. In the code below, the loop iterates ten times, with the loop control variable increasing from zero to nine. In each iteration the GetTotal method is called. This performs a calculation that is included to generate a long enough pause to see the performance improvement of the parallel version.

When you run the program it outputs the iteration number from the loop control variable and the result of the calculation. NB: You may wish to adjust the length of the loop in the GetTotal method to achieve a useful pause between iterations.

static void Main()
{
    for (int i = 0; i < 10; i++)
    {
        long total = GetTotal();
        Console.WriteLine("{0} - {1}", i, total);
    }
}

static long GetTotal()
{
    long total = 0;
    for (int i = 1; i < 1000000000; i++)    // Adjust this loop according
    {                                       // to your computer's speed
        total += i;
    }
    return total;
}

/* OUTPUT

0 - 499999999500000000
1 - 499999999500000000
2 - 499999999500000000
3 - 499999999500000000
4 - 499999999500000000
5 - 499999999500000000
6 - 499999999500000000
7 - 499999999500000000
8 - 499999999500000000
9 - 499999999500000000

*/

To convert the above loop into a parallel version, we can use the Parallel.For method. The syntax is different because the loop is provided by a static method rather than a C# keyword. The overload that we are interested in has three parameters. The first two specify the lower and upper bounds of the loop, with the lower bound inclusive and the upper bound exclusive. The third accepts an Action<int> delegate, usually expressed as a lambda expression, that receives the current index and contains the code to run during each iteration.

The parallel syntax for the previous loop is shown below. When you run this code on a computer with multiple cores, you should see a considerable improvement in performance. On a single-core, single-processor system the performance will be marginally slower than that of the equivalent sequential loop.

Parallel.For(0, 10, i =>
{
    long total = GetTotal();
    Console.WriteLine("{0} - {1}", i, total);
});

/* OUTPUT

5 - 499999999500000000
1 - 499999999500000000
6 - 499999999500000000
0 - 499999999500000000
2 - 499999999500000000
7 - 499999999500000000
4 - 499999999500000000
3 - 499999999500000000
8 - 499999999500000000
9 - 499999999500000000

*/
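If you would rather measure the difference than judge it by eye, one option is to time both loops with the Stopwatch class. The following is a minimal sketch: the class name is made up, the GetTotal bound is reduced so the test runs quickly, and the figures you see will depend entirely on your hardware.

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;

class TimingDemo   // hypothetical name, for illustration only
{
    static void Main()
    {
        // Time the sequential loop.
        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < 10; i++)
        {
            GetTotal();
        }
        watch.Stop();
        Console.WriteLine("Sequential: {0} ms", watch.ElapsedMilliseconds);

        // Time the parallel loop over the same range.
        watch.Restart();
        Parallel.For(0, 10, i =>
        {
            GetTotal();
        });
        watch.Stop();
        Console.WriteLine("Parallel:   {0} ms", watch.ElapsedMilliseconds);
    }

    static long GetTotal()
    {
        long total = 0;
        for (int i = 1; i < 100000000; i++)    // Adjust this loop according
        {                                      // to your computer's speed
            total += i;
        }
        return total;
    }
}
```

A single run is only a rough guide. Repeating the measurement several times gives a steadier picture, because the first parallel run includes the cost of starting thread pool threads.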

It is important to note that the output of the parallel version is ordered differently from that of its sequential counterpart. The results shown in the comments above were achieved using a dual-core processor. In this case iteration '5' finished first, and iteration '0', which would have been first in the sequential version, finished fourth. The order can also differ from one run to the next. This change to the ordering almost always happens when running in parallel and can cause problems if unanticipated.
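When the order of the output matters, one possible approach is to have each iteration store its result in its own array element and to print the array once the loop has finished. This is a sketch under that assumption and not necessarily the technique used later in the tutorial. The class name and the totals array are invented for the example, and the GetTotal bound is reduced to keep the run short.

```csharp
using System;
using System.Threading.Tasks;

class OrderedOutputDemo   // hypothetical name, for illustration only
{
    static void Main()
    {
        long[] totals = new long[10];

        // Each iteration writes only to its own slot, so no two iterations
        // ever touch the same array element.
        Parallel.For(0, totals.Length, i =>
        {
            totals[i] = GetTotal();
        });

        // The loop has completed, so the results can be printed in index order.
        for (int i = 0; i < totals.Length; i++)
        {
            Console.WriteLine("{0} - {1}", i, totals[i]);
        }
    }

    static long GetTotal()
    {
        long total = 0;
        for (int i = 1; i < 100000000; i++)
        {
            total += i;
        }
        return total;
    }
}
```

The iterations still execute in an unpredictable order. Only the reporting is deferred until afterwards, so the lines appear from 0 to 9 on every run.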

Pitfalls

Although it is simple to replace a sequential loop with a parallel version, you must take care when doing so. There are various pitfalls that can be encountered. Some cause immediately obvious bugs in your code. Some cause subtle bugs that may occur only rarely and are difficult to find. Others simply lower the performance of parallel loops.

Some of the pitfalls are described in the remainder of this article. Later in the tutorial we will see some ways in which these problems can be remedied.
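As a taste of the "subtle bug" category, the sketch below shows one commonly cited problem: several iterations updating the same variable without synchronization. The example is illustrative and was not taken from the pitfalls covered by the tutorial. The class and variable names are made up.

```csharp
using System;
using System.Threading.Tasks;

class SharedStateDemo   // hypothetical name, for illustration only
{
    static void Main()
    {
        long unsafeTotal = 0;

        // Deliberate bug: every iteration reads, adds to and writes back the
        // same variable, so updates made on different threads can be lost.
        Parallel.For(0, 1000000, i =>
        {
            unsafeTotal += i;
        });

        // The same sum computed sequentially, for comparison.
        long expected = 0;
        for (int i = 0; i < 1000000; i++)
        {
            expected += i;
        }

        Console.WriteLine("Expected: {0}", expected);
        Console.WriteLine("Parallel: {0}", unsafeTotal);
    }
}
```

On a multi-core machine the two figures may differ, and the parallel figure may change between runs. On some runs they may match, which is what makes this kind of bug hard to find.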

15 August 2011