As we all know, the chip industry has changed its approach to increasing performance: we will not see 10 GHz processors (not even 5 GHz ones), but we now have dual-core and quad-core processors, with 8-core and 16-core processors coming in the near future.

But this is a problem for software developers: while it's easy to take advantage of a 10 GHz processor (just do what you always did, and your application will go faster because it runs on a faster processor), it's more difficult to take advantage of four 2.4 GHz cores inside the same processor.

A few years ago, Volker Will said:

But how will users benefit from multiple cores? Will the apps run faster just because there a now 2 processors on a single chip? I guess not really. There are benefits for the OS that may relate to improved performance. But the app itself? Well, you can run multiple instances easier and better for one. But what about a single app? A single threaded (client) app that has been designed with a single processor and a single thread of execution in mind, will not benefit and therefore users will not benefit from multiple processors or multiple cores.

This means that adding cores will improve the overall performance of the computer, provided the OS is designed to handle them (some apps running on one core, and others on the second), but a single application that has not been designed with parallelism in mind will not benefit from a multi-core processor.

But exploiting the advantages of a multi-core chip is not equally easy for all tasks: rendering, encoding, and scientific applications are the easiest to modify to support parallelism, but what about the usual web page or the usual WinForms application that reads data from a DB and then displays it to the user? That is not as easy as parallelizing a rendering or mathematical application, but it's something you have to start thinking about. A few years ago, Jeff Atwood said:

One day, you won't be able to throw money at your hardware to make your app run faster. You'll have no choice but to pour that money into parallelizing the algorithms inside your app, which is a far more difficult proposition.

What can you do now to take advantage of multi-core chips? Use async operations and threading as much as possible, even for work you would usually write as sequential operations: for example, launch two DB queries in parallel and wait for the results of both before going on with the page processing.
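To make the "two queries in parallel" idea concrete, here is a minimal sketch. It assumes a current version of C# with async/await and Task.WhenAll, and the two "queries" are hypothetical stand-ins that only simulate latency with Task.Delay; a real page would call its own data access code instead.

```csharp
using System;
using System.Threading.Tasks;

public static class ParallelQueries
{
    // Hypothetical stand-in for a first, independent DB query.
    public static async Task<int> LoadCustomerCountAsync()
    {
        await Task.Delay(200); // simulates query latency
        return 42;
    }

    // Hypothetical stand-in for a second, independent DB query.
    public static async Task<string> LoadLatestOrderAsync()
    {
        await Task.Delay(200); // simulates query latency
        return "order-1001";
    }

    public static async Task<(int Customers, string LatestOrder)> LoadPageDataAsync()
    {
        // Start both queries before awaiting either one, so they overlap.
        Task<int> customers = LoadCustomerCountAsync();
        Task<string> latestOrder = LoadLatestOrderAsync();

        // Wait for both results before going on with the page processing.
        await Task.WhenAll(customers, latestOrder);
        return (customers.Result, latestOrder.Result);
    }

    public static async Task Main()
    {
        var data = await LoadPageDataAsync();
        Console.WriteLine($"{data.Customers} customers, latest order {data.LatestOrder}");
    }
}
```

Because both tasks are started before the first await, the total wait should be close to the slower of the two queries rather than the sum of both.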

To help with the shift to a more parallel way of programming, Microsoft Research developed a library that lets all of us exploit multi-core chips easily: the Task Parallel Library (TPL), which is part of the Parallel FX Library and is going to be released as a CTP in Fall '07.

I found out about it in an article from the October '07 edition of MSDN Magazine: Parallel Performance - Optimize Managed Code For Multi-Core Machines.

It is not another article about how to use the ThreadPool API (which is covered in another article in the same edition of MSDN Magazine); instead, it explains this new library, designed to "automatically" optimize your managed application for multi-core machines.

Let's see an example:

for (int i = 0; i < 100; i++) { 
  a[i] = a[i]*a[i]; 
}

Each iteration of the loop is independent, so theoretically you could optimize the for loop by splitting it into 2 or 4 parallel tasks, each of them looping over a different range of indexes (for example 0 to 49 and 50 to 99).
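Done by hand, that split could look something like the sketch below. It is only an illustration of the idea, using plain System.Threading threads and a fixed split into two halves; the class and method names are mine, not from the article.

```csharp
using System;
using System.Threading;

public static class ManualSplit
{
    // Squares the elements in the half-open index range [start, end).
    public static void SquareRange(int[] a, int start, int end)
    {
        for (int i = start; i < end; i++)
        {
            a[i] = a[i] * a[i];
        }
    }

    public static void SquareInTwoHalves(int[] a)
    {
        int middle = a.Length / 2;

        // Each thread works on its own range, so no element is shared.
        var first = new Thread(() => SquareRange(a, 0, middle));
        var second = new Thread(() => SquareRange(a, middle, a.Length));

        first.Start();
        second.Start();

        // Wait for both halves before using the results.
        first.Join();
        second.Join();
    }

    public static void Main()
    {
        int[] a = new int[100];
        for (int i = 0; i < a.Length; i++) a[i] = i;

        SquareInTwoHalves(a);
        Console.WriteLine(a[99]); // prints 9801
    }
}
```

This works, but the number of threads is hard-coded and the ranges are fixed, so it does not adapt to the machine it runs on.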

Using the TPL you could express the same for loop like this:

Parallel.For(0, 100, delegate(int i) { 
  a[i] = a[i]*a[i]; 
});

The library uses some advanced algorithms to split the task among the available cores, dynamically adapting to the workload and to the particular machine. For example on single-processor machines the loop will be executed sequentially, but on dual-core machines the library will use two worker threads.
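For reference, here is the same loop as a small self-contained program. It is a sketch that assumes the Parallel.For(int, int, Action<int>) overload in the System.Threading.Tasks namespace, which is where the API ended up in the released .NET Framework; the namespace in the CTP may differ. The class and method names are mine.

```csharp
using System;
using System.Threading.Tasks;

public static class ParallelSquares
{
    // Returns a new array in which every element of the input is squared.
    public static int[] Square(int[] input)
    {
        int[] a = (int[])input.Clone();

        // The upper bound is exclusive, as in the sequential for loop.
        // Every iteration touches a different element, so no locking is needed.
        Parallel.For(0, a.Length, i =>
        {
            a[i] = a[i] * a[i];
        });

        return a;
    }

    public static void Main()
    {
        int[] values = new int[100];
        for (int i = 0; i < values.Length; i++) values[i] = i;

        int[] squared = Square(values);
        Console.WriteLine(squared[10]); // prints 100
    }
}
```

Note that how many threads are used, and which iterations each one runs, is decided by the library at run time and is not something this code controls.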

The article also explains all the other features of the library, so I highly encourage everybody interested in parallel programming, and in making their applications faster on future hardware, to read it.

In the same edition of MSDN Magazine there is also an article about a flavor of LINQ, called PLINQ, that uses the TPL to run queries in parallel.

PLINQ is a query execution engine that accepts any LINQ-to-Objects or LINQ-to-XML query and automatically utilizes multiple processors or cores for execution when they are available. The change in programming model is tiny, meaning you don't need to be a concurrency guru to use it. In fact, threads and locks won't even come up unless you really want to dive under the hood to understand how it all works.
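To give an idea of how small that change is, here is a sketch of a LINQ-to-Objects query turned into a parallel one. It assumes the AsParallel() and AsOrdered() extension methods in System.Linq, as they exist in the released .NET Framework; the CTP may expose them differently. The class and method names are mine.

```csharp
using System;
using System.Linq;

public static class PlinqSketch
{
    // Returns the squares of the even numbers, in their original order.
    public static int[] EvenSquares(int[] numbers)
    {
        return numbers
            .AsParallel()   // opt in to parallel execution
            .AsOrdered()    // keep results in source order
            .Where(n => n % 2 == 0)
            .Select(n => n * n)
            .ToArray();
    }

    public static void Main()
    {
        int[] numbers = Enumerable.Range(0, 10).ToArray();
        Console.WriteLine(string.Join(", ", EvenSquares(numbers)));
        // prints 0, 4, 16, 36, 64
    }
}
```

Without AsOrdered() the same elements would come back, but not necessarily in source order.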

Unfortunately no download is available yet, but both articles say that a CTP is going to be released in Fall '07. I just subscribed to the blog of Joe Duffy, the Development Lead for the Parallel FX team, so that I will know first-hand when the library is released.