Adi Levin's Blog for programmers

August 30, 2009

Constrained Parallelism

Filed under: Multithreading — Adi Levin @ 9:00 pm
Tags: , ,

This pattern is useful for parallelising complex algorithms, since it enables the programmer to exploit all possible parallelisms in the algorithm.

As input to the pattern, we have N tasks. For each pair of tasks i=0,1,…,N-1 and j=0,1,…,N-1 we are given two pieces of information: Must_Preceed(i,j) and Can_Run_In_Parallel(i,j). When Must_Preceed(i,j)=true, this means that task #j is not allowed to execute before task #i is completed. When Can_Run_In_Parallel(i,j)=true, it means that tasks i and j can run simultaneously in different threads. When it is false, we are not allowed to run them in parallel. The user also defines the number NTHREADS that says how many tasks are allowed to run in parallel at any given time. For scalability, one good policy is to set this number to the number of processors available on the machine.

When the user starts the algorithm, we first run tasks that have no required predecessors, and that are allowed to run in parallel to each-other. Whenever a task ends, we find another task that hasn’t been invoked yet, and that can be invoked given the constraints.

I recommend implementing a generic class that performs this pattern. It forces the programmer to think systematically about his algorithm and analyze its parallel structure.

Implementation using a master thread

The implementation requires NTHREADS worker threads and one extra thread whose job is to allocate tasks to worker threads and to be notified when tasks are finished. At the beginning, we calculate and store, for each task #i, the number NPRED(i) of required predecessors. If NPRED(i)=0, the task is allowed to run, as long as there are no running tasks j for which Can_Run_In_Parallel(i,j)=false.

In the master thread there should be a message loop (see my post on messages and message queues). Whenever a task finishes its work in a worker thread, it sends a message to the master thread using PostThreadMessage. The message says which task has just ended – for example task #i. The master thread recieves the message and decrements NPRED(j) for each j such that Must_Preceed(i,j)=true. After this update stage, it looks for tasks that can be invoked. We simply go over the list of tasks that haven’t been invoked yet, and look for tasks i for which NPRED(i)=0, and that are allowed to run in parallel to the tasks that are already running.

When parallelizing and algorithm using this pattern, it is important not to break the algorithm into too many tasks, because the fact that there is a master thread means that there is management overhead. Notice that the worst-case time complexity is at least O(N^2), in an efficient implementation, because we need to compute Must_Preceed(i,j) and Can_Run_In_Parallel(i,j) for all i and all j. However, if N is at most a few thousands, and the tasks are not trivial, this overhead is negligible.

“Focus-on-task” functionality

“Focus-on-task” is an added functionality that works nicely with this pattern. While the algorithm is running, the user may wish to give highest priority to a specific task #i. In such case, we should invoke only task #i and the tasks that are required for it to run (i.e. the tree of predecessors). After task #i is finished, the algorithm should return to its normal behavior and invoke all tasks until completion.

The implementation of this functionality is a relatively simple extension to the above: When we want to focus on a specific task, we send the master thread a special message (using PostThreadMessage), that instructs it to focus on task #i. Then, the master thread computes all predecessors of task #i, and continues as usual, except that the only tasks that belong to that tree are allowed to be invoked – until task #i is invoked (in which case we return to normal mode).

Implementation without a master thread (only a pool of worker threads is needed)

Suppose we only have the Must_Preceed constraint (i.e, we’re dropping Can_Run_In_Parallel). We can implement this pattern without a master thread. You’ll need a thread pool in which you can spawn tasks dynamically (unlike my thread_team example, in which you can only spawn a predefined collection of tasks).

You start by spawning all tasks that have no predecessors. There are now a number of tasks running in worker threads. Each task needs to be wrapped by a function that performs some the following operation after the task is complete: On the worker thread, when a task is finished, we decrement NPRED(j) for each j such that Must_Preceed(i,j)=true. If NPRED(j) reaches zero, we spawn task #j.

This is a simpler approach, but it doesn’t allow the “Focus-on-task” functionality, and doesn’t accept constraints of type Can_Run_In_Parallel.

Advertisements

Leave a Comment »

No comments yet.

RSS feed for comments on this post. TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Create a free website or blog at WordPress.com.

%d bloggers like this: