Hi,
In what order are tasks executed when I parallelize my program using an OpenMP task construct?
Is there a way to influence the order in which tasks are actually executed, for example in the code below?
In my program the CPU time for different tasks differs by orders of magnitude.
I suspect that I need to get the largest tasks started first to get a good speedup.
Does it matter if I use the "master" directive instead of "single" below?
Any ideas?
!$omp parallel
!$omp single
do i = 1, 10000000
   !$omp task
   call process(item(i))
   !$omp end task
end do
!$omp end single
!$omp end parallel
Have a nice weekend!
Regards,
Magnus
3 Replies
It looks like you have specified everything to be performed by one thread, which seems contrary to your desire to get a speedup. Changing from single to master wouldn't speed it up.
It's not clear why you use task rather than omp do. If you used omp do, you could use ordered, but that doesn't appear to be what you want, as you still don't mean to prevent arbitrary parallelism.
If your goal is to start the process(item()) calls in something close to a specified order (at least to control which group of tasks starts first, across multiple threads), you could accomplish that by sorting the items in your priority order and running them under omp do schedule(dynamic).
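For example, something along these lines (only a sketch; the cost array, the item data, and process() here are stand-ins for whatever your program actually uses):

program sorted_dynamic
  implicit none
  integer, parameter :: n = 1000
  integer :: perm(n), i, j, tmp
  real    :: cost(n), item(n)

  call random_number(cost)            ! stand-in for measured or estimated task costs
  call random_number(item)            ! stand-in for the real item data
  perm = [(i, i = 1, n)]

  ! Sort the indices so the most expensive items come first (simple insertion sort).
  do i = 2, n
     tmp = perm(i)
     j = i - 1
     do while (j >= 1)
        if (cost(perm(j)) >= cost(tmp)) exit
        perm(j+1) = perm(j)
        j = j - 1
     end do
     perm(j+1) = tmp
  end do

  ! Dynamic scheduling hands out iterations as threads become free,
  ! so the biggest tasks start first and the small ones fill in behind them.
  !$omp parallel do schedule(dynamic)
  do i = 1, n
     call process(item(perm(i)))
  end do
  !$omp end parallel do

contains

  subroutine process(x)
    real, intent(in) :: x
    ! Stand-in for the real work on one item.
  end subroutine process

end program sorted_dynamic

If the cheap items are very cheap, a chunk size such as schedule(dynamic,4) can cut the scheduling overhead.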
The code was just meant to illustrate what my program does; I took it from the Intel compiler manual's discussion of the task directive.
My own code loops through a linked list and creates a task for each element in the list. This is done inside an iterative algorithm many times, on the order of 20-100 times.
If I understand you correctly, I should reorder the elements in my linked list after timing the tasks during the first pass through all elements, and there is nothing similar to the STATIC, DYNAMIC, and/or GUIDED clauses for tasks.
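The loop looks roughly like this (a simplified, self-contained sketch; node and do_work are placeholder names, not my actual types and routines):

program list_tasks
  implicit none

  type node
     integer :: id
     type(node), pointer :: next => null()
  end type node

  type(node), pointer :: head, p
  integer :: i, iter

  ! Build a small example list.
  nullify(head)
  do i = 5, 1, -1
     allocate(p)
     p%id = i
     p%next => head
     head => p
  end do

  ! The surrounding iterative algorithm repeats the task loop many times.
  do iter = 1, 20
     !$omp parallel private(p)
     !$omp single
     p => head
     do while (associated(p))
        !$omp task firstprivate(p)
        call do_work(p)
        !$omp end task
        p => p%next
     end do
     !$omp end single
     !$omp end parallel
  end do

contains

  subroutine do_work(q)
    type(node), pointer :: q
    ! Stand-in for the per-element work, whose cost varies by orders of magnitude.
  end subroutine do_work

end program list_tasks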
Regards,
Magnus
There are several ways to handle linked lists in parallel. For a list processed by n threads (team member m = 0:n-1):
a) each thread walks the list and takes every n'th item, starting at item m (a rough sketch of this follows below).
b) create a table of starting points s(n+1), one for each team member plus one. Each thread starts at node s(m) and stops at the end of the list or just before node s(m+1). Initially you may populate the start table with equal strides through the linked list. If the processing loads are unbalanced, you can heuristically readjust the starting points during execution.
c) when the per-node processing is relatively large, all threads atomically advance a shared next pointer through the list.
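For illustration, option (a) might look roughly like this (only a sketch; the node type and process_node are made-up names):

program striped_list
  use omp_lib
  implicit none

  type node
     integer :: val
     type(node), pointer :: next => null()
  end type node

  type(node), pointer :: head, p
  integer :: i, m, n

  ! Build a small example list with values 1..10.
  nullify(head)
  do i = 10, 1, -1
     allocate(p)
     p%val = i
     p%next => head
     head => p
  end do

  ! Each team member m of n walks the whole list but processes only
  ! every n'th node, offset by m.
  !$omp parallel private(p, i, m, n)
  m = omp_get_thread_num()
  n = omp_get_num_threads()
  p => head
  i = 0
  do while (associated(p))
     if (mod(i, n) == m) call process_node(p)
     p => p%next
     i = i + 1
  end do
  !$omp end parallel

contains

  subroutine process_node(q)
    type(node), pointer :: q
    ! Stand-in for the real per-node work.
    print *, 'thread', omp_get_thread_num(), 'processed node', q%val
  end subroutine process_node

end program striped_list

For option (c), note that !$omp atomic does not cover pointer assignment in Fortran, so the shared-pointer advance would in practice be a short named critical section.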
Jim Dempsey
