Intel® oneAPI Threading Building Blocks

task_group vs. parallel_invoke scalability / template args

05522541
Beginner
Hello,

The reference (315415-007US) says:

Creating a large number of tasks for a single task_group is not scalable, because task creation becomes a serial bottleneck. If creating more than a small number of concurrent tasks, consider using parallel_for (4.4) or parallel_invoke (4.12) instead, or structure the spawning as a recursive tree.


However, because of the C++98 limitation on template parameter count, parallel_invoke takes at most 10 functions according to the reference. So to what degree is task_group supposed to be unscalable? Ten tasks doesn't look like great scalability, does it? I feel I'm missing the point here.

By the way, as a suggestion: why not parametrize the number of template arguments to parallel_invoke with a macro definition, as the Boost MPL authors do, for example?

Regards,

Daniel
Anton_Pegushin
New Contributor II
Hello Daniel,
the scalability issue with task_group is that calling task_group::run() in the main thread produces a task and puts it in the main thread's task pool (spawns it). For the task to be executed, it needs to be stolen by one of the worker threads. If those task_group tasks do not spawn any child tasks of their own (by calling parallel algorithms, for instance), then you end up with one thread producing tasks while the rest of the team wastes a lot of time on: (1) unsuccessful stealing attempts (they steal from random neighbors), (2) stealing itself, which is much more expensive than taking a task out of a local task queue, and (3) synchronization during concurrent access to the producer thread's task pool. This problem is also referred to as the 'linear spawning' problem, and 'recursive spawning' is the recommended way to avoid or hide the task stealing and maintenance overhead.
So what you could do is use task_group, but make sure that each function spawned by the task group recursively creates child tasks (by calling parallel algorithms like parallel_for, parallel_reduce or parallel_invoke, or by creating a task_group and running tasks from it).
parallel_invoke is implemented so that the tasks for each potentially parallel function call are spawned recursively (they form a tree of tasks). Each function passed to parallel_invoke can in turn call parallel_invoke and therefore recursively spawn more tasks than just the initial 10.

05522541
Beginner
According to your explanation, filling a task_list with root tasks and then spawning it is at least as bad as task_group::run(), isn't it?
Thanks for your answer
Anton_Pegushin
New Contributor II
Hi, if those root tasks don't spawn any child tasks, then yes, it's exactly the same as using task_group.