Hello,
I have one use case running a pipeline concurrently using parallel_while. The other pipeline can be spawned by this pipeline and is also run concurrently using parallel_while. It looks like the performance is better if I don't use parallel_while to run the second pipeline. Does that mean nested parallel_while is not recommended?
Thanks,
Wallace
1 Solution
Wallace,
In any parallel programming paradigm, if you use nested parallelism and all the available threads in your thread pool are already running tasks from the outermost level of nesting, then partitioning the inner levels into sub-tasks only adds overhead.
If your outer while loop is small (compared to the number of cores), then parallelize the inner loop. If the outer while loop is large (compared to the number of cores), then parallelize the outer loop.
Also, if you have been reading a different thread here: if what you mean by pipeline is parallel_pipeline, then make sure you are not nesting task_scheduler_init objects (unless each one uses a subset of the available hardware threads).
Jim Dempsey
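The rule of thumb above (parallelize the outer loop when it is large relative to the core count, otherwise the inner one) can be sketched in plain C++. This is only an illustration and not part of TBB: the names ParallelLevel, choose_parallel_level and detected_hw_threads are hypothetical, and the exact threshold (outer item count at least equal to the number of hardware threads) is an assumption that should be tuned by measurement.

```cpp
#include <cassert>
#include <cstddef>
#include <thread>

// Hypothetical helper types and functions for illustration; not a TBB API.
enum class ParallelLevel { Outer, Inner };

// Number of hardware threads, falling back to 1 when the runtime
// cannot determine it (hardware_concurrency() may return 0).
inline std::size_t detected_hw_threads() {
    const unsigned n = std::thread::hardware_concurrency();
    return n == 0 ? 1u : n;
}

// Assumed threshold: if the outer loop has at least as many items as there
// are hardware threads, the outer level alone can keep every thread busy,
// so splitting the inner level would mostly add scheduling overhead.
inline ParallelLevel choose_parallel_level(std::size_t outer_items,
                                           std::size_t hw_threads) {
    if (hw_threads == 0) {
        hw_threads = 1;  // guard against an unknown thread count
    }
    return outer_items >= hw_threads ? ParallelLevel::Outer
                                     : ParallelLevel::Inner;
}
```

On an 8-core machine, for instance, an outer loop of 100 items would be run in parallel with the inner work kept serial, while an outer loop of 2 items would suggest parallelizing the inner loop instead.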
3 Replies
Thanks Jim. This is very helpful information.
My machine has 8 cores, so usually the outer pipeline would occupy all the available threads. I will keep the inner pipeline single-threaded. It is also good to know the warning about nesting task_scheduler_init.
Wallace
"It is also good to know the warning about nesting task_scheduler_init."
Note that task_scheduler_init, which is a reference to shared data, is virtually cost-free when the thread is already engaged in TBB activity, so the warning is largely out of place, except that the use of task_scheduler_init in those locations may be symptomatic of a misconception that may need to be addressed. This has always been so.
Since TBB 2.2, use of task_scheduler_init before a user thread becomes engaged in TBB work is optional.
Since TBB 3.0, when you add a user thread (by any API), you may use task_scheduler_init to independently specify how many TBB worker threads may be used in addition to that user thread in its "arena": specify N as the total number of threads desired (so N-1 workers), and do so before the thread does any TBB-related work. Worker threads are shared between arenas, but participate in only one arena at a time to avoid entanglement that would sabotage required concurrency between arenas. Try to have a clear and valid reason before using this feature (even if it isn't linked with a perceived performance problem in another thread in this forum).