    void graph_test2()
    {
        const int NPROC = 20;
        tbb::flow::graph g;
        tbb::flow::broadcast_node< tbb::flow::continue_msg > start( g );
        std::vector< tbb::flow::continue_node< int > > workers;
        workers.reserve( NPROC );  // no reallocation, so edges to elements stay valid
        std::vector<double> SUM( NPROC, 0. );
        for(int i=0; i<NPROC; ++i) {
            auto work = [&, i](const tbb::flow::continue_msg &) -> int {
                double & sum = SUM[i];  // each task accumulates into its own slot
                for(int k = 0; k < 1000000000; ++ k) {
                    sum += k;
                }
                return i;
            };
            workers.push_back( tbb::flow::continue_node< int > ( g, work ) );
            tbb::flow::make_edge( start, workers[i] );
        }
        start.try_put( tbb::flow::continue_msg() );
        g.wait_for_all();
    }
I have a 20-core machine. I made a simple graph with 20 tasks to run in parallel, but the graph runs only 19 tasks in parallel. The 20th core is 100% occupied by wait_for_all(). This looks like a waste of CPU resources. I use TBB 2017.2, Windows 7, MSVS 2015.
Here is Amplifier's picture:
Hi Popov,
From the documentation:
https://www.threadingbuildingblocks.org/docs/help/index.htm
void wait_for_all()
Blocks until all tasks associated with the root task have completed and the number of decrement_wait_count calls equals the number of increment_wait_count calls. Because it calls wait_for_all on the root graph task, the calling thread may participate in work-stealing while it is blocked.
Hi Maxim,
From the provided source code I would expect that the main thread should steal and process one task (inside g.wait_for_all()).
To understand why that is not happening, we need to reproduce and investigate it locally. Please provide a complete, working reproducer (including all includes and the main() function).
BR, Oleg
Hi Maxim,
Thank you for the reproducer! It helped find the root cause of the problem: the worker threads and the master thread end up using different task_arena entities.
As a workaround, initialize TBB first, e.g. by calling a dummy parallel_for or by instantiating the task_scheduler_init class.
Regards, Aleksei.
Hi Aleksei,
Thank you for your reply! Good to know that there is no real problem, because in production code we use task_scheduler_init.
Regards, Maxim