Intel® oneAPI Threading Building Blocks

flow::graph: graph.wait_for_all() loads one core while doing nothing useful

Popov__Maxim
Beginner
699 Views
void graph_test2()
{
	const int NPROC = 20;

	tbb::flow::graph g;

	tbb::flow::broadcast_node< tbb::flow::continue_msg > start( g );

	std::vector< tbb::flow::continue_node< int > > workers;
	workers.reserve( NPROC );

	std::vector<double> SUM( NPROC, 0. );

	for(int i=0; i<NPROC; ++i) {
		auto work = [&, i](const tbb::flow::continue_msg &) -> int
		{
			double & sum = SUM[i];
			for(int k = 0; k < 1000000000; ++ k) {
				sum += k;
			}

			return i;
		};

		workers.push_back( tbb::flow::continue_node< int > ( g, work ) );

		tbb::flow::make_edge( start, workers[i] );
	}

	start.try_put( tbb::flow::continue_msg() );
	g.wait_for_all();
}

I have a 20-core machine. I built a simple graph with 20 tasks to run in parallel, but the graph runs only 19 of them in parallel. The 20th core is 100% occupied by wait_for_all(), which looks like a waste of CPU resources. I use TBB 2017.2, Windows 7, and MSVS 2015.

Here is Amplifier's picture:

2017-03-05_16-25-51.png

5 Replies
ali_n_
Beginner

Hi Popov,

From the documentation:

https://www.threadingbuildingblocks.org/docs/help/index.htm

void wait_for_all()

Blocks until all tasks associated with the root task have completed and the number of decrement_wait_count calls equals the number of increment_wait_count calls. Because it calls wait_for_all on the root graph task, the calling thread may participate in work-stealing while it is blocked.

Oleg_L_Intel
Employee

Hi Maxim,

From the provided source code I would expect that the main thread should steal and process one task (inside g.wait_for_all()).

To understand why that is not happening, we need to reproduce and investigate it locally. Please provide a complete working reproducer (including all includes and the main() function).

BR, Oleg

Popov__Maxim
Beginner

Hi Oleg,

Full MSVS project attached.

BR, Maksim

Aleksei_F_Intel
Employee

Hi Maxim,

Thank you for the reproducer! It helped find the root cause of the problem: the worker threads and the master thread use different task_arena instances.

As a workaround, initialize TBB first, e.g. by calling a dummy parallel_for or by instantiating the task_scheduler_init class.

Regards, Aleksei.

Popov__Maxim
Beginner

Hi Aleksei,

Thank you for your reply! Good to know that there is no real problem, because in production code we use task_scheduler_init.

Regards, Maxim
