Intel® oneAPI Threading Building Blocks

Is it mandatory to wait for a root task before calling terminate?


We are using the TBB task scheduler to organize the work we give it. Currently we create a root task and wait for it to finish:

    // No parentheses here: "T obj();" would declare a function instead of constructing the object.
    tbb::task_scheduler_init taskSchedulerInitializationObject;

    _rootTbbTask = new(tbb::task::allocate_root()) TbbRootMessageTask(std::bind(&RootTbbTaskMessage));


RootTbbTaskMessage() is a function that does some initialisation and then waits indefinitely on a condition variable.

All further tasks are either spawned or enqueued by using


thus making sure that tbb::task::spawn_root_and_wait(*_rootTbbTask) only returns when all children have finished.

To terminate the task scheduler, we unblock RootTbbTaskMessage() and call terminate after spawn_root_and_wait(...) has returned.

My question:

Is it possible to avoid 'allocate_additional_child' and instead create all further tasks as root tasks via tbb::task::allocate_root()? Or is it essential that all tasks have finished before 'tbb::task::spawn_root_and_wait(*_rootTbbTask)' lets the thread continue and terminate is called?

We tried to create all tasks via tbb::task::allocate_root(), which worked fine on Windows. However, on Linux we got sporadic crashes during


which made us wonder if our approach is ok.



Hi Mathias,

Could you provide the complete call stack of the thread that causes the crash, please? It is quite difficult to say what is wrong without looking at the code. It looks like there are some synchronization issues. I can only give general recommendations: always wait for tasks to complete, and be careful with reference counting when creating children (it is often a source of races). In addition, be careful with enqueued tasks: they are usually used in "fire-and-forget" scenarios and can access data structures that were already deallocated by the thread that enqueued them (a thread that does not wait for its enqueued tasks can leave its scope and run destructors while those tasks are still pending).
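The lifetime hazard with fire-and-forget work can be sketched without TBB. The following is only an illustration in plain C++, using std::thread as a stand-in for an enqueued task; Payload, launch_sum and sum_in_background are hypothetical names, not part of the original code. Capturing a std::shared_ptr by value lets the background work co-own its data, so the data survives even after the launching scope has ended.

```cpp
#include <memory>
#include <numeric>
#include <thread>
#include <vector>

// Hypothetical data that background work needs to read.
struct Payload {
    std::vector<int> values;
};

// Starts background work and returns while that work may still be running.
// std::thread stands in for an enqueued TBB task (assumption for illustration).
std::thread launch_sum(int& result) {
    auto payload = std::make_shared<Payload>();
    payload->values = {1, 2, 3, 4};
    // The lambda copies the shared_ptr, so the worker co-owns the payload.
    // Capturing a raw pointer or reference to a local object here would
    // dangle as soon as this function returns.
    return std::thread([payload, &result] {
        result = std::accumulate(payload->values.begin(),
                                 payload->values.end(), 0);
    });
}

int sum_in_background() {
    int result = 0;
    std::thread worker = launch_sum(result);
    // 'result' is captured by reference, so it must outlive the worker:
    // wait for completion before leaving this scope.
    worker.join();
    return result;
}
```

The same two rules apply to enqueued tasks: either the task owns (or co-owns) everything it touches, or the owner waits for the task before destroying it.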

Yes, it is possible to create all tasks as root tasks; however, it seems that you do not need to use tasks directly. 

The Intel TBB Task Scheduler is very sophisticated in both implementation and interfaces. Task-based programming has many particularities that must be considered to achieve correctness and performance. In my experience, direct task manipulation often leads to races that surface only in very strange situations. It is highly recommended to avoid using tasks directly; it is better to use higher-level constructs such as tbb::task_group, tbb::parallel_invoke and others (feel free to explain your algorithm structure if you need advice). If you believe that the task-based approach is the most suitable for your algorithm, then I recommend reading the Task Scheduler section (and its subsections) carefully to better understand the Intel TBB Task Scheduler behavior and recommended usage patterns.

What is the purpose of blocking a thread on a condition variable? It can lead to inefficiency: Intel TBB uses a thread pool with a limited number of threads, so this approach can underutilize the system (or even lead to a deadlock if many such threads are waiting on synchronization primitives).

Regards, Alex


Hi Alex,

thank you very much for your reply!

We found the reason for the sporadic crashes. As you suspected, it was a race condition: one of our instances had already been destroyed while TBB was still executing tasks that used it.

We are certain that we need to use the TBB Task Scheduler directly, and we now have a stable implementation. It was very helpful to have your confirmation that everything can be done using root tasks. We do that now and make sure that none of our instances are still in use when TBB terminates.

Furthermore, we create more TBB threads than strictly necessary and block only so many of them that the system is not underutilized. We need this to wait for I/O efficiently.

Regards, Mathias
