Intel® oneAPI Threading Building Blocks

Thread initialization

arek
Beginner
Hello,
I'd like to use TBB to handle a portion of the parallel computations in my application (which already uses classical threads). One of the issues I've come across is thread initialization/termination.
Basically, each worker thread has to be properly initialized before the actual work is done - and properly terminated afterwards. From what I've read, task_scheduler_observer is typically used to do this.
However, I have some additional requirements: all worker threads have to finish their initialization before any of them starts doing any actual work; similarly, all worker threads have to finish their work before any of them starts executing its termination code; and no new worker threads may join once the actual work has begun.
In other words, once any of the worker threads starts working on its task, I have to be able to guarantee that all worker threads (the threads that are going to work on the tasks) have finished their initialization.
Is there a way of doing this in TBB?
6 Replies
RafSchietekat
Valued Contributor III
It is difficult to give the best possible advice without knowing why you are asking this question. Perhaps there is a way to get around these perceived requirements?
arek
Beginner
The key problem is that processing of a single data item is handled by a separate module - which I cannot really rewrite/alter (too much development effort, too risky, it is being used by some other subsystems, which would require changing too, etc.). It requires initialization of its internal data structures (in each thread it is being invoked from), but it also updates its data structures on the fly (typically to handle lazy evaluation of certain properties). The code that performs per-thread initialization isn't allowed to run in parallel with the lazy evaluation code - as it would mess things up. What's worse, executing lazy evaluation code leaves the internal data structures in a state that will lead to corruption if some other thread tries to initialize afterwards (before proper cleanup in all initialized threads).
With a simple threading model with a known number of explicitly handled threads this is fairly easy to handle - I can just use counters and condition variables to wait till all the threads have been properly initialized, but I don't see any obvious way of doing this in TBB.
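For reference, the counter-plus-condition-variable approach described above could look roughly like the sketch below. It assumes plain std::thread workers and a thread count known up front; `PhaseBarrier` and `run_phased` are hypothetical names, and the counter increments only stand in for the module's real initialization, work, and termination calls.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical reusable barrier built from a counter and a condition variable.
class PhaseBarrier {
public:
    explicit PhaseBarrier(int count) : threshold_(count), waiting_(0), generation_(0) {}

    // Blocks the caller until `count` threads have arrived, then releases them all.
    void arrive_and_wait() {
        std::unique_lock<std::mutex> lock(mutex_);
        const unsigned my_generation = generation_;
        if (++waiting_ == threshold_) {
            waiting_ = 0;          // reset so the barrier can be reused for the next phase
            ++generation_;
            cv_.notify_all();
        } else {
            cv_.wait(lock, [&] { return generation_ != my_generation; });
        }
    }

private:
    std::mutex mutex_;
    std::condition_variable cv_;
    const int threshold_;
    int waiting_;
    unsigned generation_;
};

// Hypothetical driver: returns true if no thread saw work start before all
// threads were initialized, or termination start before all work was done.
bool run_phased(int num_threads) {
    assert(num_threads > 0);
    PhaseBarrier barrier(num_threads);
    std::atomic<int> initialized(0), finished(0);
    std::atomic<bool> ok(true);
    std::vector<std::thread> threads;
    for (int i = 0; i < num_threads; ++i) {
        threads.emplace_back([&] {
            ++initialized;                 // stand-in for per-thread initialization
            barrier.arrive_and_wait();     // nobody works until everyone is initialized
            if (initialized != num_threads) ok = false;
            ++finished;                    // stand-in for the actual work
            barrier.arrive_and_wait();     // nobody terminates until all work is done
            if (finished != num_threads) ok = false;
            // stand-in for per-thread termination would go here
        });
    }
    for (auto& t : threads) t.join();
    return ok;
}
```

This only works because the set of threads is fixed and each one is known to reach both barriers, which is exactly the guarantee that is not obviously available with TBB workers.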
RafSchietekat
Valued Contributor III
If you don't need the initialisation done by the thread itself, how about setting up a sufficient number of "things" and letting each thread grab what it needs?

If you need the initialisation done within the thread itself, perhaps you could register initialisation in thread-local storage, and lure in all the workers before and after the work you need done, using a parallel_for with enough parallel slack? That should work if there's nothing else going on in the program at the same time. Give it a deadline and then use a global condition variable to keep latecomers out, perhaps, allowing their hardware threads to go unused.
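A rough sketch of the thread-local registration idea, using plain std::thread workers as a stand-in for TBB workers and a shared chunk counter as a stand-in for a parallel_for with plenty of slack. The names `ensure_thread_initialized`, `lure_workers`, and `g_initialized_threads` are hypothetical, and the counter increment stands in for the real per-thread initialisation.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical global count of threads that have run their initialisation.
std::atomic<int> g_initialized_threads(0);

// Runs the (stand-in) per-thread initialisation at most once per thread,
// using a thread-local flag to remember that it has already been done.
void ensure_thread_initialized() {
    thread_local bool initialized = false;
    if (!initialized) {
        initialized = true;
        ++g_initialized_threads;   // stand-in for the module's real init call
    }
}

// Hypothetical driver: `num_workers` threads share `num_chunks` small chunks
// of dummy work. Returns how many threads ended up initialised.
int lure_workers(int num_workers, int num_chunks) {
    assert(num_workers > 0 && num_chunks > 0);
    g_initialized_threads = 0;
    std::atomic<int> next_chunk(0);
    std::vector<std::thread> workers;
    for (int i = 0; i < num_workers; ++i) {
        workers.emplace_back([&] {
            // Each thread grabs chunks until none are left.
            while (next_chunk.fetch_add(1) < num_chunks)
                ensure_thread_initialized();
        });
    }
    for (auto& t : workers) t.join();
    return g_initialized_threads;
}
```

Note that the result is only guaranteed to lie between 1 and `num_workers`: a thread that never gets a chunk is never initialised. More slack makes full coverage likely but not certain, hence the suggestion of a deadline and a way to keep latecomers out.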

If there's no alternative, you just have to resort to dirty tricks, but it still seems strange. I'm curious if this is more common than I thought, or if anybody has a better idea.
arek
Beginner
Unfortunately, the initialisation has to happen within the thread, so a global pool won't work.
The TLS + dummy parallel_for approach is more or less what I've been thinking about, but I can't really afford to leave out worker threads - the software runs on a range of multicore CPUs - and on dual-core machines this would mean a pretty significant performance hit.
In the end, I think I'll give up on TBB for now and go with standard threads. The software I'm working on is quite old and the multithreading framework (also pretty old) was added as an afterthought - and is quite complex, with lots of quirks - so I wouldn't expect those requirements to be that common.
Anyway, thank you for your patience and the time spent on this problem. I really appreciate the effort of giving me meaningful advice given such a vague problem description. Thanks again!
Andrey_Marochko
New Contributor III
Actually, if you have complete control over TBB usage in your application, a dummy parallel_for or a set of one-per-thread tasks could work for you. The trick is to use a barrier inside each task: a task cannot finish until all of the tasks have been picked up, so no thread can run two of them, which guarantees that they are distributed across different threads and thus that all TBB workers are covered.

Just remember that this approach is fragile: any other thread in your app that tries to pull a similar trick will cause a mutual deadlock.

Here is an example of a helper class that can be used both to initialize TBB and to execute (de)initialization functions on all the threads:

[bash]#include "tbb/task_scheduler_init.h"
#include "tbb/atomic.h"
#include "tbb/task.h"

/** OnEntry and OnExit are pointers to functions that accept the following arguments:
    bool - specifies whether this is a worker (true) or master (false) thread;
    D - arbitrary data passed to TbbWorkersInitializer at the moment of its construction. **/
template <typename D, void (*OnEntry)(bool, D), void (*OnExit)(bool, D)>
class TbbWorkersInitializer : tbb::internal::no_assign {
    tbb::task_scheduler_init my_init;
    tbb::atomic<int> my_barrier;
    tbb::empty_task &my_root;
    void (*my_notification)(bool, D);
    int my_num_threads;
    D my_data;

    friend class InitializerTask;

    void broadcast () {
        my_barrier = my_num_threads;
        my_root.set_ref_count( my_num_threads );
        // One task per worker; the master thread takes part directly below
        for ( int i = 1; i < my_num_threads; ++i )
            tbb::task::spawn( *new(my_root.allocate_child()) InitializerTask(*this) );
        my_notification( /*is_worker*/false, my_data );
        // Wait for the workers to pick up a task each
        barrier();
        // Wait for the workers to finish broadcast tasks
        my_root.wait_for_all();
    }

    void barrier () {
        --my_barrier;
        tbb::internal::atomic_backoff bo;
        // Spin until every participant has arrived
        while ( my_barrier )
            bo.pause();
    }

    class InitializerTask : public tbb::task {
        TbbWorkersInitializer& my_owner;

        tbb::task* execute () {
            // Wait for other workers to pick up a task each
            my_owner.barrier();
            my_owner.my_notification( /*is_worker*/true, my_owner.my_data );
            return NULL;
        }
    public:
        InitializerTask ( TbbWorkersInitializer& owner ) : my_owner(owner) {}
    }; // class TbbWorkersInitializer::InitializerTask

public:
    TbbWorkersInitializer ( D data, int num_threads = tbb::task_scheduler_init::default_num_threads() )
        : my_init(num_threads)
        , my_root(*new(tbb::task::allocate_root()) tbb::empty_task)
        , my_notification(&OnEntry)
        , my_num_threads(num_threads)
        , my_data(data)
    {
        broadcast();
    }

    ~TbbWorkersInitializer () {
        my_notification = &OnExit;
        broadcast();
        tbb::task::destroy(my_root);
    }
}; // TbbWorkersInitializer [/bash]
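The barrier-in-a-task trick can also be looked at in isolation. The sketch below is not TBB code: it uses plain std::thread workers pulling from a shared task counter as a stand-in for the TBB scheduler, and `broadcast_to_all` is a hypothetical name. Unlike the helper class, the calling thread does not take part here. It only illustrates why a spinning barrier inside each task forces every task onto a different thread.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <set>
#include <thread>
#include <vector>

// Runs `num_threads` broadcast tasks on a pool of `num_threads` threads and
// returns the number of distinct threads that executed one.
int broadcast_to_all(int num_threads) {
    assert(num_threads > 0);
    std::atomic<int> barrier(num_threads);   // tasks not yet picked up
    std::atomic<int> next_task(0);           // stand-in for the scheduler's task pool
    std::mutex ids_mutex;
    std::set<std::thread::id> ids;

    auto task = [&] {
        // Announce arrival, then spin until every task has been picked up.
        // A thread stuck here cannot take a second task.
        --barrier;
        while (barrier.load() != 0)
            std::this_thread::yield();
        // Stand-in for the per-thread (de)initialization callback.
        std::lock_guard<std::mutex> lock(ids_mutex);
        ids.insert(std::this_thread::get_id());
    };

    auto worker = [&] {
        // Keep taking tasks while any remain.
        while (next_task.fetch_add(1) < num_threads)
            task();
    };

    std::vector<std::thread> pool;
    for (int i = 0; i < num_threads; ++i)
        pool.emplace_back(worker);
    for (auto& t : pool)
        t.join();
    return static_cast<int>(ids.size());
}
```

The same property is what makes the approach fragile: if any thread of the pool is blocked elsewhere and never picks up its task, the other threads spin in the barrier forever.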
arek
Beginner
Thank you, Andrey! This is brilliant. Did a few tests using some mockup code and it worked like a charm :)
I should have enough control over TBB in my app to make sure no one ever tries a similar trick (or anything else that would block worker threads from reaching the barrier).
Once more, Andrey, Raf, thanks a lot!