Intel® oneAPI Threading Building Blocks

Threading thousands of entity updates via TBB


This is my first post on the forum, so I apologize if I have missed something or if my question is a beginner's one.
I am working on real-time 3D software with hundreds of "entities".
I want to parallelize parts of the serial process, which spends 75% of its time in the entities->update() calls. I think this must be a fairly common use case.
For each entity, the update consists of some heavy computation on the entity's private data, driven by a global context (which is read-only, and already protected against concurrent access in places such as logging).
My first approach was to use parallel_for in the main (per-frame) update call, which loops over every entity.
I validated that my entity update process is thread-safe this way, but according to Intel Amplifier I was spending 20% of CPU time waiting in WaitForSingleObject <- tbb::internal::rml::private_worker::thread_routine, called by callThreadStartEx. A partial Locks and Waits analysis (before 10 MB of data was reached) reports 10 seconds in Manual Reset Event for my main update method, 5 seconds in wsock regarding a "completion port", and 1 second in TBB Scheduler for my main update method.
I guess TBB is redoing all of its thread setup for each update/parallel_for call, which is quite costly at a 60 fps rate?
Is there a better way to do this that avoids the waiting time?
My second approach, still in progress, is to use tasks to update the entities.
Each task reuses itself as its own successor and pulls IDs from an atomic counter, updating entities until no ID is left.
My main update loop starts x of these tasks (maybe x = number of cores), which keep reusing themselves.
My task operator() looks like:
    // get an ID to process
    unsigned int currentEntityId = globalContext->getId();
    if ( currentEntityId != -1 )
        // process update
        globalContext->touchEntities()[currentEntityId]->update( );
    //recycle_to_reexecute(); // deprecated
    return this;
and my main update looks like:
    unsigned int numberOfTasks = std::thread::hardware_concurrency();
    for (unsigned int i = 0; i < numberOfTasks-1; i++ )
    {
        EntityUpdateTask& a = *new(tbb::task::allocate_root()) EntityUpdateTask(this);
        tbb::task::spawn( a );
    }
    // the last one waits for all to complete:
    EntityUpdateTask& a = *new(tbb::task::allocate_root()) EntityUpdateTask(this);
With this code, I get a failed assertion (and several pure virtual call exceptions) in the scheduler,
on t_next->state() == task::allocated
"if task::execute() returns task, it must be marked as allocated."
What did I do wrong?
How can I get optimal performance (i.e. good concurrency and minimal waiting) for this use case with TBB?
As for thread safety, my update processing only goes through some spin_mutex-protected data.
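To make the intent of the second approach concrete, here is a minimal sketch of the atomic-counter idea written with plain std::thread instead of TBB tasks. This is a swapped-in technique for illustration only: Entity, updateAll and workerCount are hypothetical names, and it sidesteps the task recycling question entirely.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical stand-in for the real entity type.
struct Entity {
    int updates = 0;
    void update() { ++updates; }  // touches only this entity's private data
};

// Each worker claims the next unprocessed index from a shared atomic
// counter until none is left. Returns the number of entities updated.
std::size_t updateAll(std::vector<Entity>& entities, unsigned workerCount) {
    std::atomic<std::size_t> nextId{0};
    std::atomic<std::size_t> done{0};

    auto worker = [&] {
        for (;;) {
            const std::size_t id = nextId.fetch_add(1, std::memory_order_relaxed);
            if (id >= entities.size()) break;  // no ID left
            entities[id].update();
            done.fetch_add(1, std::memory_order_relaxed);
        }
    };

    // hardware_concurrency() may report 0, so fall back to one worker.
    if (workerCount == 0) workerCount = 1;

    std::vector<std::thread> helpers;
    for (unsigned i = 0; i + 1 < workerCount; ++i) helpers.emplace_back(worker);
    worker();  // the calling thread takes part, like the last task above
    for (auto& t : helpers) t.join();  // join makes all updates visible here

    assert(done.load() <= entities.size());
    return done.load();
}
```

Called once per frame as updateAll(entities, std::thread::hardware_concurrency()), this creates and joins threads every frame, which is exactly the cost I was hoping TBB's scheduler would let me avoid.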
B. R.