This is my first post on the forum, so I apologize if I missed something or if my question is a beginner one.
I am working on real-time 3D software with hundreds of "entities".
I want to thread parts of the serial process, which spends 75% of its time in the entities->update() calls. I think this must be a pretty common use case.
For each entity, the update consists of some heavy computation on the entity's private data, driven by a global context (which is read-only, and already protected against concurrent access in things such as logs).
My first approach was to use parallel_for in the main (per-frame) update call, which loops over every entity.
I validated this way that my entity update process is thread safe, but according to Intel Amplifier I was spending 20% of CPU time waiting in WaitForSingleObject <- tbb::internal::rml::private_worker::thread_routine, called by callThreadStartEx. A partial Locks and Waits analysis (collected before 10 MB of data was reached) reports 10 seconds in Manual Reset Event for my main update method, 5 seconds in wsock for "completion port", and 1 second in TBB Scheduler for my main update method.
I guess TBB is redoing all of its thread setup for each update / parallel_for call, which would be quite costly at 60 fps?
Is there a better way to do this that avoids the waiting time?
My second approach, in progress, is to use a task to update the entities.
Each task reuses itself as its own successor and pulls IDs from an atomic counter, updating entities until no ID is left.
My main update loop starts x of these tasks (maybe x = number of cores), which keep reusing themselves.
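To make the scheme concrete, here is a minimal sketch of the atomic ID counter idea. It is only an illustration under assumptions: it uses std::thread and std::atomic from the standard library instead of TBB tasks, and Entity, IdDispenser and updateAll are hypothetical names, not my real classes. It also spawns threads on every frame for simplicity, which a real version would avoid by keeping the workers alive between frames.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical stand-in for an entity: update() touches only private data.
struct Entity {
    int updates = 0;
    void update() { ++updates; }
};

// Hypothetical stand-in for the ID source held by the global context.
class IdDispenser {
public:
    void reset() { next_.store(0, std::memory_order_relaxed); }
    // Each call returns a distinct ID; an ID >= entity count means "no work left".
    std::size_t getId() { return next_.fetch_add(1, std::memory_order_relaxed); }
private:
    std::atomic<std::size_t> next_{0};
};

// One frame: workerCount workers each pull IDs until none are left.
void updateAll(std::vector<Entity>& entities, IdDispenser& ids, unsigned workerCount) {
    assert(workerCount > 0);
    ids.reset();
    auto worker = [&entities, &ids] {
        for (std::size_t id = ids.getId(); id < entities.size(); id = ids.getId()) {
            entities[id].update();  // each ID is handed out once, so no two workers share an entity
        }
    };
    std::vector<std::thread> pool;
    for (unsigned i = 1; i < workerCount; ++i) pool.emplace_back(worker);
    worker();                        // the calling thread takes part as well
    for (auto& t : pool) t.join();   // join makes all updates visible to the caller
}
```

Calling updateAll(entities, ids, 4) once per frame updates every entity exactly once, and the work balances itself because a fast worker simply pulls more IDs.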
My task operator() looks like:
// get an Id to process
unsigned int currentEntityId = globalContext->getId();