I expect there will be roughly 4 to 10 tasks in each batch, and each task will take anywhere from 0.01 ms up to 5-10 ms to execute.
Each task in a batch may modify some global data, and the next batch may operate on the same data, so to avoid possible collisions I separated the tasks into different batches.
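To make the ordering I have in mind concrete, here is a minimal sketch in plain standard C++ rather than TBB. The names `Task`, `Batch`, `run_batch` and `run_batches` are hypothetical and exist only for this illustration, and `join()` stands in for whatever wait the real scheduler provides. Spawning one thread per task is far too expensive for tasks this short, so this shows only the batch-after-batch ordering, not a realistic implementation.

```cpp
#include <functional>
#include <thread>
#include <vector>

// Hypothetical types for this sketch only.
using Task = std::function<void()>;
using Batch = std::vector<Task>;

// Runs all tasks of one batch concurrently and returns only after
// every one of them has finished.
void run_batch(const Batch& batch) {
    std::vector<std::thread> workers;
    workers.reserve(batch.size());
    for (const Task& task : batch) {
        workers.emplace_back(task);  // each task gets its own thread (sketch only)
    }
    for (std::thread& worker : workers) {
        worker.join();  // blocks until that task is done
    }
}

// Runs the batches strictly one after another, so tasks from different
// batches never touch the shared data at the same time.
void run_batches(const std::vector<Batch>& batches) {
    for (const Batch& batch : batches) {
        run_batch(batch);
    }
}
```

Tasks inside the same batch still run concurrently here, so they must not write to the same piece of data; only the boundary between batches is protected.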
How does the wait_for_all() method operate internally? Does it use system synchronization primitives that put the thread to sleep, or does it use some sort of loop with thread::yield() inside?
Also, can you recommend anything in particular for implementing a similar system where the tasks' dependencies form not a straight line but a graph? I have seen the example in the TBB distribution, but maybe there is something it leaves out?
What do you think about using system semaphore objects (I'm programming under Windows) instead of wait_for_all()? If wait_for_all() has a loop inside it, wouldn't it be more efficient to let the kernel scheduler wake up the waiting thread?