TBB tasks and algorithms provide a nice way to achieve this for most jobs but AFAIK your work isn't guaranteed to get done until you wait for it much like futures. This makes a lot of sense to me but messages handling can be heavyweight and its completion doesn't cause any signals so I really don't care when it completes as long as it completes sometime and therefor I don't want to wait for it to do so when I publish it.
randomizer:This is not true. TBB tasks, since spawned, start executing in best effort manner, no matter waiting you for them to complete or not.
randomizer:But here is another caveat. Execution order of tasks is not guaranteed. So for example, you spawn task which recursively and infinitely respawns itself. Then one day this task also spawns second task which must stop whole processing. Second task is not guaranteed to run at all. So you will end up with a kind of deadlock.
I realize this for the multi-threaded case it will probably start executing immediately but what about the single-threaded case? If TBB spawns no worker threads does this mean that the task executes at spawn or at wait? I don't see any other options.
So task scheduling is not fair. I can understand that and probably prefer it for most things but I think I can get around that deadlock by having 2 empty tasks for synchronization that I use to double buffer all message handler tasks which would be their children. The active empty task would wait until its children complete and any new messages would create tasks to the back buffer empty task. After the active task completes its wait it would atomically swap itself with the back buffered empty task which could then wait for its children to complete. This gives me the high water mark and I think is somewhat fair.
MichaelMarcin:If I just spawn the tasks and don't wait for them in a single-threaded program then they will never run right? I suppose I'm trying to prevent starvation of tasks.
I thought more of your design, and I think there is some caveat I should warn you of. The TBB scheduler is not FIFO with regard to tasks, which might matter for message processing. In particular, a running task scheduler(including the one starting in wait_for_all) treats its own task pool as LIFO and starts processing most recent tasks first. It might be fine, but I think you'd better be aware of that. Also, a task scheduler is oblivious to parent tasks while draining the pool, which means that the scheduler, once started, won't exit until all tasks in its local pool complete; so if you use two WorkManager instances in the same thread in your application, then call WaitForWork for one of those, in fact all tasks initiated in the second manager will also complete except those that were stolen.
One particularly bad consequence for your design with two swapping roots, I think, is that after roots were swapped, new tasks that use the new root will be executed mainly before tasks that use the previous root, which completion you are waiting for. So waiting could take longer than expected. If this might become a problem, I suggest you use some flag to postpone spawning new tasks after swapping, till the moment wait_for_all finishes. Tasks being added by AddWork can be kept in some shared dynamic data structure (for TBB 2.0 & 2.1, a task_list can be used to both keep the tasks and then spawn them all at once, but in future versions spawning a task_list might become obsolete or inefficient due to possible changes in thetask pool representation).
Ok so this means that if I have a task which always results in spawning another task and it doesn't happen to be stolen by another pool then wait_for_all will never return?
TBB uses an approximation of Cilk-style scheduling (work LIFO; steal FIFO).It has good locality and load balancing properties while making certain space guarantees. It enables clean nesting of software components, analogous to the way serial code composes by nested subroutine calls.
FIFO has terrible locality and often has no space guarantees, which often leads to complex throttling mechanisms. Nonetheless FIFO scheduling definitely has its uses, notably for soft real-time. Finding a rationale marriage ofCilk-style and FIFO scheduling remains an open problem. If done rigorously it's a worthyfor aPh.D. thesis.
The "two roots" idea suggests that perhaps said marriage could be accomplished by some kind of epoch mechanism, where each task belongs to an epoch and older epochs are given precedence. There's a long way between a notion and a practical system.
Raf_Schietekat:"If done rigorously it's a worthyfor aPh.D. thesis." Shortest thesis ever: keep successor tasks in local queue, append to global FIFO structure at task steal time, only steal from it if no normal tasks are left?
(Added) Well, no local hidden queue (substitute participation in merge tree), only stealing should be delayed.
__declspec(thread) work_stealing_deque private_deque;
void spawn_task(task* t)
if (t = private_deque.pop())
if (t = steal_from_other_private_deques())
if (t = global_queue.pop())