I want to make a task tree where the parent nodes exist only to spawn children (they do a little bit of calculation and spawn only certain children). This is for branch-and-bound code with globally-accessible bounds. A tree structure is natural, and the depth-based scheduling is desirable, but there's no reason to keep the parent nodes around once they've spawned their children. If you do allocate_root() and spawn_root_and_wait() on the children, the parent still has to wait for the rootlike-children to finish, right? Is there any other way to do what I want?
"Is there any other way to do what I want?" But what do you want? What specific reason do you have for not wanting to keep the parents around? Are you certain that this is not premature optimisation?
You might make the tasks children of a long-lived task, but the dependence on a shared reference count might harm performance. You might let a parent take on the assignment of one of the children before joining with the rest of them, which seems generally a better choice (go have a look at parallel_for). The absolute best choice probably depends on the exact circumstances, but finding it may not be where you should direct your attention.
Ok, I can do that (yes parents are the same as children). But still, won't the children possibly contend with their siblings to de-reference the parent, and invoke expensive protection operations even if not? (or is it lock-free somehow? Haven't checked). Also I was worried about stack space, if the search goes very deep, plus I wondered if I was missing something obvious. But, if you use the method you suggest in parallel_for itself, it obviously can't be too bad :) Though, that has only 2 children at a time working on a given parent, my code could have up to the number of processors. I guess I could try reducing the branching factor and see if it makes a difference. Well ok anyway, thanks, to you, and Raf.
"my code could have up to the number of processors" The idea of using tasks is to create opportunities for concurrency that the scheduler can exploit, not to second-guess or spoon-feed the scheduler. Why would you take the number of processors into account?
"I guess I could try reducing the branching factor and see if it makes a difference." Reducing the branching factor increases the relative cost of keeping a parent around, so this is not necessarily a better approach.