Intel® oneAPI Threading Building Blocks
Ask questions and share information about adding parallelism to your applications when using this threading library.

on-demand / lazy 'task' generation

Hi, I thought I'd share what must be a common pattern that I didn't
think of immediately, that contains the keywords I was searching for

The job was to search a huge, 'embarrassingly parallel' space and I
was thinking in terms of a bunch of discrete pieces of work that
threads would grab as they became free. The work needed to be
generated lazily, on-demand, as there would be too many for a list,
and generating them at once would take wastefully long and be serial
anyway. There doesn't seem to be any TBB structure to support such
lazy generation; parallel_do_feeder (eg) doesn't seem to be dynamic.
But, of course a better solution is stop thinking in terms of
organizing the 'todo' objects, and instead just have as a threadpool,
some fixed number of TBB root tasks which request work from the
central generator as each becomes ready for more.
0 Kudos
1 Reply


from this very high-level description, it seems to me that TBB would perfectly fit your problem if you know how to apply it. However there is not enough details to understand what TBB pattern is the best fit; it might be parallel_for/reduce, parallel_do, or pipeline. The fixed number of root tasks that repeatedly go to a central work store (or generator)for the next piece of work is the last thing I would consider, because in fact it means using threads, not tasks.

Tasks are lightweight objects, andshould be created on demand, not inadvance.TBB supports the pool of worker threads that, if free,either take another available task from its own task pool or steal it from another thread that has some work to do. With TBB, it's better to think of the whole job as a field where multiple workers can take pieces as they wish, rather than a central job store with a single door to enter.

For example, in case your work space is known apriori and could be recursively divided into smaller pieces, tbb::parallel_for would work well; it applies recursive divide-and-conquer strategy where your main thread starts with the whole workspace and recursively divides it in half, on each step allowing one of the halves to be stolen by a worker thread, which usesthe same strategy. Thus the whole workspace is efficiently decentralized, and at any time the number of tasks is O(P*log N) where P is the number of threads and N is the size of the workspace. When a portion of work is not worth further splitting, the thread that owns it starts real processing.

If the initial work can be extended during the processing, then tbb::parallel_do is the best fit. If the work comes irregularly from an external source, or if the processing consist of several stages and some require ordered execution, then I would think of applying tbb::pipeline.

In case you need more help, could you please describe your problem in more details, for better understanding of its nature, of where parallelismis, of possible dependencies, and etc.?

0 Kudos