Intel® oneAPI Threading Building Blocks
How to supply items for flow::graph from another master thread?

Evgeniy_B_

We have a flow graph without an input node. Previously we did not dedicate a master thread to graph execution: all nodes were served by TBB worker threads, and items were fed by calling try_put() on a root node from the same master thread that created the graph. However, that approach led to starvation in some scenarios (especially on systems with low hardware concurrency).

Now we want to dedicate a master thread to the flow graph to guarantee its execution, but we still have to feed it from the main thread. Invoking try_put() from the main thread renders the dedicated thread useless, since (to my understanding) try_put() spawns a task in the main thread's arena, and other master threads cannot steal from that arena.

Can anyone please advise a straightforward solution to the problem?

Christophe_H_Intel

Hello Evgeniy,

The "simplest" solution is to compile with -DTBB_DEPRECATED_FLOW_ENQUEUE. This forces flow::graph to use the old behavior (enqueueing instead of spawning, which is "almost FIFO"). If there is no available thread, enqueueing starts one, so we might be oversubscribed for a period of time. This won't have a significant effect if the processing per node is not trivial.

You could also enqueue a task to do the try_put, but otherwise use spawning. This might be better. (We switched to spawning because it is more efficient than enqueueing, but spawning causes problems in some very restricted cases which were too complex to describe efficiently.)

Out of curiosity, do you use wait_for_all(), or do you use some other method to know when processing is done?

Regards,
Chris

Evgeniy_B_

Hello Chris.

A bit more info on our use case, before I ask questions regarding the solutions you pointed out. First of all, it's the same video analytics software I mentioned in the other thread. Under heavy load our software cannot process all captured frames and has to skip some. As we didn't want to lose mission-critical information, we gave frame skip control to the processing stages: we introduced tbb::flow::graph as our pipeline, used tbb::flow::function_node and tbb::flow::multifunction_node as processing stages, and built rate control on top of these node classes.

Here are the reasons why we preferred tbb::flow::graph over tbb::pipeline:

  • type safety and automatic item lifetime management
  • exception safety and propagation
  • flexible implementation which provided the means for a straightforward rate control implementation (overridable methods and templates)
  • ability to create stages that could output any number of items for one processed input item

Of course, there are a couple of features available only in tbb::pipeline that we would like to use:

  • item passing preference (i.e. tbb::pipeline prefers execution of the next stage for the current item, whereas tbb::flow::graph prefers execution of the next item for the current stage)
  • native ability to submit items from another master thread (via thread_bound_filter)


Christopher Huson (Intel) wrote:

The "simplest" solution is to compile with -DTBB_DEPRECATED_FLOW_ENQUEUE. This forces flow::graph to use the old behavior (enqueueing instead of spawning, which is "almost FIFO"). If there is no available thread, enqueueing starts one, so we might be oversubscribed for a period of time. This won't have a significant effect if the processing per node is not trivial.

How many temporary threads could TBB start when no free worker thread is available? Is it one per arena? Won't this impede concurrency in case of a constant influx of tasks from the graph?

Christopher Huson (Intel) wrote:

You could also enqueue a task to do the try_put, but otherwise use spawning. This might be better. (We switched to spawning because it is more efficient than enqueueing, but spawning causes problems in some very restricted cases which were too complex to describe efficiently.)

We are using limiter_node as the graph input, and we want to get the result of its try_put() method as soon as possible so that a frame skipped early on can be released. Is it possible to enqueue a try_put task with higher priority without affecting the tasks it spawns during execution?

Christopher Huson (Intel) wrote:

Out of curiosity, do you use wait_for_all(), or do you use some other method to know when processing is done?

No, we use a result queue (for pull mode) and callbacks (for push mode), since the processing is continuous.

Anton_M_Intel

Please look at the task_arena feature. It lets you specify that the work from your master threads should not be isolated as usual, but processed in the same arena instead.

Evgeniy_B_

Thanks Anton.

Is there a way to associate flow::graph with the task_arena, so that a master thread invoking flow::graph::wait_for_all() would participate in task stealing?

Anton_M_Intel

Evgeniy B. wrote:

Is there a way to associate flow::graph with the task_arena, so that a master thread invoking flow::graph::wait_for_all() would participate in task stealing?

Just execute the flow graph operations inside task_arena::execute(). It provides the specified arena context for all the operations running inside it.
