Hi,
Is there a way to specify which worker threads may accommodate and execute a newly created task? My application needs to continuously spawn tasks, which I would like to have execute on specific worker threads for performance reasons.
Furthermore, I would like to map the worker threads onto specific processors in order to impose some kind of affinity. Does the library provide a straightforward way to achieve this?
Thanks in advance,
Nick
1 Solution
The only form of affinity supported by TBB is "replay affinity". You can hint that a task t1 is best run where a previous task t0 ran. The advantage of replay affinity is that it does not require detailed knowledge of the system topology and permits load rebalancing.
Also, consider carefully whether to use task::spawn or task::enqueue. Method task::spawn is usually best for recursive parallelism, where a parent task will wait on its children. Method task::enqueue is usually best when parallelism is flat, the number of children is unbounded, and the parent does not wait on its children.
5 Replies
Many thanks for the answer. Your second remark seems quite helpful and closely matches the execution scenario I have in mind. It consists mainly of one "primary" task executing in parallel with one or more "subordinate" tasks. The primary task asynchronously creates subordinate tasks, which in turn may create new ones. Ideally, the primary task should execute without runtime obstructions, i.e. it should run on a "dedicated" worker thread, without having to wait on other (subordinate) tasks and without being descheduled. All the other tasks may execute anywhere else (apart from the primary task's worker thread), without strict timing requirements ("best effort"), but preferably in a first-come, first-served (FCFS) fashion. So, task::enqueue seems to do the work for me, at first glance.
Regarding the affinity discussion, I will take a closer look at the affinity methods to see how I could implement the desired task affinity requirements.
Yes, it sounds like task::enqueue is the best approach.
Consider the following approach:
Determine a suitable thread pool size (e.g. the number of hardware threads available); call this nThreads.
Start the TBB thread pool indicating nThreads threads (or determine the value of nThreads after the fact).
Create nThreads-1 concurrent queues.
The queues contain functors and/or void* to context information.
The main thread runs to the point where it needs to request work to be performed by another thread. Here it packages up the functor (or function code) plus context information and selects a queue by one of:
a) the function or data involved
b) round-robin
c) the least full queue
d) other factors
Then it enqueues into the selected queue.
When an enqueue to a concurrent queue is a first fill, perform your task::enqueue of the task that services this queue.
Once this task starts, it will tend to have a sticky affinity. As long as your main task out-paces the dequeues from this task's concurrent queue, the task will remain running and remain on the same core. The task only ends when the concurrent queue it services becomes empty. Note, you may have to work out a race condition between the first-fill and empty-queue (last-empty) determination.
Note, you can establish the number of servers (consuming concurrent queue requests) at less than nThreads-1. This would reserve some TBB threads for use in parallel_xxx requests by the servers.
And do not have your server tasks spin-wait for new requests to enter their concurrent queues (use task exit/return and a new task enqueue on first fill).
Jim Dempsey
Thanks very much for the detailed reply! I really appreciate that.
Nick