Intel® oneAPI Threading Building Blocks
Ask questions and share information about adding parallelism to your applications when using this threading library.

How to reschedule tasks but not as continuation?

klaim
Beginner
1,270 Views

I have a context where I first set up a bunch of custom tasks, each of which has a member flag saying whether the task should be rescheduled or not.
Basically, my custom task type's execute() function looks like this:

[cpp]

tbb::task* execute() override
{

// ... here the code that executes some task-specific work

switch( task_result )
{
case( RESCHEDULE ):
{
// Make sure this task is re-executed as required.
// See: http://stackoverflow.com/questions/11847689/how-to-reschedule-a-concurrent-task/11849373#11849373
this->increment_ref_count();
this->recycle_as_safe_continuation();
break;
}
case( END ):
{
// No need for rescheduling.
break;
}
default:
{
UCX_ASSERT_IMPOSSIBLE( "Unmanaged task result : '" << to_string(task_result) << "' Task: " << to_string(m_task.id()) );
break;
}
}

return nullptr;
}

[/cpp]

All my task instances do this. To instantiate them, I'm using the trick described in the documentation to have the main thread wait for all tasks to end: I have an empty root task, and each time I create a task I allocate it as a child of the root task, then spawn it through the root task.


Currently I'm seeing a behaviour where a task that is set up to reschedule itself until some specific time has passed is "blocking" the execution of other tasks.
Once this task stops rescheduling itself, it ends and the other tasks are executed.
However, I need the other tasks to be executed in parallel. The current load isn't heavy,
but I suspect my way of rescheduling tasks is wrong.
I don't have much data, and I'm not a multithreading specialist yet, so maybe I'm mistaken.

Anyway, I'm looking for a way to make sure that when I reschedule a task, it is placed at the end of the worker thread's task queue, making it easily stealable by other threads.
Each of the tasks I'm spawning, when rescheduled, shouldn't be a continuation of itself as the current code makes it;
it should just be rescheduled to execute later, not immediately.

Unfortunately, I tweaked the rescheduling code but haven't found a way to do this so far: any variant of the present code triggers an assertion inside TBB.

So I'm asking: is there a safe way to reschedule a task not as a continuation, but as if it had just been spawned and put in some worker thread's queue?

0 Kudos
28 Replies
RafSchietekat
Valued Contributor III
927 Views
Instead of (atomically) incrementing the reference count before calling recycle_as_safe_continuation(), simply set it to 1. The meaning of continuation is not about when it will be executed next but about having the same position in the hierarchy of dependencies as an unordered tree, and its timing will not be any different from that of a newly allocated and spawned task, as tasks are always added to the side of the deque where the worker thread takes the next task to execute (other threads steal from the other side). If you want near-FIFO scheduling, use enqueue() instead of spawn(). Maybe it's also useful to explain what you want to achieve, for a second opinion about this arrangement.
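The spawn()/steal asymmetry described here can be illustrated with a TBB-free sketch (the type and method names below are purely illustrative, not TBB's internals): spawn() pushes onto the side of the worker's deque that the owner pops next (LIFO for the owner), while thieves take from the opposite side (oldest first).

```cpp
#include <cassert>
#include <deque>
#include <string>

// Illustrative model of a worker's task deque: the owning worker pops
// the most recently spawned task first; thieves steal the oldest one.
struct WorkerDeque {
    std::deque<std::string> tasks;

    // spawn() pushes onto the owner's side of the deque.
    void spawn(const std::string& t) { tasks.push_back(t); }

    // The owning worker takes the most recently spawned task (LIFO).
    std::string take_local() {
        std::string t = tasks.back();
        tasks.pop_back();
        return t;
    }

    // A thief steals from the other side, i.e. the oldest task (FIFO).
    std::string steal() {
        std::string t = tasks.front();
        tasks.pop_front();
        return t;
    }
};
```

This is why a recycled-as-continuation task behaves no differently from a freshly spawned one with respect to timing, and why enqueue() (which feeds a shared near-FIFO queue instead of the local deque) is the tool for approximate submission-order scheduling.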
0 Kudos
klaim
Beginner
927 Views
Instead of (atomically) incrementing the reference count before calling recycle_as_safe_continuation(), simply set it to 1. The meaning of continuation is not about when it will be executed next but about having the same position in the hierarchy of dependencies as an unordered tree, and its timing will not be any different from that of a newly allocated and spawned task, as tasks are always added to the side of the deque where the worker thread takes the next task to execute (other threads steal from the other side). If you want near-FIFO scheduling, use enqueue() instead of spawn().
I tried both suggestions with no apparent change in behaviour. I suppose setting the ref count to 1 and using enqueue() instead of spawn() is indeed what I originally wanted, so I'll keep the code this way for now. I still have one task blocking the other tasks anyway.
Maybe it's also useful to explain what you want to achieve, for a second opinion about this arrangement.
Other than just having tasks that are guaranteed not to be interdependent in their execution thread (which I supposed TBB would ensure as much as possible), I'm not sure what more to add. One important thing is that the rescheduling system (in my custom task class) is also used to "delay" the execution of a task. What I mean is that the execution code will first check whether it is "time" (based on a virtual time, which can be slowed or paused, or on real time) to really execute the task. If not, it just reschedules itself (after checking a potential other condition for not rescheduling).

Basically I want my tasks to be able to:
1. delay their execution;
2. be rescheduled an undefined number of times.

For 1), I don't know how to tell TBB that it can forget about the task for some time (is there a way?), so for now I used the simpler approach: reschedule until execution. For 2), the current code would be OK if it weren't apparently blocking tasks while one task just reschedules until its real execution time. Strangely, I have other tasks that reschedule the same way as the blocking task, but it's the only one that blocks the others. Maybe it's in my code; I'm actively trying to find the source of this, but it's not obvious so far. Any suggestion on reorganizing this code to achieve this efficiently is welcome.
0 Kudos
klaim
Beginner
927 Views
Sorry, I totally forgot one important thing: the first tasks I want my system to execute are not spawned/enqueued immediately. I first accumulate them in a tbb::task_list and spawn them only when I reach the call to a run_and_wait() function, which, when it finishes, terminates the whole application. Here is my original run_and_wait() implementation:

[cpp]
void run_and_wait()
{
    m_root_task.spawn( m_task_list );
    m_root_task.wait_for_all();
}
[/cpp]

When I schedule a task, either the system is not running yet, in which case the task is added to the task list; or the system is running, in which case the task was spawned (but I modified that to enqueue, as you suggested). Now, this run_and_wait() function still spawned instead of enqueuing; I totally forgot about that, so I changed it to this:

[cpp]
void run_and_wait()
{
    while( !m_task_list.empty() )
    {
        m_root_task.enqueue( m_task_list.pop_front() );
    }
    m_root_task.wait_for_all();
}
[/cpp]

This apparently works with no blocking. So my problem is solved. However, a confirmation that this is the right way to do it would reassure me. :) If you have any idea how to tell TBB to delay the execution of a task, other than having the task reschedule itself until it's time to execute, I'm interested.
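For what it's worth, the accumulate-then-drain pattern above can be sketched without TBB (the names MiniRunner/schedule/run_and_wait are illustrative stand-ins for the post's task scheduler abstraction, with std::function standing in for the custom task type): tasks scheduled before the system runs are buffered, and run_and_wait() then submits them one by one, in order, the way the modified loop pops tasks off the tbb::task_list.

```cpp
#include <functional>
#include <list>
#include <string>
#include <utility>
#include <vector>

// Buffers tasks until run_and_wait() drains and executes them in FIFO order.
struct MiniRunner {
    std::list<std::function<void()>> m_task_list;  // tasks accumulated before running
    std::vector<std::string> log;                  // records execution order

    void schedule(std::function<void()> t) {
        m_task_list.push_back(std::move(t));
    }

    void run_and_wait() {
        // Drain the buffered list in submission order; in the real system
        // each t() would instead be handed to the scheduler via enqueue().
        while (!m_task_list.empty()) {
            auto t = std::move(m_task_list.front());
            m_task_list.pop_front();
            t();
        }
    }
};
```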
0 Kudos
RafSchietekat
Valued Contributor III
927 Views
One important thing is that the rescheduling system (in my custom task class) is used to "delay" the execution of a taks too. What I mean is that the execution code will first check if it is "time" (based on a virtual time -that can be slower or paused- or on real time) to really execute the task. If not it just reschedule (after checking a potential other condition to not reschedule).
This means that you have many tasks constantly polling, creating a lot of overhead. Even worse, as I neglected to point out before, when you are rescheduling as a safe continuation with a reference count of one, at the end of execute() the scheduler would have to assume that this is a task that probably had children/predecessors that were spawned, stolen and have all finished executing, so I would expect that it would immediately spawn the task (putting it at the head of the queue where it would be picked up next) or even bypass that step as a possible or actual (?) internal optimisation when execute() returns nullptr, in either case effectively causing the worker thread to busy-wait for the right "time". But then I don't see how changing to enqueue() in run_and_wait() could possibly make any difference; maybe someone else could point out something I have overlooked? (Added) Is a previously enqueued task that is rescheduled as a continuation always enqueued again instead of spawned, perhaps?
0 Kudos
klaim
Beginner
927 Views
I think it works better because in my case I had tasks scheduled in this order:
- A
- B
- C
0 Kudos
klaim
Beginner
927 Views
I think that it's because A takes more time than C. A isn't really polling like C does: C does a quick comparison of current time vs. time to execute, then reschedules (until it doesn't). A locks a mutex, then goes into a loop to execute a list of function pointers (currently only 2), then reschedules. It's more work. I'm not sure what the exact impact is here.
0 Kudos
klaim
Beginner
927 Views
(Added) Is a previously enqueued task that is rescheduled as a continuation always enqueued again instead of spawned, perhaps?
I'm not sure, but I tried to enqueue explicitly and it can't work before returning from execute(), so I guess it gets re-enqueued as you say. We really need a specialist to clarify this because I don't understand all the TBB code...
0 Kudos
RafSchietekat
Valued Contributor III
927 Views
Maybe it's purely by accident that the enqueue() appears to cure the problem... Perhaps the problem will reoccur if the enqueue()'ing happens in a different order and/or the program is allowed to run for a longer time (I was sort of grasping at straws about automatic re-enqueue)?
0 Kudos
klaim
Beginner
927 Views
I just tried moving the A and B scheduling after the C scheduling. On the first cold execution (launched automatically just after the build) I got the same bug as originally. All the following executions (without recompiling/relinking anything) executed the tasks correctly. So I agree that it is certainly an accident. Or I'm missing something about what is really happening.
0 Kudos
RafSchietekat
Valued Contributor III
927 Views
OK, at least that red herring is out of the way. Another part of the explanation would be that you have only two hardware threads that are kept busy polling, leaving no resources available to steal and execute a third task. I don't assume you are trying this on a single core, are you? That would be even more problematic. TBB encourages optional parallelism without mandatory concurrency (of any degree), and this is quite the opposite of that. If such is the case (and even if it's not), the solution is to get rid of the busy-waiting.

At least at first you should probably try without recycling. As an experiment, just enqueue() (not spawn()) a copy of the new task with the same polling logic (which can be allocated as an additional child of the dummy root task), and let the current task be automatically deleted at the end of execute(). Next, instead of immediately enqueue()'ing the new task, give it to an independent thread that will enqueue() or perhaps spawn() it at the appropriate time, and stop polling inside the task.

Once that works, you can start thinking about recycling again. If you're worried about the cost of not recycling in the interim, consider embedding a pointer to an allocated structure that you hand over from one task to the next. Otherwise, you'll have to have some way to assure that a task is not executed again while it is still executing from the previous run, which could happen owing to unfortunate timing related to preemption, unless you either have an external thread poll for the reference count to drop or use a dummy child that is spawned at the appropriate time (which implies automatic spawn() of its parent), which may defeat any gains from recycling anyway. It would be nice to have some more hooks into the scheduler, but these are the workarounds familiar to me.
That's only if you really really want to roll your own task-based solution, of course, because I still don't know why you wouldn't just use one of the existing algorithms instead (a pipeline comes to mind, where the input stage could pick the next item from a time-sorted heap of future events after consuming them from a concurrent queue).
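The "time-sorted heap of future events" idea suggested here can be sketched in plain C++ (the names below are illustrative, not TBB code): instead of each task busy-polling the clock, delayed work sits in a min-heap ordered by due time, and a single dispatcher releases only the tasks whose time has come.

```cpp
#include <functional>
#include <queue>
#include <string>
#include <utility>
#include <vector>

// A min-heap of (due_time, task id) pairs: the event with the smallest
// due time is always on top, so the dispatcher never scans the whole set.
struct TimedDispatcher {
    using Event = std::pair<long, std::string>;
    std::priority_queue<Event, std::vector<Event>, std::greater<Event>> heap;

    void schedule_at(long due, std::string id) {
        heap.push({due, std::move(id)});
    }

    // Release every event due at or before 'now'; in the real system these
    // would be handed to the scheduler (enqueue) instead of returned.
    std::vector<std::string> release_due(long now) {
        std::vector<std::string> ready;
        while (!heap.empty() && heap.top().first <= now) {
            ready.push_back(heap.top().second);
            heap.pop();
        }
        return ready;
    }
};
```

This is the shape a pipeline input stage (or a dedicated timer thread) could take: it blocks or sleeps until the earliest due time rather than rescheduling tasks in a busy loop.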
0 Kudos
klaim
Beginner
927 Views
Raf Schietekat wrote:

Another part of the explanation would be that you have only two hardware threads that are kept busy polling, leaving no resources available to steal and execute a third task. I don't assume you are trying this on a single core, are you? That would be even more problematic. TBB encourages optional parallelism without mandatory concurrency (of any degree), and this is quite the opposite of that.

I'm using a Core i7 M 620; it shows 4 cores in the task manager (hyper-threading seems deactivated).

If such is the case (and even if it's not), the solution is to get rid of the busy-waiting.

At least at first you should probably try without recycling. As an experiment, just enqueue() (not spawn()) a copy of the new task with the same polling logic (which can be allocated as an additional child of the dummy root task), and let the current task be automatically deleted at the end of the execute().

Next, instead of immediately enqueue()'ing the new task, give it to an independent thread that will enqueue() or perhaps spawn() it at the appropriate time, and stop polling inside the task.

Once that works, you can start thinking about recycling again. If you're worried about the cost of not recycling in the interim, consider embedding a pointer to an allocated structure that you hand over from one task to the next. Otherwise, you'll have to have some way to assure that a task is not executed again while it is still executing from the previous run, which could happen owing to unfortunate timing related to preemption unless you either have an external thread poll for the reference count to drop or use a dummy child that is spawned at the appropriate time (which implies automatic spawn() of its parent), which may defeat any gains from recycling anyway. It would be nice to have some more hooks into the scheduler, but these are the workarounds familiar to me.

I will try a variant: I have a task that updates virtual clocks. I could make the clock update check whether there are tasks to enqueue. I believe that would be equivalent. Transferring the task body from one task instance to another seems possible; I'll check exactly what it implies.

That's only if you really really want to roll your own task-based solution, of course, because I still don't know why you wouldn't just use one of the existing algorithms instead (a pipeline comes to mind, where the input stage could pick the next item from a time-sorted heap of future events after consuming them from a concurrent queue).

This is not a pipeline I'm setting up. Several different systems have updating tasks that need to be executed repeatedly. It's OK if they are busy; what's not OK is one task blocking all the others. All the tasks are designed to be non-blocking. I first tried the Flow Graph, as it indeed looks like what I was trying to do (but it's a bit less flexible). I hit a problem at the time after many experiments. I don't remember exactly what I found, but it made me think it was not the appropriate construct for my case.
0 Kudos
RafSchietekat
Valued Contributor III
927 Views
I'm using a Core i7 M 620; it shows 4 cores in the task manager (hyper-threading seems deactivated).
Then it remains a mystery to me how one out of three tasks (as the exact grand total?) could possibly be blocked.
This is not a pipeline I'm setting up. Several different system have updating tasks that need to be executed in repetition.
It all depends on what you call a task: TBB tasks are more of an underlying mechanism for implementing higher-level algorithms, and recycling is more of an optimisation thing than a semantics thing, so you could definitely consider another mapping than one-to-one between concepts that merely happen to share the name "task". I initially thought of parallel_do/while(), but managing the order in a separate input stage made me consider a pipeline; if the fixed number of tokens in flight is no obstacle, why not (rhetorical)? But I don't know enough about the situation to say more.
0 Kudos
klaim
Beginner
927 Views
Raf Schietekat wrote:

Quote:

I'm using a Core i7 M 620; it shows 4 cores in the task manager (hyper-threading seems deactivated).

Then it remains a mystery to me how one out of three tasks (as the exact grand total?) could possibly be blocked.

It is not the grand total: there are more tasks, most of them executed only once, some of them rescheduling until the end of the application (which is a game, but not architected in a classical way).

Quote:

This is not a pipeline I'm setting up. Several different system have updating tasks that need to be executed in repetition.

It all depends on what you call a task: TBB tasks are more of an underlying mechanism for implementing higher-level algorithms, and recycling is more of an optimisation thing than a semantics thing, so you could definitely consider another mapping than one-to-one between concepts that merely happen to share the name "task". I initially thought of parallel_do/while(), but managing the order in a separate input stage made me consider a pipeline; if the fixed number of tokens in flight is no obstacle, why not (rhetorical)? But I don't know enough about the situation to say more.

Let me try to clarify then. First, I will use algorithms from TBB on top of this system. I'm not avoiding high-level constructs; I just needed something different in addition to them. Second, the system in question is just a layer over the Task Scheduler. Its goal is to abstract it, in case I want to spawn a task manually, because I know I will have to port the system to other platforms that TBB can't compile on. Third, I'm in a context where the workloads differ widely from second to second.

However, I'm using a different task system outside my task scheduler abstraction. Forget about TBB for a minute; I'll get back to it. My task types are described like this: Task<T>, the template parameter being the kind of input it works with. These Task objects can be put (moved, to be precise) into two kinds of systems:

- my task scheduler abstraction (which only accepts Task<void>): in this case it is assumed that the task will be executed by an unknown thread, and that if it is rescheduled, there is no guarantee that it will always be the same thread. In practice, once the Task is moved into my task scheduler abstraction, I create a custom TBB-derived task instance, called AsyncTask_TBB, which holds the Task and knows how to execute it correctly (taking into account its info: whether it should be rescheduled, until when, whether it has an end condition, etc.).
- TaskChain<T>: basically a container of Task<T>. It's thread-safe (at least I designed it that way, but I don't have another pair of eyes to check; so far it seems correct). It has an execute( const T& ) function that goes through all the contained tasks in order, executes them (if enough time has passed since the last execute() call), and removes the tasks that don't need to be rescheduled. I almost used a pipeline for this, but a pipeline supposes that the tasks are interdependent and don't have the special behaviour I just described, so I had to set up my own system.

Now, the strategy I've set up is meant to be used this way:

- I have several unrelated systems that need to be constantly updated, so they each have one Task<void> which is moved into the task scheduler. This Task<void> body just calls the execute() of a TaskChain<WhateverInputIsNeeded>. This is like setting up different parallel pipelines that loop.
- I have other systems that rely on the previously described systems. They need to be updated in sync with their parent system. To do this, the parent system exposes a way to insert other tasks into its TaskChain, which means all the tasks related to this parent system will be executed in sync, on the same thread per update, in a specific order. It is important to understand that this TaskChain, the more-or-less pipeline, will have a variety of tasks inserted and removed quite frequently, but not on every update, so it becomes the synchronization bottleneck.
- There is also a messaging system where needed, but most messaging happens through the network. Anyway, I use some message queues. Intuitively, the Flow Graph was the most obvious choice for setting up this system, but it didn't fill all my needs.
- The case of the task that is not executed immediately, but delayed, is an exception, and I really think it should be one of the parent systems that keeps a queue of these to be executed later, as you described, in fact. I tried rescheduling because I don't have such a system in place yet and wanted to set up a delayed task to try.

Part of this strategy is to avoid having to expose a thread-safe interface for all my systems. Instead, interdependent systems just sync their tasks by using the same TaskChain, which means there is only a mutex lock when inserting a new task (and I believe it can be removed, but I'm not sure how yet). The important thing I want to achieve is indeed that the workload between parent systems is automatically distributed by TBB.

Which apparently fails sometimes because of the way I'm rescheduling? Scheduling a clone instead of rescheduling a task might be a better solution. Semantically, it is. What I fear is the cost of instantiating this clone, which is why I did it this way. I wish I could open-source this system to show you and get more expert advice, but it's not possible at the moment. I'm not sure this description is clearer than looking at the code, unfortunately.
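A minimal sketch of the TaskChain<T> described above (an assumed shape reconstructed from the description, not the author's actual code): tasks return true to stay in the chain and false to be removed, and execute() runs them in insertion order under a mutex, so tasks sharing a chain always run in sync on one thread per update.

```cpp
#include <functional>
#include <list>
#include <mutex>

// Thread-safe ordered chain of tasks; each task decides per call whether
// it wants to remain scheduled (true) or be dropped (false).
template <typename T>
class TaskChain {
    std::list<std::function<bool(const T&)>> m_tasks;
    std::mutex m_mutex;
public:
    void insert(std::function<bool(const T&)> task) {
        std::lock_guard<std::mutex> lock(m_mutex);
        m_tasks.push_back(std::move(task));
    }

    void execute(const T& input) {
        std::lock_guard<std::mutex> lock(m_mutex);
        for (auto it = m_tasks.begin(); it != m_tasks.end(); ) {
            if ((*it)(input))
                ++it;                      // task wants to be rescheduled
            else
                it = m_tasks.erase(it);    // done: remove it from the chain
        }
    }
};
```

The real TaskChain also gates execution on elapsed time; that check is omitted here to keep the sketch focused on the insert/execute/remove cycle.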
0 Kudos
RafSchietekat
Valued Contributor III
927 Views
It is not the grand total: there are more tasks, much of them being executed only once, some of them rescheduling until the end of the application (which is a game - but not architectured in a classical way).
OK… I thought you might have isolated a minimal reproducer with just 3 tasks. That's another relief! :-)
I know I will have to port the system in other platforms that TBB can't compile in
Any hint which one? Maybe it's just "not yet"...
Scheduling a clone instead of rescheduling a task might be a better solution. Semantically, it is. What I fear is the cost of instantiating this clone, which is why I did this.
It doesn't have to be a full clone, though. It's like the new C++ move (vs. copy), because the old task's only remaining job is to get deleted, and it's cheap enough to move a pointer. With the scalable allocator, overhead is certainly very manageable, and negligible compared to polling. It would be nicer, though, to have more ways of recycling, because now you can't re-enqueue without polling for the end of the current execute (as far as I know?).
I'm not sure this description is clearer than looking at the code unfortunately.
Thanks for trying. Perhaps it will get clearer when I reread it tomorrow morning. :-) (Correction after next posting) i->I
0 Kudos
klaim
Beginner
927 Views
Raf Schietekat wrote:

OK… I thought you might have isolated a minimal reproducer with just 3 tasks. That's another relief! :-)

Indeed. ^__^;

Quote:

I know I will have to port the system in other platforms that TBB can't compile in

Any hint which one? Maybe it's just "not yet"...

Well Android, and potentially some console platforms. Frankly it's the least important of the reasons, so you can ignore it.

Quote:

Scheduling a clone instead of rescheduling a task might be a better solution. Semantically, it is. What I fear is the cost of instantiating this clone, which is why I did this.

It doesn't have to be a full clone, though. It's like the new C++ move (vs. copy), because the old task's only remaining job is to get deleted, and it's cheap enough to move a pointer. With the scalable allocator, overhead is certainly very manageable, and negligible compared to polling. It would be nicer, though, to have more ways of recycling, because now you can't re-enqueue without polling for the end of the current execute (as far as i know?).

It looks like it's not a big change in my code, so I will first try this and see whether I get good results whatever the order. Fortunately I isolated the code enough to try different TBB usages without changing the semantics of the program (the order of task execution is not important since they are asynchronous tasks).

Quote:

I'm not sure this description is clearer than looking at the code unfortunately.

Thanks for trying. Perhaps it will get clearer when I reread it tomorrow morning. :-)

Sorry about that. ;____;
0 Kudos
klaim
Beginner
927 Views
So, as suggested, I changed this code:

[cpp]
tbb::task* execute() override // this is AsyncTask_TBB::execute(), my custom tbb::task-derived class
{
    // blah blah blah
    switch( task_result )
    {
    case( TaskEndRequest::RESCHEDULE ):
    {
        // Make sure this task is re-executed as required.
        // See: http://stackoverflow.com/questions/11847689/how-to-reschedule-a-concurrent-task/11849373#11849373
        this->set_ref_count( 1 );
        this->recycle_as_safe_continuation();
        break;
    }
    // blahblah
    }
    return nullptr;
}
[/cpp]

To this:

[cpp]
tbb::task* execute() override // this is AsyncTask_TBB::execute(), my custom tbb::task-derived class
{
    // blah blah blah
    switch( task_result )
    {
    case( TaskEndRequest::RESCHEDULE ):
    {
        // Make sure this task is re-executed as required.
        m_task_scheduler.schedule( std::move( m_task ) );
        break;
    }
    // blahblah
    }
    return nullptr;
}
[/cpp]

Here m_task is my special Task type object, which is move-only (like std::thread, if you will), and m_task_scheduler is (a reference to) the unique task scheduler abstraction. So basically I just take the task content and schedule it again, which creates another AsyncTask_TBB (my tbb::task-based task) and enqueues it. It is exactly as if I cloned the executing task, since all the information about the task is in m_task.

This apparently works, whatever the order of the different tasks I'm trying. :D Obviously I should wait until I have a more complete application (I have about 15% of it right now) to make sure, but it seems like the correct solution. I just hope reallocating tasks in a loop this way will not be a problem in the future. Anyway, it seems closer to the semantics I needed.
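The re-submission pattern in this post can be sketched without TBB (MiniScheduler, MoveOnlyTask and TaskEndRequest below are illustrative stand-ins): instead of recycling the executing wrapper, the move-only task body is moved out and handed back to the scheduler, which wraps it in a fresh task object and enqueues it again.

```cpp
#include <functional>
#include <queue>
#include <utility>

enum class TaskEndRequest { RESCHEDULE, END };

// Move-only task body, like the post's Task type (or std::thread).
struct MoveOnlyTask {
    std::function<TaskEndRequest()> body;
    explicit MoveOnlyTask(std::function<TaskEndRequest()> b) : body(std::move(b)) {}
    MoveOnlyTask(MoveOnlyTask&&) = default;
    MoveOnlyTask& operator=(MoveOnlyTask&&) = default;
    MoveOnlyTask(const MoveOnlyTask&) = delete;
    MoveOnlyTask& operator=(const MoveOnlyTask&) = delete;
};

struct MiniScheduler {
    std::queue<MoveOnlyTask> queue;

    void schedule(MoveOnlyTask t) { queue.push(std::move(t)); }

    // Run queued tasks; a RESCHEDULE result re-submits the moved body,
    // standing in for allocating a fresh AsyncTask_TBB and enqueuing it.
    int run(int max_steps) {
        int steps = 0;
        while (!queue.empty() && steps < max_steps) {
            MoveOnlyTask t = std::move(queue.front());
            queue.pop();
            ++steps;
            if (t.body() == TaskEndRequest::RESCHEDULE)
                schedule(std::move(t));
        }
        return steps;
    }
};
```

Because a rescheduled task goes to the back of the queue rather than being recycled in place, other pending tasks get a turn between two executions of the same task, which is the behaviour the thread was after.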
0 Kudos
RafSchietekat
Valued Contributor III
927 Views
Let me try to clarify then.
I hope I picked up a little more on a second reading. It's true that a pipeline always runs items through to the end. I don't know how many applications would benefit from being able to have them dropped somewhere in the middle, perhaps quite a few. Now you have to simulate that (using a special pointer value, or perhaps something like "bool m_zombie"), which is not very elegant and also suboptimal.

More seriously, pipelines are also quite unfair to each other, so you would have to run each from a separate application thread (giving them their own arenas to avoid starvation-causing entanglement), but this could lead to oversubscription instead. That limitation definitely warrants an overhaul to allow several pipelines to be run together as a group. Meanwhile, you should use your own solution (especially if you also want to insert additional "stages" while the "pipeline" is running).

TBB was originally conceived to optimise throughput, where it didn't matter that sometimes the scheduling was very unfair. In your system, take care to avoid spawn() where it could lead to high-level unfairness, so use enqueue() instead in the right places. Note that running an enqueued task takes priority over stealing a spawned task (last time I looked anyway), in an effort to improve locality (good for performance), but that also means it still does not entirely prioritise minimising backlog. So, while starvation is now avoided, there's still some lingering unfairness that may be relevant in a real-time application, although I haven't seen any complaints about this yet, so either it's not very serious or I'm just seeing ghosts. Recycling as a continuation internally does the same as spawn()'ing the recycled task, and you may have to work around that, as you have already experienced. Perhaps others can suggest more general patterns for architecting this kind of application?
Well Android, and potentially some console platforms. Frankly it's the least important of the reasons, so you can ignore it.
Android is currently being targeted: check out (as in have a look at) the latest development release.
m_task_scheduler.schedule( std::move( m_task ) );
I only used move/copy as an analogy. Maybe there are situations where perhaps locality considerations make move the winner, but don't forget about pointers.
This apparently work, whatever the order of the different tasks I'm trying. :D
Great!
0 Kudos
klaim
Beginner
927 Views
Raf Schietekat wrote:
Meanwhile, you should use your own solution (especially if you also want to insert additional "stages" while the "pipeline" is running).
Being able to add and remove work from a "pipeline" is actually the most important feature in my case, as the workload changes with the context from second to second.

Android is currently being targeted: check out (as in have a look at) the latest development release.

Good news!

Quote:

m_task_scheduler.schedule( std::move( m_task ) );

I only used move/copy as an analogy. Maybe there are situations where perhaps locality considerations make move the winner, but don't forget about pointers.

I didn't move it because of your analogy, but because the object has move-only semantics. I agree pointers are supposed to be faster (this object is bigger than a pointer, as it holds all the extra information about the task), but it makes the code simpler to understand so far and wasn't a bottleneck, so I'll keep it that way for now. It also avoids having to use new, which makes things more predictable. I'll switch to unique pointers if it turns out to be a performance bottleneck.
0 Kudos
Wooyoung_K_Intel
Employee
927 Views
If you want to use recycle_as_safe_continuation() to reuse a task and reschedule it at a later time, you should set the ref-count to at least 2; when the time comes, you explicitly decrement the count, and if the ref-count reaches 0, you spawn it again. If you set it to 1, by the time the thread returns to the TBB scheduler, it decrements the ref-count, sees it go down to 0, and immediately reschedules the task by pushing it onto the task queue. It is then picked up by the scheduler, creating the illusion that it blocks the execution of other tasks.

Explicitly manipulating the ref-count is extremely error-prone; we don't usually recommend it unless users have a clear need for it. Please consult the TBB Reference Manual for the differences between recycle_as_continuation() and recycle_as_safe_continuation().
0 Kudos
RafSchietekat
Valued Contributor III
901 Views
If you want to use 'recycle_as_safe_continuation' to reuse a task and reschedule it at a later time, you should set the ref-count to at least 2 and when the time comes, you need to explicitly decrement the count, and if the ref-count is 0, then spawn it again.
This was already considered, but the problem is that scheduling (re-)spawned tasks is done in the "wrong" order relative to "time" (by the same thread at least), while reliably re-enqueuing requires a polling loop instead of a simple conditional enqueue() (which would otherwise make the task re-spawned or re-enqueued based on execution timing issues): the reference count must be seen to have dropped to 1 already (to avoid re-spawn at the end of execution) before it is set to 0 (just to avoid any possible complaints during debugging, because it is not needed here) and the task then unconditionally enqueued.

Because polling can easily lead to problems, I suggested abandoning the idea of recycling the TBB tasks altogether. On the other hand, if a central thread does the timed spawning, the tasks will always be stolen by the workers, so they don't need to be enqueued for scheduling in the right order, and the problem described above goes away. But routine stealing sounds like a bad lifestyle, so then I would wonder how costly it is relative to taking from the enqueue() queue (is it more expensive or not?), and also whether it wouldn't be better to consider affinity (and what that would mean for relative timing?).
0 Kudos
Reply