Intel® oneAPI Threading Building Blocks
Ask questions and share information about adding parallelism to your applications when using this threading library.

TBB suitability

idan192
Beginner
1,185 Views
EDITED:
I have a problem to solve. After studying the TBB manuals, I've become quite unsure whether TBB is an appropriate tool to use. Could you please give me some advice on this?
The system is a kind of event-based simulator (it's a library). It works in cycles. Initially, the user supplies a set of functions they want executed. Each cycle, the scheduler does the following:
1. Execute the functions currently in the set - this is the computationally intensive step.
2. Do some bookkeeping. Among other things, decide which functions to execute in the next cycle: the functions executed in step 1 can make requests, possibly scheduling the same functions for re-execution or adding new ones - this step is relatively light computationally.
3. Go to step 1, unless some function asked to terminate the simulation.
Now, I could use parallel_while, parallel_invoke, or perhaps the task scheduler directly to run step 1, then execute step 2 serially, and this works fine. The real problem arises from another requirement: a user-supplied function must be able to invoke a special command, say leave(), that interrupts its execution immediately; should the scheduler decide to re-execute this function, it must resume from the point where leave() was invoked, with all its local variables restored.
So it looks like a requirement for preemption, and according to the TBB tutorial, using the task scheduler seems inappropriate here (because it may be the case that a function has to wait for a few cycles). However, I don't actually need a task to be stuck in the scheduler all that time; I just need to save the stack state somehow.
Do you have any idea if this could be implemented effectively with TBB?
Cheers,
Idan
11 Replies
robert-reed
Valued Contributor II
You suggest preemption is somehow required, but is this problem any more complicated than something like this?

[cpp]    // Set up initial function list gen_tasks

    for (bool simulating = true; simulating; ) {
        tbb::parallel_for( tbb::blocked_range<size_t>(0, genTaskCount),
            [=] (const tbb::blocked_range<size_t> &r) {
                for (size_t i = r.begin(); i != r.end(); ++i) {
                    gen_tasks[i].fn(gen_tasks[i].data);
                }
            });

        // Accumulate next tasks into the gen_tasks list. Step the
        // generation and test for termination
    }[/cpp]

Basically, you build a list of tasks to execute and then launch a set of workers to process the current generation's list. They each may generate outputs and schedule tasks for the next generation; presumably each of these tasks can operate independently and safely in a threaded environment. The comment at the bottom of the loop would be expanded into code that rebuilds the gen_tasks array with new task pointers and new data, adjusting the next generation's genTaskCount as appropriate. Implementation details can be worked out, but I'd probably first try an enumerable_thread_specific collection for registering the next-generation task-data pairs (each worker thread can collect these without interfering with the others), then migrate the details from the thread-local structures to gen_tasks during the generation change.
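A minimal sketch of that generational scheme, using std::thread and plain per-worker vectors as stand-ins for TBB's parallel_for and enumerable_thread_specific (the GenTask and run_generation names are invented here for illustration):

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// One generation: each worker gets a private buffer for next-generation
// tasks (standing in for an enumerable_thread_specific collection), so
// tasks can schedule successors without locking; the buffers are merged
// serially at the generation boundary.
struct GenTask {
    std::function<void(std::vector<GenTask>&)> fn;  // may push successors
};

std::vector<GenTask> run_generation(const std::vector<GenTask>& gen,
                                    unsigned nworkers) {
    std::vector<std::vector<GenTask>> local(nworkers);
    std::vector<std::thread> pool;
    for (unsigned w = 0; w < nworkers; ++w) {
        pool.emplace_back([&gen, &local, w, nworkers] {
            // Simple strided partition of the current generation's tasks.
            for (std::size_t i = w; i < gen.size(); i += nworkers)
                gen[i].fn(local[w]);
        });
    }
    for (auto& t : pool) t.join();

    std::vector<GenTask> next;  // serial merge: no races possible here
    for (auto& buf : local)
        next.insert(next.end(), buf.begin(), buf.end());
    return next;
}
```

The outer simulation loop would call run_generation repeatedly until the returned list is empty or a termination flag is raised.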

idan192
Beginner
Oh, I'm sorry, there was a mistake in the question; I had copy-pasted only part of it. It's edited now.
robert-reed
Valued Contributor II
Yes, that does complicate the behavior quite a bit, though I still wouldn't call it preemption, if I understand what you mean. The fact that it's an escape from a scheduled, executing function at a program-specified boundary (the leave() call) makes it at worst a point for cooperative multitasking, and potentially for blocking threads. If the leave() call could just execute and return to resume the user function, the implementation could be pretty easy. If the scheduler must resume this invocation of the user function following the leave() call, it's a bit hairier, since the scheduler would somehow need to resume what amounts to a blocked thread. And it raises other questions, such as how many simultaneous user-generated leave() calls you might need to support, since each one, if it requires blocking a thread, may pose serious resource concerns.
idan192
Beginner
I'm not sure I fully understand the options.
The user code example goes like this:
[cpp]void do_job(){
    while(1){
        int x = obtain_global_variable_safely();
        std::cout << "Value is: " << x << std::endl;
        leave();
    }
}[/cpp]

I cannot change the API; it was written long ago. I thought of three possible solutions:

  1. Use inline assembler to save the registers and stack. That would be quite messy and unportable. Actually, the current version does something like this, but resorts to an external package to do the job in a portable way. Needless to say, though, it segfaults when I use it in an MT context.
  2. Create threads (pthreads/tbb_thread/Boost.Thread), then wait on a condition variable. Here I don't know: could it interfere with the task scheduler? Would there be too much overhead? In any case, I have to save the stack and registers, so context-switch overhead is probably inevitable; are there other kinds of overhead?
  3. I guess the best option would be to wait inside the task on a condition variable, but somehow take the task out of the scheduler for the wait period. Is that possible?
It's a little bit frustrating: there is actually no preemption here. The second kind of function is just like the first; it just needs to be able to start at an arbitrary point stored earlier. Obviously there is more overhead here (storing/loading the context), but it's not called preemption, I guess.
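For what it's worth, option 2 in the list above can be sketched with standard C++ primitives (the Suspendable class name is invented here): leave() parks the user function's own thread on a condition variable, so its whole stack, and hence every local variable, survives untouched until the scheduler resumes it.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Option 2 as a minimal sketch: each suspendable user function runs on
// a dedicated thread. leave() parks that thread on a condition variable
// and resume() wakes it; the stack (all locals) is preserved for free
// because the thread never unwinds -- at the cost of one OS thread per
// suspended function.
class Suspendable {
    std::mutex m;
    std::condition_variable cv;
    bool go = false;  // "permit" flag; guards against lost wake-ups
public:
    void leave() {  // called from inside the user function
        std::unique_lock<std::mutex> lk(m);
        cv.wait(lk, [this] { return go; });
        go = false;  // consume the permit
    }
    void resume() {  // called by the scheduler
        { std::lock_guard<std::mutex> lk(m); go = true; }
        cv.notify_one();
    }
};
```

Whether such blocked threads interfere with the TBB task scheduler (by occupying worker slots) is exactly the open question in point 2.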
Thanks,
Idan
robert-reed
Valued Contributor II
If that is the most complex user function you might expect, then the updated requirements are overly complex; there really is no context to preserve across this leave(). I suspect instead that you've shown a trivial example of a user function. Can a user function have multiple calls to leave()? (If so, it almost feels like a coroutining style of functional interaction.) What's the worst-case user function you can share? What is the correct behavior for user functions in a multi-threaded environment?
Are there constraints to guarantee that multiple simultaneous user functions play nicely with each other?

Some of the declared constraints would be hard to implement in any environment. In particular, the need to place the leave() call in the midst of an instruction sequence, so that the semantics are to call the head of the function when it is initially scheduled but to treat the leave() call as a coroutine break from which execution resumes the next time the function is scheduled, sounds like it would require some special sauce in the scheduler to handle context preservation and coroutining. Every leave() call imposes a place where a stack context would need to be preserved. Management of these contexts could get really hairy in the general case.
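To illustrate what that context preservation involves, here is a minimal coroutine-style break built on POSIX ucontext, presumably the same family of tricks Idan's existing assembler-based package uses (ucontext is POSIX-only and deprecated on some platforms; Boost.Context is a common modern replacement):

```cpp
#include <cassert>
#include <ucontext.h>

// A coroutine-style leave(): the user function gets its own stack;
// leave() swaps back to the scheduler context, and the scheduler swaps
// in again later to resume right after the leave() call, locals intact.
ucontext_t sched_ctx, fn_ctx;
char fn_stack[64 * 1024];
int trace[4];
int ti = 0;

void leave() { swapcontext(&fn_ctx, &sched_ctx); }

void user_fn() {
    int local = 1;           // lives on fn_stack, survives the leave()
    trace[ti++] = local;
    leave();                 // suspend here
    trace[ti++] = local + 1; // resumes with 'local' restored
}

void run_demo() {
    getcontext(&fn_ctx);
    fn_ctx.uc_stack.ss_sp = fn_stack;
    fn_ctx.uc_stack.ss_size = sizeof fn_stack;
    fn_ctx.uc_link = &sched_ctx;       // where to go when user_fn returns
    makecontext(&fn_ctx, user_fn, 0);
    swapcontext(&sched_ctx, &fn_ctx);  // run until the first leave()
    trace[ti++] = 0;                   // scheduler bookkeeping happens here
    swapcontext(&sched_ctx, &fn_ctx);  // resume after leave()
}
```

Each suspended function needs its own stack buffer like fn_stack, which is precisely the resource-management concern raised above.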
Anton_Pegushin
New Contributor II
Hello, I think there's a way to implement what you're looking for on top of tasks, but I don't see a way around the 'stack restoration' mechanism you're already using; maybe someone else can advise. So, from what I understand, the task scheduler's recycling API should do the trick.
1. You spawn a task with a user function, and then a leave() command comes. The function needs to stop and save its state, and the whole task needs to be stored somewhere for possible resumption.
2. You implement the leave() handler as follows: in your task, set a flag marking it cancelled (so that when it gets resumed, the first lines of my_func_task::execute() can check the 'cancelled' flag and use your 'stack restoration' mechanism), save the state, and then recycle_as_continuation() the current task. Set a non-zero reference count and save a reference to this task, so you can explicitly spawn it when the task scheduler decides to resume it.
idan192
Beginner
Well, the user code may be an arbitrary function with multiple calls to leave(), so it really is coroutine semantics. The only "good" thing is that the user functions are not allowed to communicate except through a special API that the library provides, so I can intercept communication to prevent data races. The communication API is a "channels" thing:
[cpp]class user_class{
    channel_t ch_in;
    channel_t ch_out;
    channel_t ch_err;

    void user_function(){
        int x = 5;
        while(!ch_in.try_read()){
            int z = 20;
            leave();
            ch_out.write(x*60+z);
            if(global_get_stop_flag())
                break;
        }
        if(global_stop_not_allowed())
            ch_err.write(-1);
    }
};[/cpp]
Assume global_* do not pose a problem here.
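Since all inter-function communication goes through the library, a minimal thread-safe channel_t matching the try_read()/write() calls above could look like this (a hypothetical sketch; the real channel_t presumably also has a form of try_read() that delivers the value):

```cpp
#include <cassert>
#include <deque>
#include <mutex>

// Hypothetical minimal channel_t: a mutex-guarded FIFO, so the library
// mediates all communication between user functions and keeps it
// race-free even when the functions run on different worker threads.
class channel_t {
    std::deque<int> q;
    std::mutex m;
public:
    void write(int v) {
        std::lock_guard<std::mutex> lk(m);
        q.push_back(v);
    }
    bool try_read(int &out) {          // non-blocking: false if empty
        std::lock_guard<std::mutex> lk(m);
        if (q.empty()) return false;
        out = q.front();
        q.pop_front();
        return true;
    }
    bool try_read() {                  // discard-the-value form, as in the sample
        int dummy;
        return try_read(dummy);
    }
};
```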
idan192
Beginner
Hello Anton Pegushin,
Just to make sure I understand your answer: you are saying to use the recycle_as_continuation technique as follows:
1. When leave() is invoked, 'save the context', then set the ref_count, call recycle_as_continuation(), and set a 'cancelled' flag.
2. When the scheduler wants to resume the task, it resumes the continuation, sees the 'cancelled' flag, and thus goes to 'restore the context'.
Well, the real problem is saving/restoring the context. Yes, I have a library that does the assembler-code magic, but it's not thread-safe. It looks like there are no easy solutions here.
Thanks,
Idan
05522541
Beginner
Hi,

Can't you just increment the task's ref_count and then call wait_for_all() to block the task, and then, when you want to resume it, decrement the count from another thread?

Daniel
Anton_Pegushin
New Contributor II
Yes, that's what I suggested, right. My logic is that this 'recycling' approach should have smaller overhead than the alternative of keeping many more threads in the pool than needed and waking an additional thread each time an already-working thread blocks in leave().
I'm curious: why is the save/restore not thread-safe? It seems like you're operating on thread-local data in the context of the owner thread... I'm sure I'm missing something and would definitely like to know what it is :). Thanks.
Anton_Pegushin
New Contributor II
Hi, if the task is already running, incrementing its reference count from within its execute() function will not accomplish anything unless the task is recycled. If it's not recycled, it's going to be destroyed after it's done executing.
wait_for_all() does not block a task. It's an entry point for a master thread to join the team of worker threads and start helping out with task execution.