[cpp]template<typename T>
class tbb_future
{
public:
    tbb_future(T (*f)()); // call the passed function
    T get_value();        // get the value and block until the computation is finished
};[/cpp]
Ideally get_value() will not just spin and can be called by multiple threads...
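For reference, here's a minimal sketch of just the behavior I mean, not a proposal for the implementation (it's not TBB at all; blocking_future and the plain std::thread/condition-variable machinery are made up for illustration):

[cpp]// Sketch of the intended semantics only: the result is computed once,
// get_value() blocks without spinning, and any number of threads may call it.
#include <condition_variable>
#include <mutex>
#include <thread>

template<typename T>
class blocking_future
{
public:
    explicit blocking_future(T (*f)())
        : done(false),
          worker([this, f] {
              T v = f();                            // run the computation
              std::lock_guard<std::mutex> lock(m);
              value = v;
              done = true;
              cv.notify_all();                      // wake every blocked get_value() caller
          })
    {}

    ~blocking_future() { worker.join(); }

    T get_value()
    {
        std::unique_lock<std::mutex> lock(m);
        cv.wait(lock, [this] { return done; });     // sleep (not spin) until the result is ready
        return value;
    }

private:
    std::mutex m;
    std::condition_variable cv;
    bool done;
    T value;
    std::thread worker;
};[/cpp]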
Mike
It should be fairly straightforward to use tasks and dependencies, if your program can tolerate, performance-wise, that later callers of get_value() will be blocked until the result has been evaluated (not being able to do anything useful in the meantime) - unless somebody can remind me how to simulate diamond dependencies with the current state of TBB. Do you want to try that without knowing in advance that it will give a satisfying result?
This is the only implementation I could come up with that works:
[cpp]template<typename T>
class tbb_future
{
public:
    tbb_future(T (*f)())
    {
        tbb::task *t = new(tbb::task::allocate_root()) tbb::empty_task;
        root = t;
        root->set_ref_count(2);
        tbb::task *future = new(root->allocate_child()) future_task(f, &value);
        root->spawn(*future);
    }

    T get_value()
    {
        if(root)
        {
            tbb::spin_mutex::scoped_lock lock(myMutex);
            if(root)
            {
                root->wait_for_all();
                root = 0;
                lock.release();
            }
        }
        return value;
    }

private:
    tbb::atomic<tbb::task*> root;
    T value;
    tbb::spin_mutex myMutex;

    class future_task : public tbb::task
    {
    public:
        future_task(T (*f)(), T *val) : func(f), value(val) {}

        virtual tbb::task* execute()
        {
            *value = func();
            return 0;
        }

        virtual ~future_task() {}

    private:
        T (*func)();
        T *value;
    };

    friend class future_task;
};[/cpp]
[cpp]template<typename T>
class tbb_future
{
public:
    tbb_future(T (*f)())
    {
        done = false;
        tbb::task *future = new(tbb::task::allocate_root()) future_task(f, &queue);
        future->spawn(*future);
    }

    T get_value()
    {
        if(done)
            return value;
        T t;
        queue.pop(t);
        if(!done)
        {
            value = t;
            done = true;
        }
        queue.push(t);
        return value;
    }

private:
    tbb::concurrent_queue<T> queue;
    tbb::atomic<bool> done;
    T value;

    class future_task : public tbb::task
    {
    public:
        future_task(T (*f)(), tbb::concurrent_queue<T> *q) : func(f), queue(q) {}

        virtual tbb::task* execute()
        {
            queue->push(func());
            return 0;
        }

        virtual ~future_task() {}

    private:
        T (*func)();
        tbb::concurrent_queue<T> *queue;
    };

    friend class future_task;
};[/cpp]
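For context, driving either version might look something like this (slow_answer is just a made-up stand-in for an expensive computation):

[cpp]#include <iostream>
#include "tbb/task_scheduler_init.h"

int slow_answer() { return 42; }   // stands in for an expensive computation

int main()
{
    tbb::task_scheduler_init init;
    tbb_future<int> fut(&slow_answer);          // constructor spawns the computation
    // ... the caller is free to do other work here ...
    std::cout << fut.get_value() << std::endl;  // blocks until the result is ready
    return 0;
}[/cpp]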
The first implementation looks like what I had in mind. I would not use a spin_mutex, though: the assumption is that the future is heavy enough for a task-based future to be beneficial, so tbb::mutex seems more appropriate. It looks as though root is leaked right after wait_for_all(); lock.release() is merely a distraction. What happens if you destroy root afterwards, or instead use spawn_root_and_wait(root)? I haven't looked at the second implementation yet.
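Concretely, something like this replacement for get_value() in the first implementation is what I have in mind (just an untested sketch reusing the members above, with myMutex changed to a tbb::mutex):

[cpp]T get_value()
{
    if(root)
    {
        tbb::mutex::scoped_lock lock(myMutex);   // tbb::mutex instead of spin_mutex
        if(root)
        {
            root->wait_for_all();                // block until future_task has finished
            root->destroy(*root);                // reclaim the empty_task so it isn't leaked
            root = 0;
        }
    }
    return value;
}[/cpp]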
It also occurred to me (only just now, sorry for that) that the code might be calling get_value() at a greater depth than where future_task was spawned, which might cause a deadlock (anyone from the TBB team?). Maybe you can trace some depth values and see whether that provides a clue, before that possibility is explored any further.
We looked at implementing futures in the original version of TBB. The problem was that all the efficient implementations deadlocked for interesting cases where futures used other futures. The root problem is that futures, as typically specified, leave it to the system to figure out an evaluation order, and there are subtle conflicts between "figure it out" and the way task stealing can trap a partially evaluated task on a thread's stack.
Here's a solution that I thought would work, based on the Two Mouths example from the TBB book. But it claims that deadlock is detected if two tasks call get_value() when I use a task_scheduler_init(1). And it sometimes deadlocks at wait_for_all() when I use a task_scheduler_init(2) - and future_task::execute() never executes! Do you know what's going on here? I must have some misconception about TBB... I've appreciated the comments so far. Thanks!
[cpp]template<typename T>
class tbb_future
{
public:
    tbb_future(T (*f)())
    {
        done = false;
        future = new(tbb::task::allocate_root()) future_task(f, this);
        future->spawn(*future);
    }

    T get_value()
    {
        if(!done)
        {
            cout << "acquiring" << endl;
            tbb::spin_mutex::scoped_lock lock(mutex);
            cout << "acquired" << endl;
            tbb::task *temp_root = 0;
            if(!done)
            {
                temp_root = new (tbb::task::allocate_root()) tbb::empty_task;
                tbb::empty_task *child = new (temp_root->allocate_child()) tbb::empty_task;
                temp_root->set_ref_count(2);
                future->add_task(child);
            }
            lock.release();
            if(temp_root)
            {
                // what happens if child task is spawned before this line is run?
                cout << "waiting" << endl;
                temp_root->wait_for_all();
            }
        }
        return value;
    }

private:
    void finish(const T &v)
    {
        tbb::spin_mutex::scoped_lock lock(mutex);
        value = v;
        done = true;
    }

    class future_task : public tbb::task
    {
    public:
        future_task(T (*f)(), tbb_future *fut) : func(f), future(fut) {}

        virtual tbb::task* execute()
        {
            cout << "executing" << endl;
            future->finish(func());
            cout << "finished!" << endl;
            for(size_t i = 0; i < dependent_tasks.size(); ++i)
                spawn(*dependent_tasks[i]);
            return 0;
        }

        void add_task(tbb::task *t) { dependent_tasks.push_back(t); }

        virtual ~future_task() {}

    private:
        T (*func)();
        tbb_future *future;
        tbb::concurrent_vector<tbb::task*> dependent_tasks;
    };

    tbb::atomic<bool> done;
    T value;
    tbb::spin_mutex mutex;
    tbb::atomic<future_task*> future;

    friend class future_task;
};[/cpp]
Here's how I run it:
[cpp]class get_future : public tbb::task
{
public:
    get_future(tbb_future<int> &f) : fut(f) {}

    tbb::task* execute()
    {
        cout << "getting value" << endl;
        cout << "got value: " << fut.get_value() << endl;
        return 0;
    }

    tbb_future<int> &fut;
};

int func()
{
    cout << "sleeping" << endl;
    Sleep(2000);
    cout << "sleeping" << endl;
    Sleep(2000);
    return 4;
}[/cpp]
[cpp]int main()
{
    tbb::task_scheduler_init init;

    tbb::task *s;
    tbb_future<int> fut(func);
    s = new (tbb::task::allocate_root()) get_future(fut); s->spawn(*s);
    s = new (tbb::task::allocate_root()) get_future(fut); s->spawn(*s);
    s = new (tbb::task::allocate_root()) get_future(fut); s->spawn(*s);
    s = new (tbb::task::allocate_root()) get_future(fut); s->spawn(*s);
    s = new (tbb::task::allocate_root()) get_future(fut); s->spawn(*s);
    Sleep(100);
    int val = fut.get_value();
    std::cout << endl << val << endl;
    return 0;
}[/cpp]
Thanks!
Again, I see a task being leaked (temp_root). The name "future" for a pointer to a "future_task" is a bit confusing (I would use m_future_task myself). I hadn't thought of this possibility for a worker to be able to do useful work while waiting for a future's value, but I still believe that you may run into problems by pinning all your hope on the future_task, which may be caught up in an unfortunate entanglement (see my earlier suggestion).
(Added: Meanwhile, I saw and responded to Gunjan Rawal's request.)
If you create a future in task A, and task A subsequently spawns B, which calls get_value() before the future_task has started executing, then you'll have a deadlock, because the waiting thread will not consider the future_task as an option for keeping busy (maybe you'll need some more distance, but you get the picture). Getting out of this situation would require another worker thread (and TBB is about optional concurrency, so everything should work with just one thread) that isn't itself caught up in work at a greater depth than future_task. So there are many ways for this to cause problems, and this seems clear enough now not to need verification anymore.
Hence my suggestion to give a future_task a chance to create concurrency, but to take the assignment away from it if it hasn't started work by get_value() time. It's like a workaround for not being able to dynamically update a directed acyclic graph of dependencies (more than one place in the code may call get_value()) and get the future_task respawned at a more appropriate depth (if that concept still holds by then).
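In other words, something along these lines (a toy illustration with plain std::atomic and std::thread rather than TBB tasks; compute, started and done are made-up names): whoever claims the computation first runs it inline, so nobody ever blocks on a task that may be trapped on another thread's stack.

[cpp]#include <atomic>
#include <iostream>
#include <thread>
#include <vector>

// Illustrative only: the first caller to claim the job runs it; everyone else
// waits for the published result.
std::atomic<bool> started(false);
std::atomic<bool> done(false);
int value;

int compute() { return 42; }

int get_value()
{
    if (!done.load())
    {
        if (!started.exchange(true))        // first caller claims the computation
        {
            value = compute();              // evaluate right here
            done.store(true);
        }
        else
        {
            while (!done.load())            // others wait; a real version would block
                std::this_thread::yield();  // or execute other tasks instead of yielding
        }
    }
    return value;
}

int main()
{
    std::vector<std::thread> threads;
    for (int i = 0; i < 4; ++i)
        threads.emplace_back([] { std::cout << get_value() << "\n"; });
    for (std::thread &t : threads)
        t.join();
    return 0;
}[/cpp]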
- Construct a future for each B(i,j) required for the recursive computation of B(n,k). It works out to a rectangular subregion of Pascal's triangle. Don't worry about arithmetic wrap-around. E.g., actually compute B(n,k) mod 2^32.
- Link up the futures so that evaluation of B(i,j) computes 1 if j==0 or i==j. Otherwise compute it by spawning the futures for B(i-1,j-1) and B(i-1,j) and adding their results. The key point is that there should be only one future created and evaluated for each B(i,j). The dependence graph becomes a rectangular grid.
- Start the future for B(n,n/2) for a suitably high value of n.
I recall the hang problem was related to the way that a thread working on a future X could end up waiting for another future Y to complete, and, to keep itself busy, steal another future Z that ended up needing the result of X.
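To make the dependence structure of that stress test concrete, here is a rough sketch using plain C++11 std::shared_future with deferred evaluation instead of a TBB-based future (so it exercises the grid of dependencies, not the task scheduler; names like grid are made up):

[cpp]#include <algorithm>
#include <cstdint>
#include <future>
#include <iostream>
#include <map>
#include <utility>

int main()
{
    const int n = 20, k = n / 2;
    // Exactly one future per required B(i,j); the keys form the rectangular grid described above.
    std::map<std::pair<int,int>, std::shared_future<std::uint32_t> > grid;

    for (int i = 0; i <= n; ++i)
    {
        for (int j = std::max(0, k - (n - i)); j <= std::min(i, k); ++j)
        {
            if (j == 0 || i == j)
            {
                std::promise<std::uint32_t> p;   // base case: B(i,0) = B(i,i) = 1
                p.set_value(1);
                grid[std::make_pair(i, j)] = p.get_future().share();
            }
            else
            {
                // B(i,j) waits on B(i-1,j-1) and B(i-1,j) and adds their results;
                // unsigned arithmetic gives the "mod 2^32" wrap-around for free.
                std::shared_future<std::uint32_t> a = grid[std::make_pair(i - 1, j - 1)];
                std::shared_future<std::uint32_t> b = grid[std::make_pair(i - 1, j)];
                grid[std::make_pair(i, j)] =
                    std::async(std::launch::deferred,
                               [a, b] { return static_cast<std::uint32_t>(a.get() + b.get()); }).share();
            }
        }
    }
    std::cout << "B(" << n << "," << k << ") mod 2^32 = "
              << grid[std::make_pair(n, k)].get() << std::endl;
    return 0;
}[/cpp]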
This is not a situation I would have considered. But doesn't TBB preclude stealing tasks at the same depth or shallower? So wouldn't the computation proceed in the order dictated by the order of spawning, except for a window the size of the number of workers? Then either all dependencies would be resolved (or soon resolved), or the computation would be hopelessly stuck waiting for futures that have not started executing, in which case my suggestion could come to the rescue. (Sorry if there are any obvious holes in this reasoning; I'm stuck in fact-finding mode for now.)
[cpp]template<typename T>
class tbb_future
{
public:
    tbb_future(T (*f)()) : func(f)
    {
        started = false;
        done = false;
        my_future = new(tbb::task::allocate_root()) future_task(f, this);
        my_future->spawn(*my_future);
    }

    T get_value()
    {
        if(!done)
        {
            tbb::spin_mutex::scoped_lock lock(mutex);
            tbb::task *temp_root = 0;
            if(!done)
            {
                if(started)
                {
                    temp_root = new (tbb::task::allocate_root()) tbb::empty_task;
                    tbb::empty_task *child = new (temp_root->allocate_child()) tbb::empty_task;
                    temp_root->set_ref_count(2);
                    my_future->add_task(child);
                    lock.release();
                }
                // It's never even started, but may have a different owner and never be able to run:
                // recreate a new task and wait on it.
                else
                {
                    my_future = new(tbb::task::allocate_root()) future_task(func, this);
                    lock.release();
                    tbb::task::spawn_root_and_wait(*my_future);
                }
            }
            if(temp_root)
            {
                // what happens if child task is spawned before this line is run?
                temp_root->wait_for_all();
                temp_root->destroy(*temp_root);
            }
        }
        return value;
    }

private:
    bool start()
    {
        tbb::spin_mutex::scoped_lock lock(mutex);
        bool ret = started;
        started = true;
        return ret;
    }

    void finish(const T &v)
    {
        tbb::spin_mutex::scoped_lock lock(mutex);
        value = v;
        done = true;
    }

    class future_task : public tbb::task
    {
    public:
        future_task(T (*f)(), tbb_future *fut) : func(f), my_future(fut) {}

        virtual tbb::task* execute()
        {
            // if there's another task that already did / is doing this computation, just return
            if(!my_future->start())
            {
                my_future->finish(func());
                for(size_t i = 0; i < dependent_tasks.size(); ++i)
                    spawn(*dependent_tasks[i]);
            }
            return 0;
        }

        void add_task(tbb::task *t) { dependent_tasks.push_back(t); }

        virtual ~future_task() {}

    private:
        T (*func)();
        tbb_future *my_future;
        tbb::concurrent_vector<tbb::task*> dependent_tasks;
    };

    tbb::atomic<bool> done;
    tbb::atomic<bool> started;
    T value;
    tbb::spin_mutex mutex;
    tbb::atomic<future_task*> my_future;
    T (*func)();

    friend class future_task;
};[/cpp]
How about direct invocation instead of the extra future_task in get_value()?
Hmmm, good suggestion, but I think this is necessary so that subsequent callers of get_value() can add a dependent_task to know when the computation is done.
So this future class is guaranteed to work except when it depends on another future class, where it could (due to task stealing) be stuck waiting for itself? This seems to be a very subtle potential for race conditions in a task-stealing scheduler that I would never have considered...
"Hmmm, good suggestion, but I think this is necessary so that subsequent callers of get_value() can add a dependent_task to know when the computation is done." Wouldn't that still work if you directly call execute()?
"stuck waiting for itself" I don't know... The example provided by Arch didn't seem to apply to TBB as it is now, or does it? Maybe it's not too difficult to formulate and follow some workable rules to make a future a useful Threading Building Block?
I really don't know whether it will work with TBB as it is now... that's why I was hoping to find out from the forum; but it's also important that it works with TBB as it will be...
Mike
"So it must be possible to implement futures with a work stealing scheduler." Elementary, my dear Henson! Surely everybody knows the Coffman conditions for deadlock... well, mutual exclusion corresponds to get_value() waiting until somebody else comes up with the answer, so if instead you just go ahead anyway (or wait only after verifying that some other code is actually working on the future), not caring that you might be doing redundant work or sitting idle, you will avoid inducing deadlock.
"What we really need is a way to create a task that is guarenteed not to steal..." Not really: TBB will only steal deeper tasks, so that takes care of one of the other Coffman conditions.
Well, I'm not 100% sure yet, but it seems plausible.
P.S.: Actually, Holmes never said that.