Hi!
I am facing an issue when trying to "detach" a TBB task from a tbb::parallel_for_each.
Here is a simplified example showing this behavior: it runs a tbb::parallel_for_each in which each object A spawns a task that sleeps for 10 seconds.
class A {
public:
    ~A() {
        _task.wait();
    }
    void Run() {
        _task.run(
            []() { std::this_thread::sleep_for(std::chrono::seconds(10)); });
    }
private:
    tbb::task_group _task;
};
std::vector<A> a(5);
std::cout << "Starting parallel_for_each at " << GetDate() << std::endl;
tbb::parallel_for_each(a.begin(), a.end(), [](A& a_) {
    a_.Run();
});
std::cout << "parallel_for_each done at " << GetDate() << std::endl;
The output is like this (second cout is printed 10 seconds after the first one):
Starting parallel_for_each at Thu Jun 24 16:50:53 2021
parallel_for_each done at Thu Jun 24 16:51:03 2021
The tbb::parallel_for_each seems to be waiting for the TBB tasks to finish before giving back control to the main thread.
Is there a way to "detach" the TBB task from the parallel_for_each so that the program can keep running while the tasks from class A keep executing in parallel?
Thanks.
Edit: I am using TBB 2019 U3 on CentOS 6.6.
Hi,
Thanks for reaching out to us.
We are working on your issue and we will get back to you soon!
Thanks & Regards,
Santosh
Hello,
Unfortunately, no. Please see:
https://docs.oneapi.com/versions/latest/onetbb/tbb_userguide/Cook_Until_Done_parallel_do.html
The instance of parallel_for_each does not terminate until all items have been processed.
Hi.
That is what I do not understand.
The sleep is launched through a separate task_group (line 8).
Isn't that supposed to be non-blocking, and thus let the parallel_for_each continue without waiting?
Or is there another way to achieve that behavior?
In the real code, the intent is to have some general processing happen in a parallel_for (that will be run several times throughout the program execution), with non-blocking tasks being triggered when some conditions are met.
Thanks for the feedback.
Hello,
There is a barrier at the end of parallel_for_each. Are you trying to avoid it?
Also, have you looked at TBB Flow Graph, e.g,
However, even with TBB Flow Graph, it is recommended to call wait_for_all().
Please also look at async Flow Graph nodes
https://link.springer.com/chapter/10.1007/978-1-4842-4398-5_18
Yes, exactly, I am trying to avoid the barrier at the end of the parallel_for since the task that is spawned is independent from the rest of the processing done in the parallel_for.
I found out that, for some reason, spawning the task from a task_arena gives the behavior that I expect.
The parallel_for gives control back to the main thread immediately (I see both prints happen without a 10-second gap in between), and then the program waits for the task_group to complete.
The documentation did not help me much in understanding why the task_arena works in this case.
class A {
public:
    ~A() {
        _task.wait();
    }
    void Run() {
        // Submit the task from inside the explicit arena instead of the caller's arena.
        _arena.execute([&]() {
            _task.run(
                []() { std::this_thread::sleep_for(std::chrono::seconds(10)); });
        });
    }
private:
    static tbb::task_arena _arena;
    tbb::task_group _task;
};
// Out-of-class definition required for the static data member.
tbb::task_arena A::_arena;
std::vector<A> a(5);
std::cout << "Starting parallel_for_each at " << GetDate() << std::endl;
tbb::parallel_for_each(a.begin(), a.end(), [](A& a_) {
    a_.Run();
});
std::cout << "parallel_for_each done at " << GetDate() << std::endl;
Hello,
I will start by answering the original question.
1) Why does parallel_for_each block until all tasks are completed, even though the tasks run under a task_group?
This happens because when a user thread calls a blocking API that waits until work is completed (e.g. parallel_for_each, task_group::wait, etc.), there is no guarantee that the thread will execute only tasks related to the current context (unless you use isolation). In your example, the user thread that called parallel_for_each takes tasks from the task_group, and that is why you observe this behavior.
2) Why does task_arena help in this case?
When the user thread calls parallel_for_each, the work is submitted to the implicit arena of the user thread. When you run the task_group tasks inside task_arena::execute, the work is submitted to a different, explicit arena.
So, in your example there are two arenas: the implicit arena of the user thread and the explicit task_arena.
The implicit arena of the user thread contains the tasks of parallel_for_each.
The explicit task_arena contains the tasks of the task_group.
In oneTBB, a user thread can't execute tasks from a different arena, which is why the last code sample works as you expected.
Thanks everyone for the support!
Here is another way to solve this problem.
Please check the documentation for tbb::this_task_arena::enqueue.
#include <thread>
#include <vector>
#include <iostream>
#include <chrono>
#include <atomic>
#define TBB_PREVIEW_TASK_GROUP_EXTENSIONS 1
#include <oneapi/tbb/task_group.h>
#include <oneapi/tbb/task_arena.h>
#include <oneapi/tbb/parallel_for_each.h>
class A {
public:
    ~A() {
        _task.wait();
    }
    void Run() {
        _task.run(
            []() { std::this_thread::sleep_for(std::chrono::seconds(10)); });
    }
private:
    tbb::task_group _task;
};
int main() {
    std::vector<A> a(5);
    std::atomic<bool> all_task_submitted{false};
    auto t1 = std::chrono::high_resolution_clock::now();
    tbb::this_task_arena::enqueue([&] {
        tbb::parallel_for_each(a.begin(), a.end(), [](A& a_) {
            a_.Run();
        });
        all_task_submitted = true;
    });
    auto t2 = std::chrono::high_resolution_clock::now();
    std::cout << "Spawn tasks duration " << std::chrono::duration_cast<std::chrono::seconds>(t2 - t1).count() << " sec" << std::endl;
    // Wait until all tasks are submitted
    while (!all_task_submitted) { std::this_thread::yield(); }
    // A::~A() will wait for all tasks completion
}
