Hello,
I'm finding it strange that TBB 3.0 is only using three worker threads on my quad-core machine.
task_scheduler_init::default_num_threads() is returning 4, which I take to mean one main thread plus three worker threads. My main thread, however, only sets things up and then goes into a wait state while the work is done on the remaining threads. There is no way I can change this, as my application is a C# GUI with its own event & delegate system and I cannot apply the GUI Thread design pattern (at least not easily, I think).
I tried calling task_scheduler_init(5) and indeed four worker threads are created but the fourth one is never used, being permanently stuck in thread_monitor::commit_wait.
If my memory serves me well I did not have the same problem with TBB 2.2 and the same application I have now, with all my four cores nicely maxed out.
Can anyone enlighten me as to what might be happening here?
Thank you,
manuel
PS: I forgot to mention that I am also using a couple of std::threads for tasks that mostly sit waiting except for occasional flurries of activity. I wonder if the task scheduler is getting confused by this combination of worker threads and std::threads.
10 Replies
What OS are you on?
Using task_scheduler_init(5) should have worked to keep all four cores busy. I'll see if I can reproduce the problem. TBB 3.0 should have the same behavior as TBB 2.2 on this point, though the underlying implementation changed radically, so it's possible a change was introduced accidentally.
I'm on Windows Vista 64bit, although I'm compiling a 32bit application.
Did you build "irml", i.e. "irml.dll" or "irml_debug.dll"? There's a known bug: if either of those files is in your path, you will never get more than P-1 threads. The reason is that those files are "Resource Managers" that dutifully prevent oversubscription, even if you ask for it (that's the known bug). If those files are not in your path, asking for P+1 threads should work.
It worked for me on XP and Vista. Below is the program that I used to check. It will hang if the TBB scheduler does not deliver at least n worker threads.
[cpp]
#include "tbb/tbb.h"
#include <cstdio>
using namespace tbb;

atomic<int> barrier;

struct WaitFunctor {
    void operator()( blocked_range<int> r ) const {
        --barrier;
        // Wait until all threads reach the barrier.
        // Using a barrier like this in TBB is very bad style, because code should not
        // depend upon the TBB scheduler delivering a specific number of threads.
        while( barrier!=0 )
            continue;
    }
};

int main() {
    int n = task_scheduler_init::default_num_threads();
    std::printf("n=%d\n",n);
    task_scheduler_init( n+1 );
    barrier = n;
    parallel_for( blocked_range<int>(0,n), WaitFunctor(), simple_partitioner() );
    std::printf("done\n");
    return 0;
}
[/cpp]
No, I'm not compiling irml.
Thank you for your effort. It must be something that I am doing wrong.
I'll keep looking into it and I'll come back to the forum if I find something that is worth reporting.
Cheers,
manuel
Hello again,
Could you please try the following code:
// (header names were lost in the forum post; these are the ones the code needs)
#include <cstdio>
#include <mutex>
#include <condition_variable>
#include <tbb/task.h>
#include <tbb/task_scheduler_init.h>
using namespace std;
using namespace tbb;

int my_barrier;
mutex my_mutex;
condition_variable my_cond;

struct CountTask : public task
{
    virtual task* execute()
    {
        {
            lock_guard<mutex> lock(my_mutex);
            printf("Got into thread %d\n", my_barrier--);
            my_cond.notify_one();
        }
        // Causes all available threads to fill up by spinning
        while (true);
        return 0;
    }
};

int main() {
    int n = task_scheduler_init::default_num_threads();
    printf("n=%d\n",n);
    task_scheduler_init( n+1 );
    my_barrier = n;
    for (int i = 0; i < n; ++i)
        task::enqueue(*new (task::allocate_root()) CountTask);
    {
        // Wait until every CountTask has checked in
        unique_lock<mutex> lock(my_mutex);
        while (my_barrier > 0)
            my_cond.wait(lock);
    }
    printf("done\n");
    return 0;
}
I tried this on a quad-core Intel i7 with 8 logical cores due to HT. default_num_threads() correctly returns 8. I then try to create 9 threads on the scheduler. The code above should work, with the CountTask tasks running on the eight worker threads and the main() routine running on the main thread. The output, however, is:
n=8
Got into thread 8
Got into thread 7
Got into thread 6
Got into thread 5
Got into thread 4
Got into thread 3
Got into thread 2
So TBB is only allocating 8 threads in total (I also confirmed this with a debugger) and is then left hanging because the eighth CountTask has no thread left to run on.
Cheers,
manuel
This line:
task_scheduler_init( n+1 );
causes creation of a temporary instance of class task_scheduler_init, followed by its immediate destruction.
What you need is:
task_scheduler_init tbbinit( n+1 );
Thanks for the example. I can replicate your result and will look into what went wrong.
I posted my previous reply before seeing Alexey's remark. His observation is correct.
I occasionally make the mistake myself with RAII objects.
You are right about the tbbinit( n+1 ) statement, as Arch also confirmed.
I think I finally nailed down the problem that I'm having. The following code example should definitely show this:
// (header names were lost in the forum post; these are the ones the code needs)
#include <cstdio>
#include <mutex>
#include <condition_variable>
#include <thread>
#include <tbb/task.h>
#include <tbb/task_scheduler_init.h>
using namespace std;
using namespace tbb;

int my_barrier;
mutex my_mutex;
condition_variable my_cond;

struct CountTask : public task
{
    virtual task* execute()
    {
        {
            lock_guard<mutex> lock(my_mutex);
            printf("Got into thread %d\n", my_barrier--);
            my_cond.notify_one();
        }
        // Causes all available threads to fill up by spinning
        while (true);
        return 0;
    }
};

static void MyFunc(int n)
{
    // Everything works if I uncomment the next line
    // task_scheduler_init tbbinit( n+1 );
    my_barrier = n;
    for (int i = 0; i < n; ++i)
        task::enqueue(*new (task::allocate_root()) CountTask);
    {
        // Wait until every CountTask has checked in
        unique_lock<mutex> lock(my_mutex);
        while (my_barrier > 0)
            my_cond.wait(lock);
    }
}

int main() {
    int n = task_scheduler_init::default_num_threads();
    printf("n=%d\n",n);
    task_scheduler_init tbbinit( n+1 );
    std::thread thread(MyFunc, n);
    thread.join();
    printf("done\n");
    return 0;
}
I understand now that every std::thread will have its own task scheduler. I was requesting n+1 threads on the main thread but not on the background std::thread. This was not clear to me before (the Reference manual does not mention this in Chapter 13 on Threads).
Cheers,
manuel
That explains why you did not see the problem with TBB 2.2. With TBB 2.2, the first thread to initialize the task scheduler determined the number of worker threads for all user-created threads. With TBB 3.0, each user thread can specify the value separately.
I've made a note to myself to clarify this point in the Reference.