Intel® Moderncode for Parallel Architectures
Support for developing parallel programming applications on Intel® Architecture.

Mixing OpenMP and Intel TBB

dgeld
Beginner
Hello,

I have tried to mix OpenMP and Intel TBB.

I first have a loop that I precede with

#pragma omp parallel for

After that loop, I have code using tbb::parallel_reduce.

When OpenMP is enabled, the tbb::parallel_reduce is slower.

I understand that there could be competition between OpenMP threads and TBB threads.

Currently my pseudo-code looks like this:
tbb::task_scheduler_init
#pragma omp parallel for
tbb::parallel_reduce

Should I initialize TBB later? How could I shut down OpenMP before initializing TBB?

Thanks in advance.
5 Replies
Alexey-Kukanov
Employee

Did I understand correctly that your tbb::parallel_reduce loop follows the omp parallel region (as opposed to being nested inside it)? It's not clear from the pseudo-code above.

If you are using the Intel Compiler and its OpenMP implementation, be aware that OpenMP worker threads spin for some time after the end of a parallel region before going to sleep. This spin time is controlled by the KMP_BLOCKTIME environment variable. Since you run a TBB parallel loop right after the region, I recommend setting KMP_BLOCKTIME to 0, which makes the OpenMP worker threads go to sleep immediately after the region. See the compiler documentation for more details about KMP_BLOCKTIME. Besides the environment variable, you can control the setting with the kmp_set_blocktime() and kmp_get_blocktime() calls; you should set it to 0 before entering the omp parallel region. Again, these functions and the environment variable are specific to the Intel Compiler.
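
As a rough illustration of the ordering described above, here is a minimal sketch, not the original poster's code. The function names fill_squares and sum_values and the DEMO_USE_TBB build switch are made up for this example. The kmp_set_blocktime() call is only compiled when an Intel compiler with OpenMP enabled is detected, since it is an Intel-specific extension. Without DEMO_USE_TBB the reduction falls back to a serial std::accumulate, so the snippet still builds where TBB is not installed.

```cpp
#include <cstddef>
#include <functional>
#include <numeric>
#include <vector>

#ifdef _OPENMP
#include <omp.h>
#endif

// Hypothetical build switch for this sketch: define DEMO_USE_TBB and link
// against TBB to take the tbb::parallel_reduce path below.
#ifdef DEMO_USE_TBB
#include <tbb/blocked_range.h>
#include <tbb/parallel_reduce.h>
#endif

// Phase 1: an OpenMP parallel loop that fills v[i] = i * i.
void fill_squares(std::vector<double>& v) {
#if defined(_OPENMP) && \
    (defined(__INTEL_COMPILER) || defined(__INTEL_LLVM_COMPILER))
    // Intel-specific extension: ask OpenMP workers to go to sleep right
    // after a parallel region instead of spinning. Called before the region.
    kmp_set_blocktime(0);
#endif
    const long n = static_cast<long>(v.size());
#pragma omp parallel for
    for (long i = 0; i < n; ++i) {
        v[static_cast<std::size_t>(i)] =
            static_cast<double>(i) * static_cast<double>(i);
    }
}

// Phase 2: a reduction that runs after (not inside) the OpenMP region.
double sum_values(const std::vector<double>& v) {
#ifdef DEMO_USE_TBB
    return tbb::parallel_reduce(
        tbb::blocked_range<std::size_t>(0, v.size()),
        0.0,  // identity value for the sum
        [&v](const tbb::blocked_range<std::size_t>& r, double acc) {
            for (std::size_t i = r.begin(); i != r.end(); ++i) {
                acc += v[i];
            }
            return acc;
        },
        std::plus<double>());
#else
    // Serial fallback so the sketch builds without TBB.
    return std::accumulate(v.begin(), v.end(), 0.0);
#endif
}
```

Calling fill_squares(v) and then sum_values(v) reproduces the "OpenMP region, then TBB reduction" sequence from the question. Setting KMP_BLOCKTIME=0 in the environment before launching the program is the alternative that needs no source change.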

dgeld
Beginner

MADakukanov:

Did I understand correctly that your tbb::parallel_reduce loop follows the omp parallel region (as opposed to being nested inside it)?



Yes, tbb::parallel_reduce follows the omp parallel region.

MADakukanov:

I recommend setting KMP_BLOCKTIME to 0, which makes the OpenMP worker threads go to sleep immediately after the region.
...
Again, these functions and the environment variable are Intel Compiler specific.



OK, this works fine.

But the conclusion is that TBB is more "portable" than OpenMP.

Thanks for your answer.
jimdempseyatthecove
Honored Contributor III

>> But the conclusion is that TBB is more "portable" than OpenMP.

No, the conclusion is: one must be careful when mixing multiple threading domains in a single application.

In your application, both TBB and OpenMP work under the guideline that they create and manage their own thread pools. An application could additionally contain pthreads and thus potentially have three thread domains to worry about (or ignore, as the case may be).

Jim Dempsey

dgeld
Beginner
Hello Jim,

I just wanted to say that not all OpenMP implementations give you fine-grained control over what is going on.

TBB gives me the impression (perhaps a false one) of better control over threads, synchronization, ...

It is in this sense that I found it more "portable" across compilers. With OpenMP, you have to adapt the way you use it to your compiler's implementation.

David Geldreich