Intel® oneAPI Threading Building Blocks

TBB and OpenCL

sinedie
Beginner
Hello,
I was going through OpenCL, and it seems that OpenCL, at least as yet, does not provide locks. It supports only atomic operations of a preliminary kind. So I want to know whether TBB and OpenCL can be used together. Has anyone tried it, or does anyone know whether it can be done at all?
TIA,
-S
1 Solution
Andrey_Marochko
New Contributor III
If you are asking whether it is safe to use TBB synchronization primitives (mutexes, condition variables, etc.), then the answer is yes (provided your locking strategy is safe by itself :) ).
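To make the locking point concrete, here is a minimal sketch of host-side threads updating shared state under a lock. It assumes standard C++ only: std::mutex stands in for a TBB mutex so that the sketch builds without TBB, and the helper name locked_sum is hypothetical. Any mutex type with lock()/unlock() should fit the template parameter, but that substitution is not tested here.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical helper: `num_threads` host threads each add 1 to a shared
// counter `increments` times, with every update guarded by a lock.
// Assumption: std::mutex is a stand-in for a TBB mutex; the template
// parameter accepts any type that provides lock() and unlock().
template <typename Mutex = std::mutex>
long locked_sum(unsigned num_threads, unsigned increments) {
    assert(num_threads > 0);
    Mutex guard;
    long counter = 0;
    std::vector<std::thread> threads;
    threads.reserve(num_threads);
    for (unsigned t = 0; t < num_threads; ++t) {
        threads.emplace_back([&guard, &counter, increments] {
            for (unsigned i = 0; i < increments; ++i) {
                // The lock is released when `lock` goes out of scope.
                std::lock_guard<Mutex> lock(guard);
                ++counter;
            }
        });
    }
    for (auto& th : threads) {
        th.join();
    }
    return counter;
}
```

Calling locked_sum(8, 1000) runs eight threads that each perform 1000 guarded increments. The threads here are plain std::thread objects; in the scenario discussed, they would be whatever host threads the application or the OpenCL implementation creates.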

If you mean using their scheduling capabilities together, then the answer depends on which OpenCL implementation you use.

It is normally safe to use different threading frameworks side-by-side (for example from different threads, or when their fork-join constructs are invoked one after another).

However, when they are used in a nested fashion, the situation gets trickier. First of all, it depends on how the frameworks manage their internal thread pools. E.g. TBB and Cilk use a fixed-size thread pool. Thus if, on say an 8-core machine, you initialize the TBB scheduler with 7 worker threads and then use a TBB parallel algorithm from 8 different threads (e.g. created by the OpenCL implementation), you'll never get more than 7 TBB worker threads running, so the oversubscription stays mild. But with OpenMP (depending on the settings) each of the 8 threads may create its own team of 7 workers, and you may end up with 8 × 7 = 56 OpenMP worker threads, which is obviously way too high an oversubscription level :( .
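The thread counts in the paragraph above can be written out as a small worked example. This is only a sketch of the arithmetic: the function names are hypothetical, and the model ignores any extra threads the OpenCL runtime itself may create.

```cpp
// Worst-case thread counts for the 8-core scenario described above.
// All names are hypothetical and exist only for this illustration.

// Shared fixed-size pool (the TBB-like case): all callers share one pool,
// so the total is the callers plus the pool's workers.
constexpr int shared_pool_threads(int callers, int pool_workers) {
    return callers + pool_workers;
}

// One team per caller (the OpenMP-like case, depending on settings):
// every caller spawns its own set of workers.
constexpr int per_caller_team_workers(int callers, int workers_per_team) {
    return callers * workers_per_team;
}

// 8 calling threads sharing 7 workers: 15 threads on 8 cores.
static_assert(shared_pool_threads(8, 7) == 15, "8 callers + 7 shared workers");
// 8 calling threads with 7 workers each: 56 worker threads.
static_assert(per_caller_team_workers(8, 7) == 56, "8 teams of 7 workers");
```

With a shared pool, the worker count stays constant as more calling threads appear, whereas with per-caller teams it grows in proportion to the number of callers.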

Another potential problem is connected with how a parallel runtime manipulates call stacks. I know of just a couple of examples where a parallel framework can migrate parts of the call stack between threads: Cilk and probably Capriccio. Such stack switches may come as a big surprise to another threading library used in a nested manner, and result in a crash. But since such techniques require special support from the compiler (both Cilk and Capriccio use compiler extensions), OpenCL should be safe in this respect.

2 Replies
sinedie
Beginner
Thanks, Andrey, for the quick reply. :)

From the little I have seen of OpenCL and from my work with TBB, the former again looks like a step backward. But now that it supports GPGPU, I must be able to use OpenCL. I really can't understand why they would believe everything should be doable with a lock-free approach. And it is annoying that even their CPU implementation (meant for task parallelization) does not seem to support locks. That's where I may need to use TBB.

I really must start praying to the Almighty that you guys somehow come up with a version of, or a wrapper for, TBB that can use GPGPUs too. You may say it is impossible, but that's what He is for. ;)

-S