Hello,
I was going through OpenCL, and it seems that OpenCL, at least as yet, does not provide locks; it supports only a preliminary set of atomic operations. So I want to know whether TBB and OpenCL can be used together. Has anyone tried it, or does anyone know if it can be done at all?
TIA,
-S
1 Solution
If you are asking whether it is safe to use TBB synchronization primitives (mutexes, condition variables, etc.), then the answer is yes (provided your locking strategy is safe by itself :) ).
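For illustration, here is a minimal sketch (not code from this thread) of what that can look like on the host side: a tbb::spin_mutex guarding a result table that several host threads update, for example after waiting on OpenCL events. The record_result() helper and the worker bodies are hypothetical, and no actual OpenCL calls are shown.

```cpp
// Minimal sketch: TBB's spin_mutex protecting shared host-side state that is
// updated from several threads (which could just as well be threads created
// by an OpenCL runtime or threads waiting on OpenCL events).
#include <map>
#include <thread>
#include <vector>
#include <tbb/spin_mutex.h>

static std::map<int, float> g_results;       // shared host-side result table
static tbb::spin_mutex      g_results_lock;  // TBB lock, usable from any thread

// Hypothetical helper: record one kernel's result under the lock.
void record_result(int kernel_id, float value) {
    // scoped_lock releases the mutex automatically at end of scope
    tbb::spin_mutex::scoped_lock lock(g_results_lock);
    g_results[kernel_id] = value;
}

int main() {
    std::vector<std::thread> workers;
    for (int id = 0; id < 4; ++id) {
        // In a real program each worker would enqueue an OpenCL kernel, wait
        // for its event, read the buffer back, and then record the value.
        workers.emplace_back([id] { record_result(id, id * 1.5f); });
    }
    for (auto& t : workers) t.join();
    return 0;
}
```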
If you mean using their scheduling capabilities together, then the answer depends on which OpenCL implementation you use.
It is normally safe to use different threading frameworks side by side (for example, from different threads, or when their fork-join constructs are invoked one after another).
However, when they are used in a nested fashion, the situation gets trickier. First of all, it depends on how each framework manages its internal thread pool. TBB and Cilk, for example, use a fixed-size thread pool. So if, say, on an 8-core machine you initialize the TBB scheduler with 7 worker threads and then use a TBB parallel algorithm from 8 different threads (e.g., threads created by the OpenCL implementation), you will never get more than 7 TBB worker threads running, so the oversubscription is not bad. But with OpenMP (depending on the settings) you may end up with 56 OpenMP worker threads, which is obviously far too high an oversubscription level :( .
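As a rough sketch of that scenario (the thread counts are this example's assumptions, and it presumes a TBB version with automatic per-thread scheduler initialization, i.e. TBB 2.2 or later): cap the pool with tbb::task_scheduler_init, then invoke parallel_for from 8 external threads; TBB will not grow its pool beyond the requested size.

```cpp
// Sketch of the nested scenario above: one scheduler cap, many external
// threads (stand-ins for threads an OpenCL runtime might create) each
// calling a TBB parallel algorithm.
#include <thread>
#include <vector>
#include <tbb/task_scheduler_init.h>
#include <tbb/parallel_for.h>
#include <tbb/blocked_range.h>

int main() {
    // Cap TBB's thread pool; the argument counts the calling (master) thread too.
    tbb::task_scheduler_init init(7);

    std::vector<std::thread> external;
    for (int t = 0; t < 8; ++t) {
        external.emplace_back([] {
            // Each external thread runs a TBB parallel loop; the shared TBB
            // pool stays at the size requested above, so oversubscription
            // stays moderate.
            tbb::parallel_for(tbb::blocked_range<int>(0, 1000000),
                              [](const tbb::blocked_range<int>& r) {
                                  for (int i = r.begin(); i != r.end(); ++i) {
                                      /* per-element work */
                                  }
                              });
        });
    }
    for (auto& t : external) t.join();
    return 0;
}
```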
Another potential problem is connected with how a parallel runtime manipulates call stacks. I know of only a couple of examples where a parallel framework can migrate parts of a call stack between threads: Cilk and probably Capriccio. Such stack switches may come as a big surprise to another threading library being used in a nested manner and can result in a crash. But since such techniques require special compiler support (both Cilk and Capriccio use compiler extensions), OpenCL should be safe in this respect.
2 Replies
Thanks, Andrey, for the quick reply. :)
From the little I have seen of OpenCL, and from my work with TBB, the former again looks like a step backward. But now that it supports GPGPU, I need to be able to use OpenCL. I really can't understand why they believe everything should be doable with a lock-free approach. And it is annoying that even their CPU implementation (meant for task parallelization) does not seem to support locks. That's where I may need to use TBB.
I really must start praying to the Almighty that you guys somehow come up with a version or a wrapper of TBB that can use GPGPUs too. You may say it is impossible, but that's what He is for. ;)
-S