I've been doing a lot of experiments with OpenCL in the last two months or so.
More specifically, I've been using the NOpenCL library (created by Tunnel Vision Labs) to run OpenCL tasks in C# applications on a low-end laptop (Intel i7-4510U CPU / Intel HD Graphics 4400 + AMD Radeon R7 M260).
As an application developer, I find that most of my work doesn't fit a SIMD model. However, the performance gains from using the 4400 GPU instead of the CPU (even with kernels that have several branching points) are so significant that the issue becomes irrelevant.
Unfortunately, all the OpenCL-related documentation I've read so far is quite vague about how its concepts map onto specific hardware, and I couldn't find a single piece of information about how Intel chose to implement OpenCL in its GPU line(s). As a result, I'm essentially working blind when I set up the command queues. Are work-groups in some way related to the 4400's 20 pipelines? How do compute units fit into the picture? By setting a local work size of 1, am I in some way forcing the use of a single thread inside a compute unit?
I'd say that for SIMD problems these questions probably don't matter, as long as one follows some general rule about how to divide up the task size. In any other case, it would be important to know the penalties involved and the best strategies to minimize them. And to do that, one needs to understand some OpenCL implementation details for specific chips or architectures.
So, if someone could share one or more links to relevant documentation on these issues, I'd be very grateful.
Heads up... lots of this post is opinion... ask different folks and observe different takes...
are where developers should start with OpenCL™ development anywhere, and in particular for Intel® Graphics Technology. Keep in mind OpenCL™ is really more SPMD as opposed to SIMD. Intel® Graphics Technology does have SIMD facilities that can be used to support OpenCL™, but OpenCL™ provisionings themselves are SPMD.
I hope this helps get you started, and thank you for your interest. Good luck. Also... not that it's a hard requirement... please consider our OpenCL™ technology forum for future OpenCL™ specific topics: https://software.intel.com/en-us/forums/opencl. We want developer concerns to get the right eyes on them. OpenCL™ implementations do enable our computer vision developer tools, so there is a lot of crossover.