I'm using the Intel OpenCL driver on a Debian 7 64-bit, dual E5-2670 machine, programming with the PyOpenCL interface to OpenCL.
When I query available devices on the Intel platform, I see one device named 'Intel(R) Xeon(R) CPU E5-2670 0 @ 2.60GHz'. Using this single OpenCL device, with a large work group size, I see that only one physical CPU is being used.
When I start a second program using the same device, I see that they compete for CPU time instead of being distributed, one on each CPU.
Is this a known limitation of the Intel OpenCL driver, or is there a workaround to use both CPUs?
The machine in question is on a cluster and currently unavailable, but I was able to test the latest driver on a dual E5606 machine. As before, I see a single OpenCL device on the Intel platform, though without the "0" in the name, and computation appears to be distributed across both CPUs (2 x 4 cores, no HT; the process shows 800% CPU in top).
Is this the intended behavior of Intel's OpenCL driver, in the presence of multiple CPUs, to present them as a single device, and automatically manage distribution of work groups?
I'm not complaining, that makes the programming easier, I'd just like to be sure.
Thanks for the information.
Correct, all CPUs should appear as a single device and all of them should be utilized.
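One quick way to check this on a given machine is to list each device with its compute unit count and compare it with the number of logical CPUs the OS reports. This is only a minimal sketch, assuming PyOpenCL is installed; `describe_devices` is a hypothetical helper name, and the idea that the compute unit count should be close to the logical CPU count when both sockets are visible is an assumption rather than documented behavior.

```python
import os

try:
    import pyopencl as cl
except ImportError:  # PyOpenCL not installed; the helper below still works on its own
    cl = None


def describe_devices(platforms):
    """Return (platform name, device name, compute units) for every device."""
    rows = []
    for platform in platforms:
        for device in platform.get_devices():
            rows.append((platform.name, device.name, device.max_compute_units))
    return rows


if __name__ == "__main__":
    print("Logical CPUs reported by the OS:", os.cpu_count())
    if cl is None:
        print("PyOpenCL is not installed; cannot query OpenCL devices.")
    else:
        try:
            for plat_name, dev_name, units in describe_devices(cl.get_platforms()):
                print(f"{plat_name} | {dev_name} | compute units: {units}")
        except cl.Error as exc:
            # Raised, for example, when no OpenCL platform is registered.
            print("OpenCL query failed:", exc)
```

If the single Intel device reports roughly half the logical CPU count, that would suggest the runtime only sees one socket, which is worth mentioning in a report.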
For the machine where you see the problem, it would be great if you could provide a reproducer. Please also let us know whether you use numactl, what the work-group size is, and the total number of work items.
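A reproducer along the following lines would be enough. It is only a sketch under stated assumptions: the `busy` kernel, the sizes, and the helper names `launch_summary` and `run` are placeholders invented for illustration, not taken from the original program. It prints the work-group size, the total number of work items, and the number of CPUs the process is allowed to run on (which shrinks if the program is started under numactl or taskset), then launches a compute-bound kernel on the first CPU device it finds.

```python
import os

try:
    import numpy as np
    import pyopencl as cl
except ImportError:  # allow the summary helper to run without PyOpenCL/NumPy
    cl = None

# Placeholder compute-bound kernel; any long-running kernel would do.
KERNEL = """
__kernel void busy(__global float *out, const int iters) {
    int gid = get_global_id(0);
    float x = (float)gid;
    for (int i = 0; i < iters; ++i)
        x = x * 1.0001f + 0.5f;
    out[gid] = x;
}
"""


def launch_summary(global_size, local_size):
    """Collect the launch parameters requested in the reply above."""
    if global_size % local_size != 0:
        raise ValueError("global_size must be a multiple of local_size")
    if hasattr(os, "sched_getaffinity"):  # Linux: reflects numactl/taskset limits
        allowed = len(os.sched_getaffinity(0))
    else:
        allowed = os.cpu_count()
    return {
        "global_size": global_size,
        "local_size": local_size,
        "work_groups": global_size // local_size,
        "allowed_cpus": allowed,
    }


def run(global_size=1 << 20, local_size=64, iters=2000):
    """Launch the kernel on the first CPU device; watch `top` while it runs."""
    cpus = [d for p in cl.get_platforms() for d in p.get_devices()
            if d.type & cl.device_type.CPU]
    if not cpus:
        print("No OpenCL CPU device found.")
        return
    ctx = cl.Context(devices=[cpus[0]])
    queue = cl.CommandQueue(ctx)
    out = cl.Buffer(ctx, cl.mem_flags.WRITE_ONLY, size=4 * global_size)
    program = cl.Program(ctx, KERNEL).build()
    program.busy(queue, (global_size,), (local_size,), out, np.int32(iters))
    queue.finish()
    print("Kernel finished on:", cpus[0].name)


if __name__ == "__main__":
    print(launch_summary(1 << 20, 64))
    if cl is None:
        print("PyOpenCL/NumPy not installed; skipping the kernel launch.")
    else:
        try:
            run()
        except cl.Error as exc:
            print("OpenCL error:", exc)
```

Raising `iters` keeps the kernel running long enough to watch per-core load in top. Running the same script with and without numactl and comparing `allowed_cpus` would show whether process affinity is confining the work to one socket.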