OpenCL* for CPU

How many cores does clEnqueueTask use?

evk8888
Beginner
226 Views
Hello guys,
I am using OpenCL 1.1 on a 24-core Intel Xeon processor.
When I enqueue a task with clEnqueueTask(), is it supposed to execute on a single core?
Thanks
5 Replies
Evgeny_F_Intel
Employee

Hi,

Yes, it will run in a single thread.

According to the OpenCL spec, a task has a global size of (1,1,1), which means a single work-item.

Thanks,
Evgeny

evk8888
Beginner
Hello,
Thanks for your reply. I have another question.
Suppose I use device fission to split a 4-core machine into 4 sub-devices with 1 core each, and I create a separate queue for each sub-device.
Is it possible to call clEnqueueReadBuffer() on a cl_mem buffer using the queue of sub-device 1, 2, 3 or 4, irrespective of which sub-device executed the kernel?
Or is it possible to have one global queue for data transfers to and from the device, and separate queues for the sub-devices to execute tasks?
Will this work?
Thanks a lot.
Evgeny_F_Intel
Employee

Theoretically it should.
Be aware that, as stated in the Release Notes, device fission is experimental and you might see inconsistent results.

We would be glad to hear your feedback about your experience with this feature.

Jim_Vaughn
Beginner
Hi,
If you create different queues for each sub-device, you will have to copy the memory to each "device", giving you separate memory on each device. Also, clEnqueueReadBuffer() takes the queue, program and context, so for all intents and purposes they are separate memory. (Right?)
To be fair, the spec for the cl_ext_device_fission extension is very light on information for what I see as a complex subject.
Evgeny_F_Intel
Employee

Thanks for the good question.

In the Intel implementation, sub-devices share the memory resources of the parent device. The exception is NUMA-aware systems, where the implementation may try to place memory objects on the appropriate NUMA node.

Sub-devices use separate execution units; on the CPU device, those are different hardware threads.

According to the spec, programs should be compiled separately for each sub-device; however, an implementation may use a single program for all sub-devices of a parent device.

Evgeny
