OpenCL* for CPU

CPU as OpenCL device running in a separate process?

Harald_S_
Beginner

I run a program (process A) with two threads on my Intel Xeon CPU E5-1620 v2. Thread 1 starts an OpenCL application that uses the CPU as its device; thread 2 does some calculations.

I noticed that the performance of thread 2 suffers while the OpenCL application started by thread 1 is executing.

So I concluded that the OpenCL application run by thread 1 starts a new process on the CPU (process B), and that processes A and B are then scheduled by the operating system, which is why the performance of thread 2 suffers.

I could not find any documentation that confirms my conclusion.

Is my conclusion correct and, more importantly, is there any documentation about it?

Jeffrey_M_Intel1
Employee

The CPU implementation is automatically parallelized with Intel Threading Building Blocks (TBB). This is one of the advantages of using it: you get access to the sophisticated multi-threading capabilities of this rich library for free.

If you run the CapsBasic sample (platform/device capabilities viewer) you will see something like this for your OpenCL CPU implementation:

CL_DEVICE_TYPE_CPU[0]
    CL_DEVICE_NAME: Intel(R) Core(TM) i5-5300U CPU @ 2.30GHz
    CL_DEVICE_AVAILABLE: 1
 ...
    CL_DEVICE_MAX_COMPUTE_UNITS: 4

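For this processor, it means OpenCL will by default schedule work across all 4 compute units, which here are the 4 logical processors (2 cores with Hyper-Threading). Kernels running on the CPU device therefore compete with your other threads for the same cores.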

For the CPU implementation it is possible to use only a subset of the cores through "device fission"; see https://software.intel.com/en-us/articles/opencl-device-fission-for-cpu-performance

Of course, another way to get more control over which cores are used is to move the kernel code into TBB or OpenMP instead.
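
To give an idea of what moving the kernel onto host threads can look like, here is a minimal sketch that runs a kernel-like loop body on a fixed number of worker threads. It uses plain std::thread as a stand-in for TBB or OpenMP so that it has no extra dependencies. The names kernel_body and run_on_workers, and the trivial "multiply by two" body, are hypothetical placeholders.

```cpp
#include <algorithm>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical stand-in for an OpenCL kernel body: one call per work-item.
inline void kernel_body(std::size_t i, const float* in, float* out) {
    out[i] = in[i] * 2.0f;
}

// Runs kernel_body over [0, n) using at most `workers` threads, so the
// caller decides how many threads the computation occupies.
void run_on_workers(std::size_t n, unsigned workers, const float* in, float* out) {
    workers = std::max(1u, workers);
    const std::size_t chunk = (n + workers - 1) / workers;  // ceil(n / workers)

    std::vector<std::thread> pool;
    for (unsigned w = 0; w < workers; ++w) {
        const std::size_t begin = std::min(n, w * chunk);
        const std::size_t end = std::min(n, begin + chunk);
        if (begin == end) break;  // no work left for further threads
        pool.emplace_back([=] {
            for (std::size_t i = begin; i < end; ++i) kernel_body(i, in, out);
        });
    }
    for (std::thread& t : pool) t.join();
}
```

Calling run_on_workers(n, 2, in, out) on the 4-compute-unit machine above would keep the computation to two threads. With TBB or OpenMP the same idea is usually expressed through the library's own thread-count controls rather than a hand-written pool.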
