OpenCL* for CPU
Ask questions and share information on Intel® SDK for OpenCL™ Applications and OpenCL™ implementations for Intel® CPU.

How to use OpenCL 2.0 atomics

cllh80
Beginner

Hi all,

  I am compiling OpenCL programs for an Intel CPU and its integrated GPU. The processor is an i9-9700K, the driver version is 26.20.xx.xx, and system_studio_2020 is installed.

  The kernels use OpenCL 2.0 atomics such as atomic_compare_exchange_strong_explicit, and the "-cl-std=CL2.0" flag is passed to clBuildProgram.

  However, clBuildProgram returns -11 (CL_BUILD_PROGRAM_FAILURE), and the build log reports "implicit declaration of function 'atomic_compare_exchange_strong_explicit' is invalid in OpenCL".

  How can I use OpenCL 2.0 atomics? Or are they not supported by Intel?

  Thanks a lot.

2 Replies
Ben_A_Intel
Employee

Hello, and apologies for the slow reply.

Both the CPU and GPU OpenCL devices support OpenCL 2.0 atomics.

A few things to try:

  • Can you check your program build log to see if it provides any additional information?
  • Can you try compiling for the CPU and GPU devices separately?  Are both devices failing, or just one?
  • Can you try a simpler kernel?  Here is a simple example that successfully compiles through Clang: https://godbolt.org/z/dnbTco5ca
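
For readers without access to the link, here is a minimal sketch of what such a "simpler kernel" test could look like on the host side. It is not the example from the link above. The names `kMinimalAtomicKernel`, `kBuildOptions`, `claim_slot`, and `requests_cl20` are hypothetical, and the kernel is only an illustration of the six-argument form of atomic_compare_exchange_strong_explicit. The C11-style atomic functions are declared only when the source is compiled as OpenCL C 2.0 or later, so the helper checks that the options string handed to clBuildProgram really contains the flag.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical minimal kernel: each work-item tries to claim its own slot,
   writing gid + 1 only if the slot still holds 0. */
static const char *const kMinimalAtomicKernel =
    "__kernel void claim_slot(__global atomic_int *slots)\n"
    "{\n"
    "    size_t gid = get_global_id(0);\n"
    "    int expected = 0;\n"
    "    atomic_compare_exchange_strong_explicit(\n"
    "        &slots[gid], &expected, (int)gid + 1,\n"
    "        memory_order_relaxed, memory_order_relaxed,\n"
    "        memory_scope_device);\n"
    "}\n";

/* Build options string intended for the `options` argument of clBuildProgram. */
static const char *const kBuildOptions = "-cl-std=CL2.0";

/* Hypothetical sanity check: returns nonzero when the options string
   requests OpenCL C 2.0. A NULL options string requests nothing. */
static int requests_cl20(const char *options)
{
    return options != NULL && strstr(options, "-cl-std=CL2.0") != NULL;
}
```

If `requests_cl20` returns zero for the string that actually reaches clBuildProgram, that would be one possible explanation for an "implicit declaration" error; the build log remains the authoritative source.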

Thanks!

cllh80
Beginner

Thank you for your reply.

Now we can use OpenCL 2.0 atomics on both the CPU and the GPU. However, a new problem has occurred.

The situation is as follows:

We run a kernel on the GPU that operates on an integer array (every element is initially zero) and uses OpenCL 2.0 atomics (atomic_compare_exchange_strong_explicit) to set element values. At the end of the kernel, we count the keys whose values are nonzero and get the correct result. But when control returns to the host and the count is computed again there, the result is wrong. The data structures are allocated with clSVMAlloc using the flags "CL_MEM_READ_WRITE | CL_MEM_SVM_FINE_GRAIN_BUFFER | CL_MEM_SVM_ATOMICS".

So if an array is set by the GPU, could its values be seen correctly by the CPU?

Thank you!
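
For reference, the set-then-count logic described in this post can be sketched in plain C11, whose <stdatomic.h> functions share their names with the OpenCL 2.0 atomics. This is only a host-side analogue under assumptions, not the kernel from this thread: `try_set_key` and `count_nonzero` are hypothetical helper names, and the memory orders are illustrative choices.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical helper: store `key` in slots[idx] only if the slot still
   holds 0. Returns 1 if this call stored the key, 0 if the slot was taken. */
static int try_set_key(atomic_int *slots, size_t idx, int key)
{
    int expected = 0; /* a slot counts as empty while it holds zero */
    return atomic_compare_exchange_strong_explicit(
        &slots[idx], &expected, key,
        memory_order_acq_rel, memory_order_acquire);
}

/* Hypothetical helper: count the slots whose value is not zero. */
static size_t count_nonzero(atomic_int *slots, size_t n)
{
    assert(slots != NULL);
    size_t count = 0;
    for (size_t i = 0; i < n; ++i) {
        if (atomic_load_explicit(&slots[i], memory_order_acquire) != 0)
            ++count;
    }
    return count;
}
```

In an OpenCL kernel, the six-argument form of atomic_compare_exchange_strong_explicit additionally takes a memory_scope; memory_scope_all_svm_devices is the scope the OpenCL 2.0 specification defines for atomics on CL_MEM_SVM_ATOMICS allocations that are shared with the host. A host-side count like `count_nonzero` would also assume the kernel has completed (for example after clFinish) before it reads the array.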

 
