OpenCL* for CPU

Offline/online OpenCL compilation difference

Natalia_A_
Beginner
269 Views

Dear Community Members!

I have a program that performs some calculations in an OpenCL kernel running on the CPU. I also perform the same calculations without OpenCL and compare the two results to check for errors.

The problem is that I get errors from time to time when I use online compilation of the OpenCL kernel. I don't get errors when I use offline compilation.

Could anyone please help me understand why this happens?

You can find the files here:

https://github.com/NataAb/opencl_pr1

For offline compilation I use:

ioc64 -cmd=build -input=prog1.cl -device=cpu -ir=prog1.bc

Platform: Intel(R) OpenCL
Vendor: Intel(R) Corporation
Version: OpenCL 1.2 LINUX

CPU: Intel(R) Xeon(R) CPU E5-2609 v2 @ 2.50GHz

OS: Ubuntu 12.04.5 LTS

 

 

Robert_I_Intel
Employee

Hi Natalia,

I tried your program on Windows 8.1 with the latest OpenCL driver, and it fails consistently with both offline and online compilation:

Online compilation:

TEST  started
16  started
i is 0 j is 2 x_1 is -0.010763 ar1 1.60594e+032
16 Compare results failed at step = 5, errors = 1

TEST  started
16  started
i is 12 j is 7 x_1 is -0.0522771 ar1 -0.042029
16 Compare results failed at step = 956, errors = 1

TEST  started
16  started
i is 12 j is 7 x_1 is -0.0513037 ar1 -0.0410118
16 Compare results failed at step = 958, errors = 1


Offline compilation:

TEST  started
allowed workgroup size is 4096
16  started
i is 12 j is 7 x_1 is -0.0522771 ar1 -0.0421079
16 Compare results failed at step = 956, errors = 1

TEST  started
allowed workgroup size is 4096
16  started
i is 12 j is 7 x_1 is -0.0513037 ar1 -0.0410526
16 Compare results failed at step = 958, errors = 1

TEST  started
allowed workgroup size is 4096
16  started
i is 0 j is 2 x_1 is -0.000729849 ar1 1.60594e+032
16 Compare results failed at step = 0, errors = 1

 

So I guess the problem is with the code itself.

 

Natalia_A_
Beginner

Dear Robert,

Thank you for your quick answer!

As you can see, the test fails at some random step, which means that the kernel works correctly in some cases. Could incorrect use of

barrier( CLK_LOCAL_MEM_FENCE| CLK_GLOBAL_MEM_FENCE);

cause such strange behaviour?

 

Robert_I_Intel
Employee

Dear Natalia,

From a cursory look at your kernel, the barriers appear to be in the right places. Unfortunately, with code this complex, the only option is to add printf calls throughout and carefully compare the intermediate results with what you expect. The errors don't look all that random to me: they frequently occur at i = 12, j = 7 and at i = 0, j = 2. You also get two types of discrepancies. One looks like an overflow and always occurs at i = 0, j = 2; the other looks like accumulated error. Note too that the issues are at i = 0 and i = 12, which are boundaries, so I would carefully check the bounds of your local arrays and make sure you are initializing everything properly.

Natalia_A_
Beginner

Dear Robert, I think I have found the answer. The error occurred because I did not initialize part of the local memory with zeros.

 

Thank you for your help. 
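
For readers who hit the same problem: OpenCL does not define the initial contents of local memory, so any cell that no work-item writes before the barrier holds whatever was left there. The sketch below is a plain C emulation of one work-group, not the kernel from the repository. The names `stencil_tile` and `poison`, the sizes `TILE` and `HALO`, and the three-point sum are all hypothetical and chosen only to show how unzeroed padding cells corrupt the boundary results.

```c
#include <stddef.h>

#define TILE 4   /* hypothetical work-group size */
#define HALO 1   /* one padding cell on each side */

/* Emulates one work-group. `scratch` stands in for __local memory and
 * must hold TILE + 2 * HALO floats. */
void stencil_tile(const float *in, float *out, float *scratch, int zero_halo)
{
    if (zero_halo) {
        /* The fix: clear the cells that no work-item loads into. */
        scratch[0] = 0.0f;
        scratch[TILE + HALO] = 0.0f;
    }

    /* Each emulated work-item i loads one interior element. */
    for (size_t i = 0; i < TILE; ++i)
        scratch[i + HALO] = in[i];

    /* In a real kernel, barrier(CLK_LOCAL_MEM_FENCE) would go here so the
     * loads and the zeroing are visible before any work-item reads. */

    /* Three-point sum; items 0 and TILE-1 read the halo cells. */
    for (size_t i = 0; i < TILE; ++i)
        out[i] = scratch[i] + scratch[i + 1] + scratch[i + 2];
}

/* Fills the scratch buffer with a stale value to mimic leftover data. */
void poison(float *scratch, float stale)
{
    for (size_t i = 0; i < TILE + 2 * HALO; ++i)
        scratch[i] = stale;
}
```

Calling `poison` and then `stencil_tile` with `zero_halo` set to 0 leaves the interior outputs correct but lets the stale value leak into the first and last elements. Because the leftover contents differ from run to run on real hardware, this kind of bug would be consistent with failures that appear only from time to time and only at the boundaries.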
