OpenCL* for CPU

OpenCL compiler problems

EvgeniyPeshkov
Hello, everybody!
I have very strange and serious problems with the OpenCL SDK 2012 compiler (Windows 7, 32-bit platform).
Early in development everything was fine. But once I added a new kernel to my OpenCL program source, I got a lot of odd errors. For example, some very simple kernels return completely wrong results, or a fatal error crashes the program, even if I do not use the new kernel and do not even create it with clCreateKernel.
When I switch off autovectorization by using vec_type_hint in one or more kernels (not necessarily the new one) in the new source code, everything is fine again. I can also remove some old kernels from the source, and then the new kernel and all the remaining ones work correctly. The complete OpenCL program also works fine in debug mode (-g compiler option). By the way, everything is OK on AMD and NVIDIA platforms on both Windows and Linux.
It looks like there is some kind of limit on the number of vectorized kernels in an OpenCL program, but I don't believe that. Maybe when there are too many such kernels the compiler behaves unexpectedly, or something similar.
Has anyone faced such a problem, or does anyone have an idea how to solve it?
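For readers unfamiliar with the workaround mentioned above, here is a minimal sketch of how a vec_type_hint attribute is attached to a kernel. The kernel source is held in a C string, the way host code usually passes it to clCreateProgramWithSource. The kernel name `scale`, the hint type `float4`, and the helper `has_vec_type_hint` are assumptions for illustration; the post does not say which kernels or which hint type were used, and whether the hint disables autovectorization depends on the implementation.

```c
#include <string.h>

/* Hypothetical OpenCL C kernel source, stored as a host-side string.
 * The vec_type_hint attribute follows the placement used in the OpenCL
 * specification; float4 is an assumed type, not taken from the post. */
static const char *const scale_kernel_source =
    "__kernel __attribute__((vec_type_hint(float4)))\n"
    "void scale(__global float *data, float factor)\n"
    "{\n"
    "    size_t i = get_global_id(0);\n"
    "    data[i] *= factor;\n"
    "}\n";

/* Hypothetical helper: returns 1 if the given kernel source text
 * mentions a vec_type_hint attribute, 0 otherwise. Handy when
 * bisecting which kernels were built with the hint. */
int has_vec_type_hint(const char *source)
{
    return strstr(source, "vec_type_hint") != NULL;
}
```

Adding or removing the attribute on one kernel at a time is one way to narrow down which kernel triggers the failure.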
7 Replies
Raghupathi_M_Intel
Let me try to understand what you are saying. You were able to build and run kernels fine until you added a new kernel and everything broke? Can you give us a small test case so that we can try to reproduce the behavior on our end?

Thanks,
Raghu
EvgeniyPeshkov
Thank you for your attention.
I've written a sample that demonstrates the problem. There is one simple kernel, evenBytes, that just takes the even bytes from a memory buffer and puts them into another one, plus a number of unused dummy kernels, kernelX (they are completely identical but have different names). The kernelX kernels are successfully vectorized during the build.
If you run this sample, it produces wrong results. BUT if you remove or comment out one of the kernelX kernels, everything is fine. You can also just add the vec_type_hint attribute (it prevents autovectorization) to one of the kernelX kernels, and the result will be correct too.
As additional information that may help: I use a previous-generation Core i5 750 processor (SSE 4.1 instruction set).
The sample host and device code, with the build, are in the attachment.
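Since the attachment is not reproduced in this thread, the following is only a sketch of how the "wrong results" could be detected on the host: a plain C reference that copies the even-offset bytes, as the post describes evenBytes doing, plus a comparison against the buffer read back from the device. The function names and signatures are hypothetical and do not come from the attached sample.

```c
#include <assert.h>
#include <stddef.h>

/* Host-side reference for the behaviour described for evenBytes:
 * dst[i] receives the byte at even offset 2*i of src.
 * src must hold at least 2*dst_len bytes. Hypothetical signature. */
void even_bytes_reference(const unsigned char *src,
                          unsigned char *dst,
                          size_t dst_len)
{
    assert(src != NULL && dst != NULL);
    for (size_t i = 0; i < dst_len; ++i)
        dst[i] = src[2 * i];
}

/* Returns the index of the first byte where the two buffers differ,
 * or len if they are identical. */
size_t first_mismatch(const unsigned char *expected,
                      const unsigned char *actual,
                      size_t len)
{
    for (size_t i = 0; i < len; ++i)
        if (expected[i] != actual[i])
            return i;
    return len;
}
```

Comparing the device output with `even_bytes_reference` after each change (removing a kernelX, adding the hint) gives a yes/no answer for each build variant, and the mismatch index shows where the output first goes wrong.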
Raghupathi_M_Intel
Thanks for the test app. I was able to reproduce the issue you were seeing. I'd have to debug this and will get back to you with my findings.

Thanks,
Raghu
EvgeniyPeshkov
Hello Raghu!
I've discovered some more information about this bug. Maybe it can help fix the problem. In this sample, if I switch the work-group size from 960 to 480 (the global work size is evenly divisible by both), everything works correctly with all kernels and with autovectorization enabled. So maybe the problem is not on the compiler side but in the OpenCL runtime, or maybe in both.
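The divisibility remark matters because, in OpenCL 1.x, clEnqueueNDRangeKernel rejects an explicit local size that does not evenly divide the global size, so both 960 and 480 had to be valid choices for the comparison to be meaningful. A minimal sketch of that check in plain C follows; the helper name is hypothetical, and the global size is a parameter because the sample's actual value is not given in the thread.

```c
#include <assert.h>
#include <stddef.h>

/* Returns the number of work-groups for a 1-D range, or 0 when
 * local_size does not evenly divide global_size (a combination an
 * OpenCL 1.x runtime would reject). Hypothetical helper. */
size_t work_group_count(size_t global_size, size_t local_size)
{
    assert(local_size > 0);
    if (global_size % local_size != 0)
        return 0;
    return global_size / local_size;
}
```

For any global size that both values divide, halving the work-group size from 960 to 480 doubles the number of work-groups while leaving the kernel code itself unchanged.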
Raghupathi_M_Intel
Hi pesh,

Thanks for your efforts. I filed a bug against the compiler/runtime. I'll keep you posted if I get any updates from our team.

Thanks,
Raghu
Mikhail_Smirnov
Hi,
Do you happen to know whether this error occurs only on 32-bit, or on 64-bit as well?