
Initial release feedback on CL 2013

Kloeckner__Andreas

Hi there,

First of all, thanks for releasing a new version of the Intel OpenCL toolkit! I've downloaded the new version and have found two issues:

  • First, the compiler is much slower than it was in the 2012 release. That is not bad in itself. What makes it bad is that building from a binary appears to take as long as building from source, so caching binaries saves nothing and developers are stuck waiting for kernels to compile on every run. This gets old quickly...
  • Second, the PyOpenCL test suite (in git, http://github.com/inducer/pyopencl ) fails in the segmented scan. Since this code runs successfully on AMD (CPU, APU, GPU), Nvidia, and Intel 2012, I am currently leaning towards there being a correctness issue in 2013. I'll continue to investigate, though.

I'd appreciate your feedback on these issues.

Thanks!

Andreas

Raghupathi_M_Intel

Hi Andreas,

For the first issue, was this on the CPU device or the GPU? Can you send us a reproducer? As for the second issue, I haven't used PyOpenCL; what are the steps to reproduce the failure? Again, is this on the CPU or the GPU?

Thanks,
Raghu

Kloeckner__Andreas

Hi Raghu,

Here's a reproducer for the compile speed issue. Btw, I'm on Linux, and all my complaints pertain to the CPU backend. On my i7 2620 (SNB), these are the numbers I get for the attached code:

Intel CL 2013:

from-source compile took 3.5429 s
from-binary compile took 3.71068 s

Intel CL 2012:

from-source compile took 0.197583 s
from-binary compile took 0.123464 s

As you can see, the 2013 times are worse by well over a factor of 10: roughly 18x for the from-source build and roughly 30x for the from-binary build. To reproduce this, simply run the attached file "compile-times.py" using PyOpenCL. If you'd like to reproduce this independently of PyOpenCL, you'll also need the header "pyopencl-ranluxcl.cl", which I've also included.
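
The attached "compile-times.py" is not reproduced in this thread. For readers without it, the following is only a rough sketch of how such a comparison might be timed with PyOpenCL, not the original script. The names `timed`, `compare_build_times`, and the trivial `scale` kernel are hypothetical stand-ins (the real test used "pyopencl-ranluxcl.cl"), and passing `cache_dir=False` is assumed to bypass PyOpenCL's own on-disk cache so that the from-source number reflects an actual compile.

```python
import time


def timed(fn):
    """Call fn() and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = fn()
    return result, time.perf_counter() - start


# Hypothetical stand-in kernel; the original reproducer used pyopencl-ranluxcl.cl.
KERNEL_SRC = """
__kernel void scale(__global float *a, const float s)
{
    a[get_global_id(0)] *= s;
}
"""


def compare_build_times(src=KERNEL_SRC):
    """Time a from-source build against a rebuild from the resulting binaries."""
    import pyopencl as cl  # imported lazily so timed() is usable without PyOpenCL

    ctx = cl.create_some_context()

    # Assumption: cache_dir=False skips PyOpenCL's own program cache.
    prg, t_src = timed(lambda: cl.Program(ctx, src).build(cache_dir=False))

    # Fetch the device list and the per-device binaries from the built program.
    devices = prg.get_info(cl.program_info.DEVICES)
    binaries = prg.get_info(cl.program_info.BINARIES)

    # Recreate the program from those binaries and build it again.
    _, t_bin = timed(lambda: cl.Program(ctx, devices, binaries).build())

    print("from-source compile took %g s" % t_src)
    print("from-binary compile took %g s" % t_bin)
    return t_src, t_bin
```

Calling `compare_build_times()` on a machine with an OpenCL CPU runtime installed should print two lines in the same format as the numbers above. With a kernel this small the absolute times will differ from the ones reported here.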

Thanks!

Andreas

Kloeckner__Andreas

Bump?

Yuri_K_Intel
Employee

Hi Andreas,

Regarding compilation times: I get similar values on the 2013 release (~3.8 s), but the latest internal version is a lot faster (~0.4 s). I didn't reproduce the 2012 values, but I think the difference might be explained by various compiler changes, optimizations, etc.

I will look into the segmented scan test failures later.

Thanks,
Yuri

Kloeckner__Andreas

Great, thanks. Let me know if I can help somehow.

JLuna5
New Contributor I

Hi, this feedback is very important for improving our apps.
