Hi there,
First of all, thanks for releasing a new version of the Intel OpenCL toolkit! I've downloaded the new version and have found two issues:
- First, the compiler is much slower than it was in the 2012 release. This is not bad in itself. What makes it bad is that compiling from binary appears to take the same amount of time as compiling from source. As a result, developers are stuck waiting for kernels to compile every time (i.e., binary caching is impossible). This gets old quickly...
- Second, the PyOpenCL test suite (in git, http://github.com/inducer/pyopencl ) fails in the segmented scan. Since this code runs successfully on AMD (CPU, APU, GPU), Nvidia, and Intel 2012, I am currently leaning towards there being a correctness issue in the 2013 release. I'll continue to investigate, though.
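The source-versus-binary comparison in the first point can be sketched roughly as follows. This is only an illustration, not the code behind the complaint: the kernel `twice` and the helpers `timed` and `compare_builds` are hypothetical names, and setting `PYOPENCL_NO_CACHE` is assumed to be enough to keep PyOpenCL's own on-disk cache out of the measurement.

```python
import os
import time

# Assumption: this keeps PyOpenCL's own program cache out of the measurement,
# so the "from source" build really invokes the vendor compiler.
os.environ["PYOPENCL_NO_CACHE"] = "1"

# Hypothetical stand-in kernel, used only for illustration.
KERNEL_SRC = """
__kernel void twice(__global float *a)
{
    a[get_global_id(0)] *= 2.0f;
}
"""


def timed(fn):
    """Call a zero-argument function and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn()
    return result, time.perf_counter() - start


def compare_builds(src=KERNEL_SRC):
    """Return (from-source seconds, from-binary seconds) for one kernel."""
    import pyopencl as cl  # imported lazily: needs an installed OpenCL runtime

    ctx = cl.create_some_context()

    # Build from source and time it.
    from_source, t_source = timed(lambda: cl.Program(ctx, src).build())

    # Fetch the device binaries the driver produced for that build.
    devices = from_source.get_info(cl.program_info.DEVICES)
    binaries = from_source.get_info(cl.program_info.BINARIES)

    # Re-create the program from those binaries and time the build again.
    _, t_binary = timed(lambda: cl.Program(ctx, devices, binaries).build())
    return t_source, t_binary
```

Calling `compare_builds()` on a machine with the CPU runtime installed returns the two timings. If binary caching is effective, the second number should be noticeably smaller than the first.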
I'd appreciate your feedback on these issues.
Thanks!
Andreas
inducer wrote:
- First, the compiler is much slower than it was in the 2012 release. This is not bad in itself. What makes it bad is that compiling from binary appears to take the same amount of time as compiling from source. As a result, developers are stuck waiting for kernels to compile every time (i.e., binary caching is impossible). This gets old quickly...
- Second, the PyOpenCL test suite (in git, http://github.com/inducer/pyopencl ) fails in the segmented scan. Since this code runs successfully on AMD (CPU, APU, GPU), Nvidia, and Intel 2012, I am currently leaning towards there being a correctness issue in the 2013 release. I'll continue to investigate, though.
Hi Andreas,
For the first issue, was this on the CPU device or the GPU? Can you send us a reproducer? As for the second issue, I haven't used PyOpenCL. What are the steps to reproduce the failure? Again, is this on the CPU or the GPU?
Thanks,
Raghu
Hi Raghu,
Here's a reproducer for the compile speed issue. By the way, I'm on Linux, and all my complaints pertain to the CPU backend. On my i7 2620 (SNB), these are the numbers I get for the attached code:
Intel CL 2013:
from-source compile took 3.5429 s
from-binary compile took 3.71068 s
Intel CL 2012:
from-source compile took 0.197583 s
from-binary compile took 0.123464 s
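As you can see, the 2013 times are worse by more than a factor of 10: roughly 18x for the from-source build and roughly 30x for the from-binary build. Note also that in 2013 the from-binary build is no faster than the from-source build. To reproduce this, simply run the attached file "compile-times.py" using PyOpenCL. If you'd like to reproduce this independently of PyOpenCL, you'll also need the header "pyopencl-ranluxcl.cl", which I've also included.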
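For reference, the slowdown between the two releases can be worked out directly from the four timings quoted above. This is just arithmetic on the reported numbers; the names `timings` and `slowdown` are made up for the illustration.

```python
# Timings in seconds, copied from the measurements reported above.
timings = {
    "Intel CL 2013": {"source": 3.5429, "binary": 3.71068},
    "Intel CL 2012": {"source": 0.197583, "binary": 0.123464},
}


def slowdown(new, old):
    """Per-build-kind ratio of new time to old time (values > 1 mean slower)."""
    return {kind: new[kind] / old[kind] for kind in new}


ratios = slowdown(timings["Intel CL 2013"], timings["Intel CL 2012"])
for kind, ratio in ratios.items():
    print(f"from-{kind} compile: {ratio:.1f}x slower in 2013 than in 2012")

# How much building from binary helps within each release
# (values > 1 mean the binary build is faster than the source build).
for release, t in timings.items():
    print(f"{release}: source/binary = {t['source'] / t['binary']:.2f}")
```

A source/binary ratio at or below 1 for a release means that caching binaries saves no compile time there.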
Thanks!
Andreas
Bump?
Great, thanks. Let me know if I can help somehow.
Hi, your feedback is very important for improving our apps.