Hello everyone,
Does Intel OpenCL on CPU require consecutive memory accesses from neighboring threads (i.e., threads in the same work-group) for vectorization?
I have a hashing-based OpenCL kernel with unavoidable non-consecutive memory accesses: each thread uses a computed hash value as a memory index, so the access pattern is unpredictable. So far, the OpenCL build log always reports
"Kernel <kernel_name> was not vectorized"
I suspect this is because adjacent threads do not access consecutive memory addresses. Is that correct? Or can I get the Intel OpenCL platform to generate gather/scatter instructions (or fall back to intermittent scalar loops)?
A clarification on whether the Intel OpenCL platform can handle this kind of memory access pattern in general would be greatly appreciated.
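To make the access pattern concrete, here is a minimal sketch in plain C that models it. It is not the actual kernel: the hash function `hash_u32`, the table layout, and all names are hypothetical stand-ins, and the loop index `gid` only plays the role of `get_global_id(0)`.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 32-bit mixing hash; a stand-in for the kernel's real hash. */
static uint32_t hash_u32(uint32_t x) {
    x ^= x >> 16;
    x *= 0x7feb352dU;
    x ^= x >> 15;
    x *= 0x846ca68bU;
    x ^= x >> 16;
    return x;
}

/* Models one work-item. table_mask must be (table size - 1), with the
 * table size a power of two, so the index always stays in bounds. */
static uint32_t work_item(const uint32_t *table, uint32_t table_mask,
                          const uint32_t *keys, size_t gid) {
    /* keys[gid] is a consecutive load across neighboring work-items. */
    uint32_t idx = hash_u32(keys[gid]) & table_mask;
    /* table[idx] is data-dependent: neighboring work-items hit unrelated
     * addresses, so a vectorized version would need a gather here. */
    return table[idx];
}

/* Models the NDRange as a scalar loop over all work-items. */
void run_range(const uint32_t *table, uint32_t table_mask,
               const uint32_t *keys, uint32_t *out, size_t n) {
    for (size_t gid = 0; gid < n; ++gid) {
        out[gid] = work_item(table, table_mask, keys, gid);
    }
}
```

The load of `keys[gid]` and the store to `out[gid]` are consecutive across neighboring work-items; only `table[idx]` is not. The question is whether that single indexed load is what keeps the whole kernel from being vectorized.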