A video processing filter was reported to produce garbage output on Intel OpenCL.
After narrowing the issue down, I found that the chosen 32x8 local work size was the culprit. The attached demo tries X-Y combinations up to 32x32 and shows that 32x8 is not the only combination that fails.
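The sweep over work sizes can be sketched roughly as follows. This is not the attached demo's code: `LocalSize` and `enumerateLocalSizes` are hypothetical names, and the power-of-two stepping with Y no larger than X is inferred from the order of the results in the output below. The limit passed in stands for the "Max workgroup size" value printed there.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One candidate local work size and whether it fits the work-group limit.
struct LocalSize {
    std::size_t x;
    std::size_t y;
    bool supported;  // true when x * y <= maxWorkGroupSize
};

// Enumerate power-of-two X-Y pairs with 4 <= X <= maxX and 1 <= Y <= X.
// Assumption: this mirrors the order of the logged results; the real demo
// may generate its combinations differently.
std::vector<LocalSize> enumerateLocalSizes(std::size_t maxX,
                                           std::size_t maxWorkGroupSize) {
    assert(maxX >= 4 && maxWorkGroupSize > 0);
    std::vector<LocalSize> sizes;
    for (std::size_t x = 4; x <= maxX; x *= 2) {
        for (std::size_t y = 1; y <= x; y *= 2) {
            sizes.push_back({x, y, x * y <= maxWorkGroupSize});
        }
    }
    return sizes;
}
```

With `enumerateLocalSizes(32, 512)` this yields 18 combinations, of which only 32x32 (1024 work-items) exceeds the limit.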
Bug summary:
With local work sizes (X-Y) of 32x4, 32x8, and 32x16, only the first row of each block is processed.
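To see which rows inside a block actually get written, the result buffer can be tallied by row offset within the work-group. The sketch below is only an illustration: `badRowsByBlockOffset` is a hypothetical helper, `int` elements are an assumption, and it assumes a zero global offset so that blocks start at row 0.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// For each row offset inside a work-group (0 .. localY-1), count how many
// image rows at that offset contain at least one mismatching element.
// `actual` and `expected` are row-major buffers of width * height elements.
std::vector<std::size_t> badRowsByBlockOffset(const std::vector<int>& actual,
                                              const std::vector<int>& expected,
                                              std::size_t width,
                                              std::size_t height,
                                              std::size_t localY) {
    assert(localY > 0 && actual.size() == width * height &&
           expected.size() == width * height);
    std::vector<std::size_t> badRows(localY, 0);
    for (std::size_t y = 0; y < height; ++y) {
        for (std::size_t x = 0; x < width; ++x) {
            const std::size_t i = y * width + x;
            if (actual[i] != expected[i]) {
                ++badRows[y % localY];
                break;  // one mismatch is enough to mark the row as bad
            }
        }
    }
    return badRows;
}
```

If only the first row of each block is processed, entry 0 of the result stays at zero while every other entry equals the number of blocks along Y.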
Occurs on:
- Intel integrated GPUs (both UHD 750 and UHD 770)
- My environment is Windows 11 Pro, i7-11700, but community reports come from different systems.
- Intel OpenCL drivers 32.0.101.6078 (2024), 32.0.101.6632 (2025), and older ones (it is unknown since which version)
NVIDIA OpenCL seems to be unaffected; other GPUs have not been tested.
The problem occurs at specific local work sizes: 32x4, 32x8, and 32x16.
Other notes:
- The queue is verified as in-order.
- The issue persists across GPU generations.
- The issue persists across driver versions.
- No synchronization helps (events, clFinish(), etc.).
- No global barriers in kernel code help.
Attached:
A C++ command-line demo.
(Your own OpenCL headers and libs must be provided.)
Thank you in advance.
My output:
Found Intel UHD GPU: Intel(R) UHD Graphics 750
Using device: Intel(R) UHD Graphics 750
Driver version: 32.0.101.6632
Local work size: [4, 1] Max workgroup size: 512
O.K.
Local work size: [4, 2] Max workgroup size: 512
O.K.
Local work size: [4, 4] Max workgroup size: 512
O.K.
Local work size: [8, 1] Max workgroup size: 512
O.K.
Local work size: [8, 2] Max workgroup size: 512
O.K.
Local work size: [8, 4] Max workgroup size: 512
O.K.
Local work size: [8, 8] Max workgroup size: 512
O.K.
Local work size: [16, 1] Max workgroup size: 512
O.K.
Local work size: [16, 2] Max workgroup size: 512
O.K.
Local work size: [16, 4] Max workgroup size: 512
O.K.
Local work size: [16, 8] Max workgroup size: 512
O.K.
Local work size: [16, 16] Max workgroup size: 512
O.K.
Local work size: [32, 1] Max workgroup size: 512
O.K.
Local work size: [32, 2] Max workgroup size: 512
O.K.
Local work size: [32, 4] Max workgroup size: 512
Error at index 640: expected 646, got 0
Error at index 641: expected 647, got 0
Error at index 642: expected 648, got 0
Error at index 643: expected 649, got 0
Total errors: 230400 out of 307200 elements
Bug. Data mismatch.
Local work size: [32, 8] Max workgroup size: 512
Error at index 640: expected 646, got 0
Error at index 641: expected 647, got 0
Error at index 642: expected 648, got 0
Error at index 643: expected 649, got 0
Total errors: 268800 out of 307200 elements
Bug. Data mismatch.
Local work size: [32, 16] Max workgroup size: 512
Error at index 640: expected 646, got 0
Error at index 641: expected 647, got 0
Error at index 642: expected 648, got 0
Error at index 643: expected 649, got 0
Total errors: 268800 out of 307200 elements
Bug. Data mismatch.
Local work size: [32, 32] Max workgroup size: 512
X-Y size too much, not supported
Ready.
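As a cross-check, the error totals above can be compared with what a "only some rows of each block are written" model predicts. This is a minimal sketch under assumptions: `predictedErrors` is a hypothetical helper, and the 640x480 frame size is inferred from the 307200 elements and from the first error appearing at index 640.

```cpp
#include <cassert>
#include <cstddef>

// Number of mismatching elements predicted if only `rowsPerBlock` rows of
// every localY-row block are written correctly.
// Assumes height is a multiple of localY.
std::size_t predictedErrors(std::size_t width, std::size_t height,
                            std::size_t localY, std::size_t rowsPerBlock = 1) {
    assert(localY > 0 && height % localY == 0 && rowsPerBlock <= localY);
    const std::size_t blocks = height / localY;          // blocks along Y
    const std::size_t goodRows = blocks * rowsPerBlock;  // rows that came out right
    return (height - goodRows) * width;
}
```

For the assumed 640x480 frame, `predictedErrors(640, 480, 4)` gives 230400 and `predictedErrors(640, 480, 8)` gives 268800, both matching the log. For 32x16 the one-row model gives 288000, while the log reports 268800, which is what `predictedErrors(640, 480, 16, 2)` gives. So in that case two rows per 16-row block may be coming through, although the log alone does not confirm which ones.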