I am trying to use the simple sub-group reduction from this post.
The only change is `reduce<8>` to `reduce<16>`/`reduce<32>`.
However, `reduce<32>` does not work.
I tried it on DevCloud with the Intel(R) Gen9 HD Graphics NEO and checked the available sub-group sizes with clinfo and cl::sycl::info::device::sub_group_sizes.
Both report "Sub-group sizes (Intel) 8, 16, 32".
dpcpp is beta09 - Intel(R) oneAPI DPC++ Compiler Pro 2021.1 (2020.8.0.0827)
The full error message follows:
terminate called after throwing an instance of 'cl::sycl::compile_program_error'
what(): The program was built for 1 devices
Build program log for 'Intel(R) Gen9 HD Graphics NEO':
error: Shuffle not supported in SIMD32
error: backend compiler failed build.
-17 (CL_LINK_PROGRAM_FAILURE)
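The build log names a shuffle as the unsupported operation, which suggests (the thread does not confirm this) that the sub-group reduction is lowered to shuffle operations by the backend compiler. As a rough illustration of that pattern only, here is a host-side sketch in plain C++ that emulates one sub-group as an array and sums it with a shuffle-down tree. `shuffle_down` and `subgroup_reduce` are hypothetical helper names for this sketch, not DPC++ or SYCL APIs, and the code does not run on a GPU.

```cpp
#include <array>
#include <cstddef>

// Hypothetical helper: emulates a sub-group "shuffle down" on the host.
// Lane i reads the value held by lane i + delta; lanes that would read
// past the end of the sub-group keep their own value.
template <std::size_t N>
std::array<int, N> shuffle_down(const std::array<int, N>& lanes, std::size_t delta) {
    std::array<int, N> out = lanes;
    for (std::size_t i = 0; i + delta < N; ++i) {
        out[i] = lanes[i + delta];
    }
    return out;
}

// Hypothetical helper: tree reduction (sum) over one emulated sub-group of
// N lanes. Assumes N is a power of two, as with the sizes 8, 16 and 32.
// Only the value that ends up in lane 0 is meaningful.
template <std::size_t N>
int subgroup_reduce(std::array<int, N> lanes) {
    static_assert(N > 0 && (N & (N - 1)) == 0, "N must be a power of two");
    for (std::size_t delta = N / 2; delta > 0; delta /= 2) {
        const std::array<int, N> shifted = shuffle_down<N>(lanes, delta);
        for (std::size_t i = 0; i < N; ++i) {
            lanes[i] += shifted[i];  // every lane adds its partner's value
        }
    }
    return lanes[0];
}
```

Calling `subgroup_reduce<8>`, `subgroup_reduce<16>` or `subgroup_reduce<32>` on an array of matching length returns the sum of its elements. The number of shuffle steps grows with the sub-group size (3, 4 and 5 steps respectively), and each step depends on the device supporting a shuffle at that width.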
Thanks for your patience. The issue you raised has been fixed in the oneAPI 2021.1 version. Please download the latest version (2021.2) and let us know how it works for you.
Thanks for the confirmation.
As your issue has been resolved, we are closing this thread. We will no longer respond to this thread. If you require additional assistance from Intel, please start a new thread. Any further interaction in this thread will be considered community only.