Is it possible to force the compiler to map one thread instead of eight threads per XVE?

The "oneAPI GPU Optimization Guide" says:

"Each VE is a multi-threaded SIMD processor. The compiler generates SIMD code to map several work-items to be executed simultaneously within a given hardware thread. The SIMD-width for a kernel is a heuristic driven compiler choice. Common SIMD-width examples are SIMD-8, SIMD-16, and SIMD-32."

For a highly divergent kernel, is it possible to force the compiler to use SIMD-1, i.e. to have only one work-item at a time use the whole SIMD-16 XVE? The goal would be to avoid the overhead of executing divergent branches one after another with most lanes masked off, which is what happens when each work-item's control flow is different.
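To make the overhead I am worried about concrete, here is a rough cost model in plain C++. It is only a sketch under my own assumptions, not Intel's documented behavior: I assume a hardware thread runs one group of `simd_width` work-items in lockstep, and that every distinct branch taken inside a group is executed once with the other lanes masked off. The function name `masked_simd_steps` and its parameters are hypothetical and only serve the illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical cost model (an assumption, not a description of real hardware):
// work-items are packed into groups of `simd_width` lanes that run in lockstep.
// path_of[i] is the id of the branch taken by work-item i, and every branch
// costs `path_cost` steps. Each distinct branch inside a group is executed
// once, with the lanes that did not take it masked off.
std::size_t masked_simd_steps(const std::vector<int>& path_of,
                              std::size_t simd_width,
                              std::size_t path_cost) {
    if (simd_width == 0) {
        return 0;  // guard against an endless loop on a meaningless width
    }
    std::size_t steps = 0;
    for (std::size_t base = 0; base < path_of.size(); base += simd_width) {
        const std::size_t end = std::min(base + simd_width, path_of.size());
        // Count the distinct branches taken by the lanes of this group.
        const std::set<int> distinct(path_of.begin() + base,
                                     path_of.begin() + end);
        steps += distinct.size() * path_cost;
    }
    return steps;
}
```

For 32 work-items and a branch cost of 10, this model gives 20 steps at width 16 when all work-items take the same branch, and 320 steps at width 16 when every work-item takes a different branch. At width 1 it gives 320 steps in both cases. So under these assumptions the penalty for full divergence is bounded by the SIMD width, and the model says nothing about other effects such as register pressure or thread occupancy.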
