Same instruction on all 8 EU?

Biren_Doshi — Thu, 08 Dec 2016 07:46:12 GMT

To get peak performance, do all EUs in a single sub-slice need to issue the same instruction, or does only a single EU need to? At what granularity should I avoid branching?

Thanks and regards,

Biren Doshi


Jeffrey_M_Intel1 — Fri, 09 Dec 2016 09:51:55 GMT

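The conditional mask is per EU thread. Each thread can have 1-32 SIMD lanes.

This is a finer granularity than per EU, so branch divergence only matters among the SIMD lanes of one thread; different threads, even on the same EU, can follow different paths. Each EU typically runs 7 threads. The 2 FPUs per EU could in theory be saturated by only 2 threads, but in practice running 7 gives a better chance of keeping them busy.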

For more info, please see section 5.3.5 "SIMD Code Generation for SPMD Programming Models" in the Gen9 compute architecture documentation: https://software.intel.com/sites/default/files/managed/c5/9a/The-Compute-Architecture-of-Intel-Processor-Graphics-Gen9-v1d0.pdf.
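To make the per-thread mask concrete, here is a minimal sketch in plain C of how a branch might be counted for one EU thread. It is a simplified model, not the hardware's actual behavior: the function name `branch_passes` and the one-pass-per-branch-side cost are assumptions made for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified model (an assumption, not a hardware specification):
   one EU thread evaluates a condition on every SIMD lane, builds a
   mask from the results, and has to issue each side of the branch
   that at least one lane needs.
   Returns 1 if all lanes agree, 2 if the lanes diverge. */
static int branch_passes(const int *cond, int width)
{
    assert(width >= 1 && width <= 32);   /* 1-32 SIMD lanes per thread */

    /* Build the per-thread conditional mask, one bit per lane. */
    uint32_t mask = 0;
    for (int lane = 0; lane < width; ++lane) {
        if (cond[lane]) {
            mask |= (uint32_t)1 << lane;
        }
    }

    /* Mask value when every lane takes the "then" side. */
    uint32_t all = (width == 32) ? 0xFFFFFFFFu
                                 : (((uint32_t)1 << width) - 1u);

    int then_needed = (mask != 0);    /* some lane takes the "then" side */
    int else_needed = (mask != all);  /* some lane takes the "else" side */
    return then_needed + else_needed;
}
```

In this model, a 16-lane thread whose lanes all agree issues one side of the branch, while a thread whose lanes alternate issues both sides with part of the lanes masked off each time. Two separate threads that each agree internally but disagree with each other still cost one pass each, which is why the granularity to watch is the thread rather than the EU or the sub-slice.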
