disable vectorization

Jianbin_F_ · 08-08-2013

Hi, I have a question about MIC. When using -O2 (or -O3), the compiler vectorizes the code automatically. When we add the -no-vec option, the compiler should not vectorize the code, right? But when looking at the assembly code (*.s), I found that there are still a lot of vector instructions. So what is the difference between using and not using the -no-vec option? Does the compiler vectorize the code anyway?

Jianbin

TimP · 08-08-2013

The short story is that scalar floating point is done by using one slot in the mm512 registers, and the basic instruction name is unchanged.

Kevin_D_Intel · 08-08-2013

The use of vector instructions is a by-product of the architecture. I expect you are using -mmic. The compiler honors -no-vec when compiling for native. Add a -vec-report option (e.g. -vec-report6) to verify code is vectorized at -O2 or -O3 and then use that same report option with -no-vec to confirm it disabled vectorization.

Here are some additional resources:

Intel® Xeon Phi™ Coprocessor Developer Zone (On the Overview tab, look for the link to the Intel® Xeon Phi™ Coprocessor Instruction Set Architecture Reference Manual)

Intel® Xeon Phi™ Coprocessor Vector Microarchitecture

Programming and Compiling for Intel® Many Integrated Core Architecture

Jianbin_F_ · 08-08-2013

TimP (Intel) wrote:

The short story is that scalar floating point is done by using one slot in the mm512 registers, and the basic instruction name is unchanged.

OK, but when I measure the vectorization intensity (VPU_ELEMENTS_ACTIVE/VPU_INSTRUCTIONS_EXECUTED), I find that it is around 4 for GEMM (with -no-vec, -mmic, and -O3), rather than 1. Therefore, I do not agree with the opinion that the instructions use only one slot out of eight. Do you have any idea why?
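For readers following along, the ratio described above can be written out as a short calculation. This is only a sketch of the arithmetic: the function name and the counter readings below are hypothetical and are not measurements from this thread.

```python
def vectorization_intensity(elements_active: int, instructions_executed: int) -> float:
    """Average number of active vector elements per executed VPU instruction."""
    if instructions_executed <= 0:
        raise ValueError("instructions_executed must be positive")
    return elements_active / instructions_executed


# Hypothetical counter readings, chosen only to illustrate the arithmetic.
vpu_elements_active = 4_000_000
vpu_instructions_executed = 1_000_000

vi = vectorization_intensity(vpu_elements_active, vpu_instructions_executed)
print(f"vectorization intensity = {vi:.2f}")
```

A value of 1 would mean that every counted instruction touched a single element; a value near 4 means the counted instructions touched about four elements on average.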

Jianbin_F_ · 08-08-2013

Kevin Davis (Intel) wrote:

The use of vector instructions is a by-product of the architecture. I expect you are using -mmic. The compiler honors -no-vec when compiling for native. Add a -vec-report option (e.g. -vec-report6) to verify code is vectorized at -O2 or -O3 and then use that same report option with -no-vec to confirm it disabled vectorization.

You are right that the use of vector instructions is a by-product. But what do you mean when you say the compiler honors the -no-vec option? In my opinion, using -no-vec means that only one of the 8 or 16 lanes should be used (as TimP has mentioned). But my experimental results show that the vectorization intensity on GEMM (with -no-vec, -mmic, and -O3) is around 4, rather than 1. Indeed, the -vec-report output does not show any vectorization information, but then how do you explain that the vector instructions were using more than one lane/slot of the VPU?

Jianbin

Kevin_D_Intel · 08-12-2013

-no-vec disables auto-vectorization only. Development confirmed this and also indicated the option does not prevent scalar code from using vector instructions in any way it likes, and even on Xeons, that it does not prevent scalar code from using packed instructions.
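One way to see how scalar code that still uses some packed instructions could yield an average above 1 is to model the ratio as a weighted average over an instruction mix. The sketch below uses a made-up mix purely for illustration; it is not a claim about what the compiler actually emitted for GEMM, and the function name is hypothetical.

```python
def average_active_lanes(mix):
    """Weighted average of active lanes over (instruction_count, active_lanes) pairs."""
    total_instructions = sum(count for count, _ in mix)
    if total_instructions == 0:
        raise ValueError("mix must contain at least one instruction")
    total_elements = sum(count * lanes for count, lanes in mix)
    return total_elements / total_instructions


# Hypothetical mix: 600 instructions using one lane, 400 using eight lanes.
hypothetical_mix = [(600, 1), (400, 8)]
print(average_active_lanes(hypothetical_mix))
```

With this invented mix the average is 3.8, even though more than half of the instructions use a single lane, so an average near 4 does not by itself show that the loops were auto-vectorized.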

Jianbin_F_ · 08-12-2013

Kevin Davis (Intel) wrote:

-no-vec disables auto-vectorization only. Development confirmed this and also indicated the option does not prevent scalar code from using vector instructions in any way it likes, and even on Xeons, that it does not prevent scalar code from using packed instructions.

Hi Kevin, thank you for your answer.