Divakar_V_
Beginner
68 Views

icpc not generating AVX/AVX2 instructions

I compile the following C function:

 

void multIJK(double *restrict a, double *restrict b, double *restrict c,
             int dim){
  for(int i=0; i < dim; i++)
    for(int j=0; j < dim; j++)
      for(int k=0; k < dim; k++)
        c[i+j*dim] += a[i+k*dim]*b[k+j*dim];
}

using the following options:

icpc  -O3 -prec-div -no-ftz -restrict -Wshadow -MMD -MP -fno-inline-functions -mkl -fno-verbose-asm  -S xchg-mult.cpp

Surprisingly, icpc version 15 generates code as if YMM registers did not exist, both on machines with AVX and on machines with AVX2. By contrast, the same compilation command with -mmic added generates code that uses ZMM registers. This behavior is typical and holds across a range of programs.

I notice that AVX instructions are generated when I use the -fast option. Can you please clarify exactly which options are needed to make the compiler generate AVX and AVX2 instructions?

 

Thanks!

2 Replies
Divakar_V_
Beginner
68 Views

-xHost does it. You may consider this topic closed.

Kittur_G_Intel
Employee
68 Views

Hi,
You can explicitly use -xAVX (to generate AVX code) or -xCORE-AVX2 (to generate AVX2 code). Of course, you can always use -xHOST, and the compiler will target the highest instruction set available on the system you compile on.

BTW, the -fast option is really shorthand for "-xHOST -O3 -ipo -no-prec-div -static -fp-model fast=2", and since it includes -xHOST it will of course use the highest instruction set available on your system.

_Kittur
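
A quick way to check which instruction set a given set of flags actually targets, without reading the generated assembly, is to look at the compiler's predefined macros. The following is only a minimal sketch: the function names target_isa and print_target_isa are made up for illustration, and it assumes the compiler defines the usual __AVX__, __AVX2__, __AVX512F__, __SSE2__ and __MIC__ macros for the selected target.

```c
#include <stdio.h>

/* Hypothetical helper: reports the widest vector instruction set the
   compiler was asked to target, judged only from predefined macros.
   It says nothing about what the CPU running the program supports. */
const char *target_isa(void) {
#if defined(__MIC__)
    return "MIC";        /* set when building for Xeon Phi with -mmic */
#elif defined(__AVX512F__)
    return "AVX-512";
#elif defined(__AVX2__)
    return "AVX2";       /* e.g. -xCORE-AVX2, or -xHost on an AVX2 machine */
#elif defined(__AVX__)
    return "AVX";        /* e.g. -xAVX, or -xHost on an AVX machine */
#elif defined(__SSE2__) || defined(__x86_64__)
    return "SSE2";       /* baseline when no -x or -m option is given */
#else
    return "unknown";
#endif
}

/* Hypothetical helper: prints the result for a quick manual check. */
void print_target_isa(void) {
    printf("compiled for: %s\n", target_isa());
}
```

Calling print_target_isa() from main and building once with the original options and once with -xHost added should make the difference visible. With the original command line one would expect the SSE2 branch, since no -x or -m option is given there and the compiler then falls back to its SSE2 default.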