
I compile the following C function:
void multIJK(double *restrict a, double *restrict b, double *restrict c, int dim){
  for(int i=0; i < dim; i++)
    for(int j=0; j < dim; j++)
      for(int k=0; k < dim; k++)
        c[i+j*dim] += a[i+k*dim]*b[k+j*dim];
}
using the following options:
icpc -O3 -prec-div -no-ftz -restrict -Wshadow -MMD -MP -fno-inline-functions -mkl -fno-verbose-asm -S xchg-mult.cpp
The surprising thing is that icpc version 15 generates code that does not use YMM registers at all, both on a machine with AVX and on one with AVX2. By contrast, the same compilation command with -mmic generates code that uses ZMM registers. I see this behavior consistently across a range of programs.
I notice that AVX instructions are generated with the -fast option. Can you please clarify exactly which options are needed to make the compiler generate AVX and AVX2 instructions?
Thanks!

-xHost does it. You may consider this topic closed.
Hi,
You can explicitly use -xAVX (for generating AVX code) or -xCORE-AVX2 (for AVX2 code generation). Of course, you can always use -xHOST and the compiler will use the highest available instruction set on the system.
BTW, the -fast option is shorthand for "-xHOST -O3 -ipo -no-prec-div -static -fp-model fast=2", and since it includes -xHOST, it uses the highest instruction set available on your system.
_Kittur
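One way to double-check which instruction set a given set of options enables, without reading the assembly, is to test the predefined macros __SSE2__, __AVX__, __AVX2__ and __AVX512F__, which GCC, Clang and the Intel compilers define when the corresponding extension is turned on. The following is only a minimal sketch: the helper names target_isa and vector_width_bits are made up for illustration and are not part of any compiler or library. The macros also only show what the compiler is allowed to emit; they do not prove that a particular loop was vectorized.

```c
#include <stdio.h>

/* Hypothetical helper: name of the widest SIMD extension that the
   compiler options enabled, judged only by predefined macros. */
static const char *target_isa(void) {
#if defined(__AVX512F__)
    return "AVX-512";
#elif defined(__AVX2__)
    return "AVX2";
#elif defined(__AVX__)
    return "AVX";
#elif defined(__SSE2__)
    return "SSE2";
#else
    return "baseline";
#endif
}

/* Hypothetical helper: register width in bits for that extension
   (ZMM = 512, YMM = 256, XMM = 128, 0 if none was detected). */
static int vector_width_bits(void) {
#if defined(__AVX512F__)
    return 512;
#elif defined(__AVX__)
    return 256;   /* __AVX2__ implies __AVX__, so this covers both */
#elif defined(__SSE2__)
    return 128;
#else
    return 0;
#endif
}

/* Prints one line describing the compile-time target. */
static void report_target(void) {
    printf("compiled for %s (%d-bit vectors)\n",
           target_isa(), vector_width_bits());
}
```

Calling report_target() from main and building once without any -x option and once with -xAVX, -xCORE-AVX2 or -xHOST should show the reported target change accordingly.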
