Dear Intel technician,
On a Skylake computer, I built my application with AVX-512 instead of SSE2 using Intel parallel_studio_xe_2018.1.038, and it now performs worse than the SSE2 build: the application takes longer to run on Skylake computers than it did with SSE2. I cannot understand why. Could Intel technical support tell me what could cause such poor performance with AVX-512 instructions? Thanks in advance.
By the way, do you have any publicly available Fortran OpenMP-based test programs that could be used to confirm the performance of AVX-512 instructions on Skylake computers?
My application is Fortran code with OpenMP parallelism. It runs on both Linux and Windows and is built with the Intel 2018 Fortran Compiler. The tests were done on 2-socket and 4-socket Skylake multi-core computers.
Key options used under Linux:
OPENMP_FLAGS = -qopenmp -arch CORE-AVX512 -align array64byte
BASE_FLAGS = -O3 -auto -cm -w -fpp -DPN -DF95 -DLINUX -DLINUX_X64 -DOPENMP_VER -DSR3_LIB -threads
I look forward to hearing from you. An early response would be highly appreciated.