Why can't AVX512 instructions accelerate my application?

dingjun_chencmgl_ca · 02-06-2018

Dear Intel technician,

On a Skylake computer, I built my application with AVX512 instead of SSE2 using Intel parallel_studio_xe_2018.1.038, and its performance is worse than with SSE2: the application takes longer to run on Skylake computers than the SSE2 build does. I cannot understand why. Could Intel technical support tell me what causes such poor performance with AVX512 instructions? Thanks in advance.

By the way, do you have any publicly available Fortran OpenMP-based test programs that confirm the performance of AVX512 instructions on Skylake computers?

My application is Fortran code with OpenMP parallelism. It runs on both Linux and Windows and is built with the Intel 2018 Fortran Compiler. The tests were done on 2-socket and 4-socket Skylake multi-core computers.

Key options used under Linux:

OPENMP_FLAGS = -qopenmp -arch CORE-AVX512 -align array64byte
BASE_FLAGS = -O3 -auto -cm -w -fpp -DPN -DF95 -DLINUX -DLINUX_X64 -DOPENMP_VER -DSR3_LIB -threads

I am looking forward to hearing from you. Your early response is highly appreciated.

Dingjun

Devorah_H_Intel · 02-06-2018

Please refer to this article:

https://software.intel.com/en-us/articles/tuning-simd-vectorization-when-targeting-intel-xeon-processor-scalable-family

In addition, see the -qopt-zmm-usage option in the Release Notes:

https://software.intel.com/en-us/articles/intel-fortran-compiler-180-for-linux-release-notes-for-intel-parallel-studio-xe-2018#new_options
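
As a rough sketch of how the -qopt-zmm-usage option could be tried, the Makefile fragment below defines two variants of the OpenMP flags so that both builds can be timed on the same machine. The variable names OPENMP_FLAGS_ZMM_LOW and OPENMP_FLAGS_ZMM_HIGH are hypothetical and not part of any existing Makefile. My understanding of the 18.0 release notes is that low is the default when -xCORE-AVX512 is used, but please verify that against the notes linked above.

```makefile
# Hypothetical variable names; the options themselves are ifort 18.0 options.
# Variant A: restrict use of 512-bit (zmm) registers.
OPENMP_FLAGS_ZMM_LOW  = -qopenmp -xCORE-AVX512 -align array64byte -qopt-zmm-usage=low
# Variant B: allow unrestricted use of zmm registers.
OPENMP_FLAGS_ZMM_HIGH = -qopenmp -xCORE-AVX512 -align array64byte -qopt-zmm-usage=high

# Select one variant per build, then compare the elapsed times of the two builds.
OPENMP_FLAGS = $(OPENMP_FLAGS_ZMM_LOW)
```

Which setting is faster depends on the application, so measuring both on the target machine is more reliable than assuming either one.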

dingjun_chencmgl_ca · 02-07-2018

Hi, Devorah,

I have re-run our application following your advice.

If the following options are used, the elapsed time of our application is 7401.82 seconds.

Key options used under Linux:

OPENMP_FLAGS = -qopenmp -arch SSE2 -align array16byte
BASE_FLAGS = -O3 -auto -cm -w -fpp -DPN -DF95 -DLINUX -DLINUX_X64 -DOPENMP_VER -DSR3_LIB -threads

If the following options are used, the elapsed time of our application is 8006.94 seconds. The performance deteriorates.

OPENMP_FLAGS = -qopenmp -arch CORE-AVX512 -xCORE-AVX512 -align array64byte -qopt-zmm-usage=high
BASE_FLAGS = -O3 -auto -cm -w -fpp -DPN -DF95 -DLINUX -DLINUX_X64 -DOPENMP_VER -DSR3_LIB -threads -fimf-force-dynamic-target -fimf-use-svml=true

The Intel Fortran Compiler ifort.exe is '/opt/intel/parallel_studio_xe_2018.1.038/compilers_and_libraries_2018/linux/bin/intel64/ifort'.

Could you give me more help? I want to accelerate our application by using AVX512 instructions on our Skylake computers.

I look forward to hearing from you.

Best regards,

Dingjun

dingjun_chencmgl_ca · 02-08-2018

If I want to move our OpenMP-based Fortran application from SSE2 to AVX512 under Linux, could someone tell me what I should change in the Makefile and the Fortran source code so that the AVX512 build runs faster than the SSE2 build on our Skylake computers? Thanks in advance.

I look forward to your reply. This is urgent.

Best regards,

Dingjun

Devorah_H_Intel · 02-08-2018

This issue is better reported via our Online Service Center at https://supporttickets.intel.com/ for further investigation. We would need to look at the sources, in addition to other information.
Instructions on how to file a ticket are available here:
https://software.intel.com/en-us/articles/how-to-create-a-support-request-at-online-service-center
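
Before filing a ticket, it may also help to check which loops the compiler actually vectorizes, since that information is useful in a support request. This is a minimal sketch, assuming a Makefile laid out like the one quoted in this thread. REPORT_FLAGS is a hypothetical variable name, and the lines must be placed after the existing BASE_FLAGS definition.

```makefile
# Hypothetical addition: request a detailed vectorization report from ifort.
# -qopt-report=5 selects the most detailed report level;
# -qopt-report-phase=vec limits the report to the vectorizer.
REPORT_FLAGS = -qopt-report=5 -qopt-report-phase=vec

# Append to the existing flags (GNU make syntax).
BASE_FLAGS += $(REPORT_FLAGS)
```

With these options ifort writes a .optrpt file next to each object file. Comparing the reports from the SSE2 and AVX512 builds shows whether the hot loops are vectorized in both.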