Re: Floating point exception in MKL FFT with AVX512

gn164 · ‎10-30-2020

The following code shows a floating point overflow in FFT when AVX512 is enabled. If I disable AVX512 before running the program (setenv MKL_ENABLE_INSTRUCTIONS AVX2) it runs successfully.

program mkl_test

  USE MKL_DFTI

  include  'mkl.fi'

  parameter (len=110)
  real values_in(len)
  real values_out(len)
  real :: temp_r
  integer :: ieee_flags
  character*16 :: out
  integer :: i, j,   status
  type(dfti_descriptor), pointer :: My_Desc1_Handle
!---------------------------------------------------------------------------------------------------  

  open(unit=2, file='data_inc.txt')
  do i=1, len
     read(2, * ) values_in(i)
  enddo
  close(2)

  i = ieee_flags('set', 'exception', 'overflow', out)
  status = DftiCreateDescriptor(My_Desc1_Handle,DFTI_SINGLE,DFTI_REAL,1,len-2)
  status = DftiSetValue(My_Desc1_Handle, DFTI_PLACEMENT, DFTI_NOT_INPLACE)
  status = DftiSetValue(My_Desc1_Handle, DFTI_NUMBER_OF_TRANSFORMS, 1)
  status = DftiCommitDescriptor(My_Desc1_Handle)

  status = DftiComputeForward( My_Desc1_Handle, values_in, values_out )

  print*, "Finished successfully." ,status

end program mkl_test

I am running the program on a Xeon W-2145 CPU @ 3.70GHz.

My link command is as follows:

/opt/intel/19/bin/ifort reproduce.f90 -Wl,--start-group ${MKLROOT}/lib/intel64/libmkl_intel_lp64.a ${MKLROOT}/lib/intel64/libmkl_intel_thread.a ${MKLROOT}/lib/intel64/libmkl_core.a -Wl,--end-group -liomp5 -lpthread -lm -ldl -I${MKLROOT}/include/ -g

Please also find attached the input required to reproduce the problem (data_inc.txt).
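
For anyone reproducing this from a Bourne-style shell rather than csh, a minimal sketch of the comparison described at the top of this post might look like the following. It assumes the reproducer was built as ./a.out; `BIN` and `run_case` are names made up for this illustration and are not part of MKL. Only `MKL_ENABLE_INSTRUCTIONS` and `MKL_VERBOSE` come from this thread.

```shell
#!/bin/sh
# Sketch: run the reproducer once on the default code path and once with
# MKL restricted to AVX2, and report the exit status of each run.
# Assumption: the binary is ./a.out (override with the hypothetical BIN variable).
BIN="${BIN:-./a.out}"

run_case() {
    # $1 = label to print, $2 = value for MKL_ENABLE_INSTRUCTIONS (empty = unset)
    label="$1"
    isa="$2"
    if [ ! -x "$BIN" ]; then
        echo "$label: skipped ($BIN not found)"
        return 0
    fi
    if [ -n "$isa" ]; then
        # Set the variables for this one command only.
        MKL_ENABLE_INSTRUCTIONS="$isa" MKL_VERBOSE=1 "$BIN"
        status=$?
    else
        # Run in a subshell so the unset does not leak into the caller.
        ( unset MKL_ENABLE_INSTRUCTIONS; MKL_VERBOSE=1 "$BIN" )
        status=$?
    fi
    echo "$label: exit status $status"
}

run_case "default" ""
run_case "AVX2 only" "AVX2"
```

A run that is killed by a signal is reported by the shell with a status above 128, so a differing status between the two cases would be consistent with the behaviour described above.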

gn164 · ‎10-31-2020

An update: I have also tried MKL 2020 Update 4, and the issue persists with that version as well.

It is worth noting that this looks like a data-dependent issue. The FFT works fine with different input, but something in this data seems to trigger the FPE under AVX512.
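
Since the failure appears to depend on the data, a first step could be to look at the range of magnitudes in the input. The sketch below is only an illustration: `scan_values` is a hypothetical helper name, and it assumes the file holds one value per line in a notation awk can parse (for example `1.5E+03`; a Fortran `D` exponent would not be read correctly). Single precision tops out near 3.4e38, so values close to that limit would be worth a closer look, although this check alone would not explain why only the AVX512 path traps.

```shell
#!/bin/sh
# Sketch: summarise the magnitudes in a one-value-per-line data file.
# Assumption: the default file is data_inc.txt in the current directory;
# another path can be passed as the first argument.
scan_values() {
    awk '
        NF == 0 { next }                 # skip blank lines
        {
            v = $1 + 0                   # force numeric conversion of field 1
            a = (v < 0) ? -v : v         # absolute value
            n++
            if (n == 1 || a > max) { max = a; max_line = NR }
            if (a != 0 && (!have_min || a < min)) {
                min = a; min_line = NR; have_min = 1
            }
        }
        END {
            printf "values: %d\n", n
            printf "largest |x|: %g (line %d)\n", max, max_line
            printf "smallest nonzero |x|: %g (line %d)\n", min, min_line
        }
    ' "${1:-data_inc.txt}"
}

# Only scan automatically when the attachment is present.
if [ -f data_inc.txt ]; then scan_values data_inc.txt; fi
```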

Gennady_F_Intel · ‎11-01-2020

Thanks for the case. We will check and keep you updated on the status of this issue.

Gennady_F_Intel · ‎11-01-2020

I see no issues with the latest MKL version, 2020 Update 4.

Here is the log file I see with MKL_VERBOSE mode enabled.

export MKL_VERBOSE=1

$ ./a.out

MKL_VERBOSE Intel(R) MKL 2020.0 Update 4 Product build 20200917 for Intel(R) 64 architecture Intel(R) Advanced Vector Extensions 512 (Intel(R) AVX-512) enabled processors, Lnx 2.50GHz intel_thread

MKL_VERBOSE FFT(srfo108,pack:ccs,tLim:1,unaligned_input,unaligned_output,desc:0x43f3600) 11.88us CNR:OFF Dyn:1 FastMM:1 TID:0 NThr:112

Finished successfully. 0

Gennady_F_Intel · ‎11-01-2020

ifort --version

ifort (IFORT) 19.1.3.302 20200925

lscpu | grep Model

Model name: Intel(R) Xeon(R) Platinum 8180 CPU @ 2.50GHz

OS - RH7.2

gn164 · ‎11-02-2020

Hi Gennady,

Thank you for having a look. I do not have access to that particular hardware, but I can also reproduce the problem on a Xeon Silver 4110 CPU @ 2.10GHz.

The program still fails with floating overflow before showing any of the verbose messages.

If I disable the trap for FPEs (comment out line 22) the verbose information is as follows:

MKL_VERBOSE Intel(R) MKL 2020.0 Update 4 Product build 20200917 for Intel(R) 64 architecture Intel(R) Advanced Vector Extensions 512 (Intel(R) AVX-512) enabled processors, Lnx 2.10GHz intel_thread
MKL_VERBOSE FFT(srfo108,pack:ccs,tLim:1,desc:0x31865c0) 8.96us CNR:OFF Dyn:1 FastMM:1 TID:0 NThr:8
Finished successfully. 0

If it helps, I am also copying the stack of the failing program.

Program received signal SIGFPE, Arithmetic exception.
0x00000000013e8721 in mkl_dft_avx512_ownsrDftFwdRecombine_32f ()
(gdb) bt
#0 0x00000000013e8721 in mkl_dft_avx512_ownsrDftFwdRecombine_32f ()
#1 0x00000000005ad164 in mkl_dft_avx512_ippsDFTFwd_RToCCS_32f ()
#2 0x00000000004afb2a in compute_1d_small_fwd ()
#3 0x0000000000405409 in mkl_dft_dfti_compute_forward_ss ()
#4 0x0000000000404dcd in mkl_avx512 () at reproduce.f90:30
#5 0x0000000000404ae2 in main ()
#6 0x00002aaaab80b2e1 in __libc_start_main (main=0x404ab0 <main>, argc=1, argv=0x7fffffffea68, 
init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fffffffea58)
at ../csu/libc-start.c:291
#7 0x00000000004049aa in _start ()

Thanks

Gennady_F_Intel · ‎11-06-2020

Running the same code built statically with MKL 2020 Update 4 on a very similar system (Model name: Intel(R) Xeon(R) Silver 4116 CPU @ 2.10GHz), I see no floating point exception there:

MKL_VERBOSE Intel(R) MKL 2020.0 Update 4 Product build 20200917 for Intel(R) 64 architecture Intel(R) Advanced Vector Extensions 512 (Intel(R) AVX-512) enabled processors, Lnx 2.10GHz intel_thread

MKL_VERBOSE FFT(srfo108,pack:ccs,tLim:1,unaligned_input,unaligned_output,desc:0x335c840) 8.65us CNR:OFF Dyn:1 FastMM:1 TID:0 NThr:24

Finished successfully. 0

Gennady_F_Intel · ‎04-04-2021

This issue is being closed and we will no longer respond to this thread. If you require additional assistance from Intel, please start a new thread. Any further interaction in this thread will be considered community only.