gn164
Beginner
183 Views

Floating Point Exception in MKL FFT from 18.0.4 onwards.

The code below raises a floating-point overflow exception, with the following backtrace:

Program received signal SIGFPE, Arithmetic exception.
0x0000000000d4a1e6 in mkl_dft_avx2_coDFTColTwid_Compact_Fwd_v_10_s ()
(gdb) backtrace
#0  0x0000000000d4a1e6 in mkl_dft_avx2_coDFTColTwid_Compact_Fwd_v_10_s ()
#1  0x00000000005e6e0d in compute_colbatch_fwd ()
#2  0x00000000004057dc in MAIN__ ()

 

The same code runs fine with an earlier version of MKL (11.1.1), or if the CNR mode is set to SSE4_2.

This seems to be specific to the AVX2 code path.

program mkl_test

   USE MKL_DFTI

  include  'mkl.fi'

  integer, parameter :: len_i = 1025
  integer, parameter :: len_j = 1920
  complex :: values_in(len_i * len_j)
  complex :: values_out(len_i * len_j)
  real :: temp_r, temp_i
  integer :: ieee_flags
  character*16 :: out
  integer :: i, j, unit,  status
  integer :: stride_in(0:1)    ! indexed 0..1 to match the assignments below
  integer :: stride_out(0:1)

  type(dfti_descriptor), pointer :: My_Desc1_Handle
!---------------------------------------------------------------------------------------------------  
  values_out(:) = cmplx(0,0)

  print*, "Started and reading in data..."
  open(newunit=unit, file='data2_CFFT.txt', status='old', action='read')   ! newunit assigns a valid unit number
  do j=1, len_j
    do i=1, len_i
      read(unit, '(2f15.8)') temp_r, temp_i
      values_in((j-1) * len_i + i) = cmplx(temp_r,temp_i)
    enddo
  enddo
  close(unit)
  print*, "Done reading data"

!  status = mkl_cbwr_set(MKL_CBWR_SSE4_2)
!  if(status .ne. MKL_CBWR_SUCCESS ) then
!     print *, 'unable to set the mkl environment'

!  endif

  i = ieee_flags('set', 'exception', 'overflow', out)

  stride_in(0)=0;
  stride_in(1)=1025;
  stride_out(0)=0;
  stride_out(1)=1025;

  status = DftiCreateDescriptor(My_Desc1_Handle,DFTI_SINGLE,DFTI_COMPLEX,1,1920)
  status = DftiSetValue(My_Desc1_Handle, DFTI_PLACEMENT, DFTI_NOT_INPLACE);
  status = DftiSetValue(My_Desc1_Handle, DFTI_NUMBER_OF_TRANSFORMS, 1025);
  status = DftiSetValue(My_Desc1_Handle, DFTI_INPUT_DISTANCE, 1);
  status = DftiSetValue(My_Desc1_Handle, DFTI_OUTPUT_DISTANCE, 1);
  status = DftiSetValue(My_Desc1_Handle, DFTI_INPUT_STRIDES, stride_in);
  status = DftiSetValue(My_Desc1_Handle, DFTI_OUTPUT_STRIDES, stride_out);
  status = DftiCommitDescriptor(My_Desc1_Handle);

  status = DftiComputeForward( My_Desc1_Handle, values_in, values_out )

  print*, "Finished successfully."

end program mkl_test

Compile as follows:

$INTEL_HOME/ifort -I$MKL_HOME/include/ cpbtrs.f90 -Wl,--start-group -Wl,-Bstatic -L$MKL_HOME_LIB/lib -lmkl_core -lmkl_intel_lp64 -lmkl_intel_thread -lmkl_lapack95_lp64 -liomp5 -Wl,--end-group

The input file data2_CFFT.txt, which is read and passed to the FFT function, is attached. The input values all look normal.
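One way to back up "looks normal" with numbers is to check that every value is finite and to compare the largest magnitude against the single-precision range. The following is a minimal sketch, not part of the original reproducer: the `sample` array and the `report` routine are hypothetical stand-ins, and in the reproducer the array passed in would be `values_in` after the read loop.

```fortran
program check_input_sketch
  use, intrinsic :: ieee_arithmetic
  implicit none
  complex :: sample(4)

  ! Hypothetical stand-in values; in the reproducer this would be values_in.
  sample = [cmplx(1.0, -2.0), cmplx(0.5, 0.25), cmplx(-3.0, 4.0), cmplx(0.0, 0.0)]
  call report(sample)

contains

  subroutine report(v)
    complex, intent(in) :: v(:)
    logical :: all_finite

    ! ieee_is_finite is elemental, so it can be applied to whole arrays.
    all_finite = all(ieee_is_finite(real(v)) .and. ieee_is_finite(aimag(v)))

    print *, 'all values finite: ', all_finite
    print *, 'largest magnitude: ', maxval(abs(v))
    print *, 'single huge():     ', huge(1.0)
  end subroutine report

end program check_input_sketch
```

If all values are finite and the largest magnitude is far below `huge(1.0)`, the input itself is unlikely to be the direct source of an overflow, although this check says nothing about intermediate values inside the FFT.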

MKL 2019 u2 seems to have the same issue.

I am using Debian 9 Linux on an Intel(R) Xeon(R) CPU E3-1240 v3. Could someone please take a look?

7 Replies
Gennady_F_Intel
Moderator

Actually, version 2019 u2 contains some security updates but no bug fixes. Could you please try the latest MKL 2019 u3?

gn164
Beginner

Greetings Gennady,

I do not have access to MKL 2019 u3 yet, and it may take a while before I can get it. Can you reproduce the problem on your side using 18.0.4 or 19.0.2? Is there a workaround you could suggest that does not require changing versions?


Gennady_F_Intel
Moderator

Yes, we see the same problem with version 2019.3. The problem will be investigated.

As a workaround, you may try switching off the ieee_flags call and checking whether the code then runs.
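A narrower variant of this workaround is to leave overflow trapping on for the rest of the program and switch it off only around the FFT call. This is sketched here as an assumption, not as anything confirmed by Intel. The sketch uses the standard Fortran `ieee_exceptions` module instead of the `ieee_flags` call from the reproducer, and the multiplication is only a placeholder for `DftiComputeForward`. Whether masking the trap this way avoids the SIGFPE in the affected MKL versions is untested. With ifort, the IEEE modules may need a strict floating-point model (for example `-fp-model strict`) to behave reliably.

```fortran
program fpe_guard_sketch
  use, intrinsic :: ieee_exceptions
  implicit none
  logical :: was_halting
  real, volatile :: big   ! volatile so the overflow happens at run time
  real :: r

  if (.not. ieee_support_halting(ieee_overflow)) then
    print *, 'halting control for overflow is not supported here'
    stop
  end if

  ! Stand-in for the ieee_flags call in the reproducer: trap on overflow.
  call ieee_set_halting_mode(ieee_overflow, .true.)

  ! Save the current mode, then disable the trap around the sensitive call.
  call ieee_get_halting_mode(ieee_overflow, was_halting)
  call ieee_set_halting_mode(ieee_overflow, .false.)

  big = huge(1.0)
  r = big * 2.0            ! placeholder for the DftiComputeForward call

  ! Clear the overflow flag the placeholder raised, then restore trapping.
  call ieee_set_flag(ieee_overflow, .false.)
  call ieee_set_halting_mode(ieee_overflow, was_halting)

  print *, 'overflow trapping restored: ', was_halting
end program fpe_guard_sketch
```

Saving and restoring the halting mode, rather than unconditionally re-enabling it, keeps the guard correct even if the caller had trapping disabled to begin with.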

gn164
Beginner

Greetings Gennady,

Thanks. If I switch off the FPE, it runs fine. It would be great if this function could be fixed so that it does not raise an FPE.


Gennady_F_Intel
Moderator

Hello, the fix for this issue is available in MKL v.2019.5, which we released last week. Please try this update and let us know if the problem still exists on your side. Thanks.

gn164
Beginner

Hi Gennady,

Thank you for the fix. I have tried version 2019.5 and can confirm that it works fine.

Can you share any details on why the FPE was raised in this case? Is it data dependent?

Thanks

Kirill_V_Intel
Employee

Hello gn164,

Yes, in addition to other things, the behavior was data specific (now it is fixed).

Best,
Kirill
