Hello,
I've come across a strange error where DftiCommitDescriptor fails under Intel-2024 when the program is built with a certain compilation syntax.
This compilation syntax worked with Intel-2021, but now leads to a runtime failure when built with the Intel-2024 software stack.
Please see the following reproducer:
```fortran90
! fft_reproducer.f90
```
Compiled with the following Makefile:
```make
```
If I do:
```
module load intel/intel-2021
export FC=ifort
make clean; make OLD=x; ./fft_reproducer # compiles, runs successfully
make clean; make NEW=x; ./fft_reproducer # compiles, runs successfully
```
However, if I do:
```
module load intel/intel-2024
export FC=ifx
make clean; make OLD=x; ./fft_reproducer # compiles, fails at DftiCommitDescriptor with status 3 (invalid configuration)
make clean; make NEW=x; ./fft_reproducer # compiles, runs successfully
```
It's also worth mentioning that this reproducer is a small section of my code, which relies heavily on FFTs. Although the full code passes DftiCommitDescriptor with the "NEW" style of compilation under Intel-2024, the final result is incorrect (it looks like indexing errors).
However, when I revert back to using Intel-2021 and either the "OLD" or "NEW" styles of compilation, the DftiCommitDescriptor passes and the final result is correct. I'm certain that it is the FFT routine that is at fault because I replaced MKL FFT with FFTW and the final result is also correct.
If someone can please explain this behaviour, I'd appreciate it.
Edit:
`module load intel/intel-2021` will point MKLROOT to `.../oneapi-2021.update.4/mkl/2021.4.0`
`module load intel/intel-2024` will point MKLROOT to `.../oneapi-2024.update.1/mkl/2024.1`
Please update your reproducer
Now it should work for Intel-2024. Please report back with your updates.
@gqchenATintel, thank you very much - this fixed my issue:
```
$ module load intel/intel-2024; export FC=ifx; make clean; make OLD=x; ./fft_reproducer;
rm -f fft_reproducer
ifx -fpp -DMKLI8 -I.../oneapi-2024.update.1/mkl/2024.1/include/intel64/ilp64 -o fft_reproducer fft_reproducer.f90 -qopenmp -Wl,--start-group .../oneapi-2024.update.1/mkl/2024.1/lib/intel64/libmkl_blas95_ilp64.a .../oneapi-2024.update.1/mkl/2024.1/lib/intel64/libmkl_lapack95_ilp64.a .../oneapi-2024.update.1/mkl/2024.1/lib/intel64/libmkl_intel_ilp64.a .../oneapi-2024.update.1/mkl/2024.1/lib/intel64/libmkl_core.a .../oneapi-2024.update.1/mkl/2024.1/lib/intel64/libmkl_intel_thread.a -Wl,--end-group
len= 256
num= 10
dist_in= 258
dist_out= 129
strides_in= 0 1
strides_out= 0 1
My FFT setup complete!
```
And:
```
$ module load intel/intel-2024; export FC=ifx; make clean; make NEW=x; ./fft_reproducer;
rm -f fft_reproducer
ifx -fpp -DMKLI8 -I.../oneapi-2024.update.1/mkl/2024.1/include/intel64/ilp64 -o fft_reproducer fft_reproducer.f90 -qmkl-ilp64=parallel
len= 256
num= 10
dist_in= 258
dist_out= 129
strides_in= 0 1
strides_out= 0 1
My FFT setup complete!
```
@gqchenATintel is there any way to keep the original behaviour? Otherwise I'd presumably have to change the indexing in all of the functions that use the complex-valued output array, which I'd prefer to avoid.
Unfortunately, the answer is no, because of the output type change from complex_real to complex_complex (since oneAPI 2023.0).
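To make the unit change concrete: the `dist_in= 258` and `dist_out= 129` values printed earlier in this thread for `len= 256` are consistent with the input distance being counted in real elements and the output distance in complex elements. The sketch below only does that arithmetic and makes no MKL calls. The names `dist_real` and `dist_cmplx` are illustrative and not part of the MKL API.

```fortran90
program distance_units
  implicit none
  integer :: len, dist_real, dist_cmplx

  len = 256
  ! A real sequence of length len has len/2+1 independent complex coefficients.
  dist_cmplx = len/2 + 1
  ! The same storage counted in real elements (one real and one imaginary part each).
  dist_real = 2*dist_cmplx

  print *, "distance in real elements:   ", dist_real    ! 258
  print *, "distance in complex elements:", dist_cmplx   ! 129
end program distance_units
```

If that reading is right, code that indexes the output as a complex array would step by `dist_cmplx` per transform, while code that indexes it as a real array would step by `dist_real`.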
@gqchenATintel: I managed to get the same behaviour in both APIs by including the following:
```fortran90
status = DftiSetValue(plan, DFTI_PLACEMENT, DFTI_INPLACE)
status = DftiSetValue(plan, DFTI_CONJUGATE_EVEN_STORAGE, DFTI_COMPLEX_REAL)
status = DftiSetValue(plan, DFTI_PACKED_FORMAT, DFTI_CCS_FORMAT)
```
Per this example, compiled using your suggested `-qmkl-ilp64=parallel`:
```fortran90
program fft_reproducer
  use MKL_DFTI
  implicit none
  type(DFTI_DESCRIPTOR), pointer :: plan
  integer ier
#ifdef MKLI8
  integer(kind=8) :: dim, len, num, dist_in, dist_out, strides_in(2), strides_out(2), status
  integer(kind=8) :: i, j
#else
  integer(kind=4) :: dim, len, num, dist_in, dist_out, strides_in(2), strides_out(2), status
  integer(kind=4) :: i, j
#endif
  real(kind=4), allocatable :: x(:), y(:)
  real(kind=4) :: max_diff, threshold

  ! Initialize parameters
  num = 2
  dim = 1
  len = 16
  dist_in = 2*(len/2+1) ! 2 * (len / 2 + 1)
  dist_out = dist_in
  strides_in = (/0, 1/)
  strides_out = (/0, 1/)

  ! Allocate and initialize test x
  allocate(x(dist_in*num))
  allocate(y(dist_in*num))

  ! Initialize with a simple sinusoidal pattern
  do j = 1, num
    do i = 1, len
      x((j-1)*dist_in + i) = sin(2.0 * 3.14159 * real(i-1) / real(len))
    end do
    do i = len+1, dist_in
      x((j-1)*dist_in + i) = 0.0
    end do
  end do

  ! Save original x for comparison
  y(:) = x(:)

  print *, "Parameters:"
  print *, "len=", len
  print *, "num=", num
  print *, "dist_in=", dist_in
  print *, "dist_out=", dist_out
  print *, "strides_in=", strides_in
  print *, "strides_out=", strides_out

  ! Create descriptor for 1D real-to-complex FFT
  status = DftiCreateDescriptor(plan, DFTI_SINGLE, DFTI_REAL, dim, len)
  if (status /= 0) then
    print *, "Error in DftiCreateDescriptor:", status
    call exit(status)
  endif

  ! Set FFT options
  status = DftiSetValue(plan, DFTI_CONJUGATE_EVEN_STORAGE, DFTI_COMPLEX_REAL)
  status = DftiSetValue(plan, DFTI_PACKED_FORMAT, DFTI_CCS_FORMAT)
  status = DftiSetValue(plan, DFTI_PLACEMENT, DFTI_INPLACE)
  status = DftiSetValue(plan, DFTI_NUMBER_OF_TRANSFORMS, num)
  status = DftiSetValue(plan, DFTI_INPUT_DISTANCE, dist_in)
  status = DftiSetValue(plan, DFTI_OUTPUT_DISTANCE, dist_out)
  status = DftiSetValue(plan, DFTI_INPUT_STRIDES, strides_in)
  status = DftiSetValue(plan, DFTI_OUTPUT_STRIDES, strides_out)

  ! Scale factor for backward transform
  status = DftiSetValue(plan, DFTI_FORWARD_SCALE, real(1.0))
  status = DftiSetValue(plan, DFTI_BACKWARD_SCALE, 1.0/real(len))

  ! Commit the descriptor
  status = DftiCommitDescriptor(plan)
  if (status /= 0) then
    print *, "Error in DftiCommitDescriptor:", status
    call exit(status)
  endif

  ! Perform forward transform
  print *, "Performing forward FFT..."
  status = DftiComputeForward(plan, x)
  if (status /= 0) then
    print *, "Error in forward transform:", status
    call exit(status)
  endif

  ! Perform backward transform
  print *, "Performing backward FFT..."
  status = DftiComputeBackward(plan, x)
  if (status /= 0) then
    print *, "Error in backward transform:", status
    call exit(status)
  endif

  ! Verify results
  max_diff = 0.0
  threshold = 1.0e-5 ! Adjust based on your precision requirements
  print *, "=================================="
  do j = 1, num
    do i = 1, len
      print *, x((j-1)*dist_in + i), y((j-1)*dist_in + i)
      ! Compare with the same dist_in-based offset used to lay out the data
      max_diff = max(max_diff, abs(x((j-1)*dist_in + i) - y((j-1)*dist_in + i)))
    end do
    print *, "=================================="
  end do
  print *, "Maximum difference between original and reconstructed x:", max_diff
  if (max_diff < threshold) then
    print *, "FFT validation PASSED!"
  else
    print *, "FFT validation FAILED! Difference exceeds threshold of", threshold
  end if

  ! Deallocate resources
  status = DftiFreeDescriptor(plan)
  deallocate(x)
  deallocate(y)
end program fft_reproducer
```
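One note on the example above: it stores the return value of every `DftiSetValue` call but only tests the status after `DftiCreateDescriptor`, `DftiCommitDescriptor`, and the compute calls, so a rejected setting would only show up at commit time. A minimal sketch of a helper that checks each status is below. `check_dfti` and its `label` argument are hypothetical names, `DftiErrorMessage` comes from the `MKL_DFTI` module, and the `MKLI8` switch is assumed to follow the same `-DMKLI8` convention as the reproducer.

```fortran90
! Hypothetical helper, not part of MKL: stop with a readable message
! whenever a DFTI call returns a nonzero status.
subroutine check_dfti(status, label)
  use MKL_DFTI
  implicit none
#ifdef MKLI8
  integer(kind=8), intent(in) :: status
#else
  integer(kind=4), intent(in) :: status
#endif
  character(len=*), intent(in) :: label

  if (status /= 0) then
    ! DftiErrorMessage turns the numeric status into a description
    print *, "Error in ", label, ": ", trim(DftiErrorMessage(status))
    error stop 1
  end if
end subroutine check_dfti
```

It would be called after each configuration step, for example:

```fortran90
status = DftiSetValue(plan, DFTI_PLACEMENT, DFTI_INPLACE)
call check_dfti(status, "DftiSetValue(DFTI_PLACEMENT)")
```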
It seems like you've pinpointed an interesting issue with the Intel-2024 stack and MKL, especially given that the code works fine with Intel-2021 and also with FFTW as an alternative. It may be worth double-checking MKL’s updated documentation to see whether Intel-2024 has specific handling of, or changed support for, older configurations in the DFTI descriptor. You might also consider reaching out to Intel’s support to see if there’s a compatibility issue or if any known issues exist in the newer stack.