SSE2 based FFT in Intel MKL

dtrumer · ‎12-29-2003

Hello,

While reading the MKL description, I could not understand whether the MKL FFT is based on SSE2 or not. This can affect performance dramatically.

Thanks,

Dror

TimP · ‎12-31-2003

Most likely, the double-precision FFT functions in the p4 library use SSE2. You could test the p4 version against the generic-architecture version on your application.

Wendy_Doerner__Intel · ‎12-31-2003

The Intel Math Kernel Library Reference Manual, which is the doc/mklman61.pdf file in the Intel MKL 6.1 installation, states on page 1-4 that the DFTs are optimized to take advantage of processor-specific SIMD extensions. To see the speedup for your application, the suggestion to run it with the generic processor code and compare that against the code for your processor is a good one. Note that the DFT functions, as opposed to the FFTs, are the ones that continue to be optimized for new processors.
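One way to make that comparison concrete is a small timing harness. The sketch below is only an illustration in Python, not MKL code: `best_time` and `speedup` are hypothetical helper names, and the two lambdas are stand-in workloads that you would replace with calls into your application built against the generic and the processor-specific library.

```python
import time

def best_time(func, repeats=5, inner=100):
    """Return the best observed per-call wall-clock time of func(), in seconds."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        for _ in range(inner):
            func()
        elapsed = (time.perf_counter() - start) / inner
        best = min(best, elapsed)  # keep the least-disturbed measurement
    return best

def speedup(baseline_func, optimized_func, repeats=5, inner=100):
    """Ratio of baseline time to optimized time; above 1 means optimized is faster."""
    return best_time(baseline_func, repeats, inner) / best_time(optimized_func, repeats, inner)

# Stand-in workloads (assumptions for illustration only): replace these with
# the same transform run against the generic and the p4 library builds.
generic_build = lambda: sum(range(100000))
tuned_build = lambda: sum(range(500))

print("speedup: %.1fx" % speedup(generic_build, tuned_build, repeats=3, inner=20))
```

Taking the minimum over several repeats reduces the influence of other activity on the machine, which matters when the difference between the two builds is small.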

Intel_C_Intel · ‎01-08-2004

The power-of-two FFTs do use the SIMD hardware, for both single-precision and double-precision transforms. Up until now the mixed-radix transforms (DFT) have not used the SIMD hardware for non-power-of-two lengths, leading to a significant difference in performance between, say, a 1024-point transform and a 900-point or 1152-point transform. With the 7.0 beta software, there are dramatic improvements for those non-power-of-two transform lengths.
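To see why 900 and 1152 fall into the mixed-radix case while 1024 does not, it helps to factor the lengths. The following is a minimal Python sketch, independent of MKL; `is_power_of_two` and `radix_factors` are hypothetical helper names introduced here for illustration.

```python
def is_power_of_two(n):
    """True if n is a positive power of two (a single bit set)."""
    return n > 0 and n & (n - 1) == 0

def radix_factors(n):
    """Return the prime factorization of n as a {prime: exponent} dict."""
    factors = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:  # whatever remains is itself a prime factor
        factors[n] = factors.get(n, 0) + 1
    return factors

# The three transform lengths mentioned in the post above.
for length in (1024, 900, 1152):
    print(length, is_power_of_two(length), radix_factors(length))
```

Only 1024 is a pure power of two (2^10). 900 factors as 2^2 · 3^2 · 5^2 and 1152 as 2^7 · 3^2, so both need radix-3 (and for 900, radix-5) stages, which is the case the mixed-radix DFT code handles.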