Intel® Integrated Performance Primitives
Deliberate problems developing high-performance vision, signal, security, and storage applications.

about FFT

jg-kuk
Beginner
887 Views
Hi,
I'm implementing a filtering using FFT. ( but it's not in a conventional way)
Comparing the conventional method with my algorithm, I had a guestion about FFT in IPP.
According to my simulation results, N point complex FFT has the same pBuffer size asa 2N point real FFT. To be specific, two functions are ippsFFTFwd_CToC_32fc and ippsFFTFwd_RToPerm_32f. But in all lengths, N point complex FFT is faster than 2N point realFFT in spite of same pBuffer sizes.
I want to know the reason for that..

Message Edited by jg-kuk@ispl.snu.ac.kr on 04-22-200611:08 AM

0 Kudos
5 Replies
Vladimir_Dudnik
Employee
887 Views

Hello,

according our expert, amount of processed data in functions ippsFFTFwd_CToC_32fcand ippsFFTFwd_RToPerm is the same, but in the last function more calculations are needed. It is the reason why this function work slower for the same buffer size

Regards,
Vladimir

0 Kudos
spiegalkuk
Beginner
887 Views
thanks.
Ithink there are more muls and adds in the 2N real FFT than N cplx FFT.
Are thereanother significant calculations make the N cplx FFT more faster than 2N real FFT except for mul and add?
and if it is allowed, could I know the methodforreal fft implementation in IPP? for instance, radix-2 butterfly structure, split-radix fft method and etc..
thank for your help.
0 Kudos
Vladimir_Dudnik
Employee
887 Views

Algorithm of 2N real FFT in ipps consist from to stages:
-calculation of N complex FFT (usually radix-4 used)
-calculation of recombination (like radix-2 iteration)

Vladimir

0 Kudos
spiegalkuk
Beginner
887 Views
I'm sorry but, I can't understand that 2N point real FFT consist of N point cplx FFT..Doesn't it needtwo N point cplx FFTs?
and I can't follow that N point cplx FFT in IPPs is usually implemented by radix-4 algorithm..single radix-4 algorithm only operates when FFT length is powers of 4...
Am I missing something?..
Sorry to bother,last question..
Among 2N point real FFTand N point cplx FFT, which onedoesmakemore cashe miss under same condition(same cashe size)?
Thank you for your help.

Message Edited by SpiegalKuk on 05-12-200610:24 AM

Message Edited by SpiegalKuk on 05-12-200610:32 AM

Message Edited by SpiegalKuk on 05-12-200610:32 AM

0 Kudos
Vladimir_Dudnik
Employee
887 Views

Probably I was not clear enough, so will try to explain it again.

In case of N point complex FFT we use radix-4 for N=4^k and for N=2*4^k additional iteration of radix-2 required.

2N point real FFT implemented as N point complex FFT and recombination.

So, 2N point real FFT has more cache misses because it consist from two parts (every with own cache misses): N point complex FFT and recombination.

Regards,
Vladimir

0 Kudos
Reply