Message Edited by jg-kuk@ispl.snu.ac.kr on 04-22-2006 11:08 AM
Hello,
According to our expert, the functions ippsFFTFwd_CToC_32fc and ippsFFTFwd_RToPerm process the same amount of data, but the latter needs more calculations. That is why it runs slower for the same buffer size.
Regards,
Vladimir
The algorithm for a 2N-point real FFT in ipps consists of two stages:
- calculation of an N-point complex FFT (usually radix-4 is used)
- calculation of the recombination (similar to a radix-2 iteration)
Vladimir
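To make the two stages above concrete, here is a minimal sketch in plain Python. It is not the ipps implementation: the function names (`complex_fft`, `real_fft`, `naive_dft`) are made up for illustration, a simple recursive radix-2 FFT stands in for the radix-4 kernel mentioned above, and the output is the plain list of bins 0..N rather than the Perm packed format.

```python
import cmath


def complex_fft(z):
    """Recursive radix-2 FFT (stand-in for a radix-4 kernel).

    len(z) must be a power of two.
    """
    n = len(z)
    if n == 1:
        return list(z)
    even = complex_fft(z[0::2])
    odd = complex_fft(z[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def real_fft(x):
    """Bins 0..N of the DFT of a real signal x of length 2N (N a power of two)."""
    n = len(x) // 2
    # Stage 1: pack even samples into the real part and odd samples into
    # the imaginary part, then run one N-point complex FFT.
    z = [complex(x[2 * i], x[2 * i + 1]) for i in range(n)]
    zf = complex_fft(z)
    # Stage 2: recombination. Separate the spectra of the even and odd
    # samples, then merge them with one radix-2-like butterfly pass.
    out = []
    for k in range(n):
        zc = zf[(n - k) % n].conjugate()
        e = (zf[k] + zc) / 2          # spectrum of even samples
        o = (zf[k] - zc) / 2j         # spectrum of odd samples
        out.append(e + cmath.exp(-1j * cmath.pi * k / n) * o)
    # Nyquist bin: E[0] - O[0], both of which are real.
    out.append(complex(zf[0].real - zf[0].imag, 0.0))
    return out


def naive_dft(x, bins):
    """Direct O(n^2) DFT, used only to check the result."""
    n = len(x)
    result = []
    for k in range(bins):
        acc = 0j
        for i in range(n):
            acc += x[i] * cmath.exp(-2j * cmath.pi * k * i / n)
        result.append(acc)
    return result
```

Calling `real_fft(x)` on an 8-sample real list and comparing it with `naive_dft(x, 5)` should give the same five bins up to rounding. The recombination loop touches every element of the complex FFT output once more, which is the extra work the real transform does on the same amount of data.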
Message Edited by SpiegalKuk on 05-12-2006 10:24 AM
Message Edited by SpiegalKuk on 05-12-2006 10:32 AM
I was probably not clear enough, so I will try to explain it again.
For an N-point complex FFT we use radix-4 when N = 4^k; when N = 2*4^k, an additional radix-2 iteration is required.
A 2N-point real FFT is implemented as an N-point complex FFT followed by a recombination.
So a 2N-point real FFT has more cache misses because it consists of two parts, each with its own cache misses: the N-point complex FFT and the recombination.
Regards,
Vladimir
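The pass structure described above can be written down as a rough counting model. This is only a sketch of the explanation in this thread, not of how ipps actually schedules its work; the helper names `radix_plan` and `real_fft_passes` are hypothetical, and "one pass over the buffer" is used as a crude proxy for cache traffic.

```python
def radix_plan(n):
    """Return (radix4_passes, radix2_passes) for an N-point complex FFT.

    Assumes N is a power of two: N = 4^k needs k radix-4 passes,
    N = 2 * 4^k needs k radix-4 passes plus one radix-2 pass.
    """
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    log2n = n.bit_length() - 1
    return log2n // 2, log2n % 2


def real_fft_passes(two_n):
    """Total passes for a 2N-point real FFT in this model:
    the N-point complex FFT passes plus one recombination pass."""
    r4, r2 = radix_plan(two_n // 2)
    return r4 + r2 + 1


def complex_fft_passes(n):
    """Total passes for an N-point complex FFT in this model."""
    r4, r2 = radix_plan(n)
    return r4 + r2
```

For the same buffer size, for example 1024 complex values versus 2048 real values, `complex_fft_passes(1024)` gives 5 while `real_fft_passes(2048)` gives 6. The extra recombination pass over the same data is where the additional cache misses in the explanation above come from.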
