topic About zero padding in time in Intel® Integrated Performance Primitives

Efficient multiple FFTs

Noam_Z_1 — Mon, 18 Mar 2013 14:31:16 GMT

Hello,

I want to use the FFT to filter a large 2D image (~1000x1000). I want to filter the image several times, each time with a different kernel (~21x21).

To benefit from using the FFT, I can calculate the image's FFT once, then for each kernel only calculate the kernel's FFT, multiply the two, and apply the inverse FFT.

The problem is that the image is much larger than the kernel, and zero padding the kernel to the image size looks very wasteful. Is there an FFT implementation that takes advantage of the zeros and computes the larger FFT without the explicit zero padding?

Thank you in advance,

Noam.

Thomas_Jensen1 — Mon, 18 Mar 2013 18:09:57 GMT

About zero padding in the time domain: if src width = 1000 and kernel width = 21, then the padded width needed for a full linear convolution (so that the circular convolution does not wrap around at the edges) is 1000+21-1 = 1020.
If you just zero pad to 1024x1024, then you benefit from FFT (faster at powers of two, and requires powers of two).

Using DFT: faster at powers of two, and does not require powers of two.

However, if your ~1000 actually is 1024, then 1024+21-1 = 1044, and this is not efficient for FFT.
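The size arithmetic above can be checked with a few lines of Python. This is only a sketch of the padding rule, not IPP code; the helper names `linear_conv_size` and `next_power_of_two` are made up for illustration.

```python
def linear_conv_size(image_len: int, kernel_len: int) -> int:
    """Smallest transform length for which circular convolution
    gives the same result as a full linear convolution."""
    return image_len + kernel_len - 1


def next_power_of_two(n: int) -> int:
    """Smallest power of two that is >= n (for n >= 1)."""
    return 1 << (n - 1).bit_length()


for width in (1000, 1024):
    needed = linear_conv_size(width, 21)
    print(f"width={width}: pad to at least {needed}, "
          f"next power of two is {next_power_of_two(needed)}")
```

For width 1000 the required 1020 fits in 1024, while for width 1024 the required 1044 only fits in 2048, which is why a non-power-of-two DFT of length 1044 is the alternative suggested here.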
I consider IPP's DFT to be very fast, so you could just zero pad the image to 1044, DFT it once, then loop:
- Make kernel (in time domain, then zero pad + FFT/DFT to 1044)
- Multiply
- Inverse DFT
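The loop above can be sketched in 1D with plain Python. This is a toy illustration under stated assumptions, not IPP code: it uses a naive O(N²) DFT instead of a fast transform, tiny made-up data, and hypothetical helper names (`dft`, `pad`, `direct_conv`). The point is only the structure: the signal spectrum is computed once and reused for every kernel.

```python
import cmath


def dft(x, inverse=False):
    """Naive DFT (or inverse DFT) of a sequence; O(N^2), for illustration only."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        s = sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
                for t in range(n))
        out.append(s / n if inverse else s)
    return out


def pad(x, n):
    """Zero pad x on the right to length n."""
    return list(x) + [0.0] * (n - len(x))


def direct_conv(signal, kernel):
    """Reference full linear convolution in the time domain."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out


signal = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
kernels = [[1.0, 0.0, -1.0], [0.25, 0.5, 0.25]]

# Padded length: signal length + kernel length - 1 (all kernels have length 3).
n = len(signal) + len(kernels[0]) - 1
signal_spectrum = dft(pad(signal, n))  # computed once, reused below

results = []
for kernel in kernels:
    kernel_spectrum = dft(pad(kernel, n))                 # make kernel + DFT
    product = [a * b for a, b in zip(signal_spectrum, kernel_spectrum)]  # multiply
    results.append([v.real for v in dft(product, inverse=True)])         # inverse DFT
```

Each entry of `results` should agree with `direct_conv(signal, kernel)` up to rounding, since the padded length is large enough that nothing wraps around.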

You might also consider not thinking in terms of a "kernel" at all, since that is a time domain thing.
That said, IPP actually implements (time domain) convolution using DFT if the image dimension is above some threshold. Your 1000 is above that value...

So, you should make a test only using time domain convolutions to see if that is fast enough. Make sure to allow ippiConvolve to use multiple threads.

Thomas_Jensen1 — Mon, 18 Mar 2013 18:10:53 GMT

If you would leave some feedback here, we'd all appreciate it...

Noam_Z_1 — Tue, 19 Mar 2013 07:12:52 GMT

Thank you for your response.

What I meant was that it seems redundant to zero pad the small kernel (or patch/template, whichever you prefer to call it) and do a DFT on the padded kernel, since it will involve a lot of multiplications by zero. I wanted to know if IPP has some DFT function that can exploit the fact that I want to do a 1024-sized FFT on an actual 21-sized image.
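To make the idea in this question concrete, here is a minimal Python sketch of a 2D DFT that skips the zeros of a padded kernel. It is not an IPP function, and nothing here says whether IPP offers such a function; the names `dft_1d_padded`, `padded_kernel_dft_2d` and `full_dft_2d` are hypothetical. It uses the row-column decomposition: only the rows that contain kernel samples are transformed, and every sum runs over the nonzero samples only.

```python
import cmath


def dft_1d_padded(x, n):
    """Length-n DFT of x as if zero padded to n, summing only over len(x) samples."""
    m = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(m))
            for k in range(n)]


def padded_kernel_dft_2d(kernel, n):
    """n x n DFT of a small square kernel placed in the top-left corner of zeros."""
    m = len(kernel)
    # Row pass: only the m kernel rows are transformed; the other n - m rows
    # are all zero, so their transforms are zero and are never computed.
    rows = [dft_1d_padded(row, n) for row in kernel]          # m x n
    # Column pass: each of the n columns has only m nonzero entries.
    cols = [dft_1d_padded([rows[r][v] for r in range(m)], n)  # cols[v][u]
            for v in range(n)]
    return [[cols[v][u] for v in range(n)] for u in range(n)]


def full_dft_2d(image):
    """Reference: naive 2D DFT over every sample of a square image."""
    n = len(image)
    return [[sum(image[r][c] * cmath.exp(-2j * cmath.pi * (u * r + v * c) / n)
                 for r in range(n) for c in range(n))
             for v in range(n)]
            for u in range(n)]


kernel = [[1.0, 2.0], [3.0, 4.0]]
n = 8
padded = [[kernel[r][c] if r < 2 and c < 2 else 0.0 for c in range(n)]
          for r in range(n)]
fast = padded_kernel_dft_2d(kernel, n)
reference = full_dft_2d(padded)
```

With fast 1D transforms in place of the naive sums, the same structure would need about 21 row transforms plus 1024 column transforms instead of 2 x 1024, so the saving from skipping zero rows is roughly a factor of two rather than a factor of (1024/21)².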