Gaussian filter in frequency domain (FFT) question

cks2k2 · ‎01-07-2013

I'm trying to do gaussian filtering of image in frequency domain (FFT) w/ IPP.
For image of size MxN and gauss kernel UxV, the steps:
1. zero-pad kernel to (M+U-1)x(N+V-1)
2. Fwd FFT kernel
3. zero-pad img to (M+U-1)x(N+V-1)
4. Fwd FFT img
5. multiply the 2 FFTs in float-complex format
4. Inv FFT the multiplied result
5. Trim edges

My resulting image looks a little strange. The best result I have is when I border extend both kernel and img but keep both left border width and top border height to 0.
Can the experts have a look to see what I'm doing wrong? NOTE: The test code below purposely uses a small kernel. Real code would be using very large kernels.

[cpp]

void foo(Ipp8u* img, int lineStep, int width, int height)

{

IppStatus status;

// assume kernel is 5x5 with sigma = 1

int kernelStep;

Ipp32f* kernel = ippiMalloc_32f_C1(5, 5, &kernelStep);

IppiSize kernelRoi = { 5, 5 };

kernel[0] = 0.0030f; kernel[1] = 0.0133f; kernel[2] = 0.0219f; kernel[3] = 0.0133f; kernel[4] = 0.0030f;

kernel[8] = 0.0133f; kernel[9] = 0.0596f; kernel[10] = 0.0983f; kernel[11] = 0.0596f; kernel[12] = 0.0133f;

kernel[16] = 0.0219f; kernel[17] = 0.0983f; kernel[18] = 0.1621f; kernel[19] = 0.0983f; kernel[20] = 0.0219f;

kernel[24] = 0.0133f; kernel[25] = 0.0596f; kernel[26] = 0.0983f; kernel[27] = 0.0596f; kernel[28] = 0.0133f;

kernel[32] = 0.0030f; kernel[33] = 0.0133f; kernel[34] = 0.0219f; kernel[35] = 0.0133f; kernel[36] = 0.0030f;

// pad kernel

IppiSize padRoi = { width + 4, height + 4 };

// 2d fft padded kernel

IppiDFTSpec_C_32fc* spec;

status = ippiDFTInitAlloc_C_32fc(&spec, padRoi, IPP_FFT_DIV_INV_BY_N, ippAlgHintFast);

// extend border by 4 pixels on each dimension

int extKernelStep;

Ipp32f* extKernel = ippiMalloc_32f_C1(width + 4, height + 4, &extKernelStep);

status = ippiCopyConstBorder_32f_C1R(kernel, kernelStep, kernelRoi, extKernel, extKernelStep, padRoi, 0, 0, 0.f);

int fftKernelStep;

Ipp32fc* fftKernel = ippiMalloc_32fc_C1(width + 4, height + 4, &fftKernelStep);

// copy over, then Fwd FFT it

int kernelCorrectedStep = extKernelStep/4;

int kernelFcStep = fftKernelStep/8;

for(int h = 0; h < height + 4; ++h)

{

Ipp32f* currentKernelRow = extKernel + (h * kernelCorrectedStep);

Ipp32fc* currentFFTKernelRow = fftKernel + (h * kernelFcStep);

for(int w = 0; w < width + 4; ++w)

{

Ipp32f* currentKernel = currentKernelRow + w;

Ipp32fc* currentFFTKernel = currentFFTKernelRow + w;

(*currentFFTKernel).re = *currentKernel;

(*currentFFTKernel).im = 0.f;

}

// FWD kernel

status = ippiDFTFwd_CToC_32fc_C1IR(fftKernel, fftKernelStep, spec, NULL);

// convert img to 32f

IppiSize imgRoi = { width, height };

int imgFStep;

Ipp32f* imgF = ippiMalloc_32f_C1(width, height, &imgFStep);

status = ippiConvert_8u32f_C1R(img, lineStep, imgF, imgFStep, imgRoi);

// extend

int extImgStep;

Ipp32f* extImg = ippiMalloc_32f_C1(width + 4, height + 4, &extImgStep);

//status = ippiCopyReplicateBorder_32f_C1R(imgF, imgFStep, imgRoi, extImg, extImgStep, padRoi, 0, 0);

status = ippiCopyConstBorder_32f_C1R(imgF, imgFStep, imgRoi, extImg, extImgStep, padRoi, 0, 0, 0.f);

int fftImgStep;

Ipp32fc* fftImg = ippiMalloc_32fc_C1(width + 4, height + 4, &fftImgStep);

int imgCorrectedStep = extImgStep/4;

int imgFcStep = fftImgStep/8;

for(int h = 0; h < height + 4; ++h)

{

Ipp32f* currentImgRow = extImg + (h * imgCorrectedStep);

Ipp32fc* currentFFTImgRow = fftImg + (h * imgFcStep);

for(int w = 0; w < width + 4; ++w)

{

Ipp32f* currentImg = currentImgRow + w;

Ipp32fc* currentFFTImg = currentFFTImgRow + w;

(*currentFFTImg).re = *currentImg;

(*currentFFTImg).im = 0.f;

}

// FWD img

status = ippiDFTFwd_CToC_32fc_C1IR(fftImg, fftImgStep, spec, NULL);

// now multiply both

status = ippiMul_32fc_C1IR(fftKernel, fftKernelStep, fftImg, fftImgStep, padRoi);

// inv FFT

status = ippiDFTInv_CToC_32fc_C1IR(fftImg, fftImgStep, spec, NULL);

// copy back out to extImg

int fftCStep = fftImgStep/8;

int cImgStep = extImgStep/4;

for(int h = 0; h < height + 4; ++h)

{

Ipp32fc* currentFFTRow = fftImg + (h * fftCStep);

Ipp32f* currentRow = extImg + (h * cImgStep);

for(int w = 0; w < width + 4; ++w)

{

Ipp32fc* currentFFT = currentFFTRow + w;

Ipp32f* currentImg = currentRow + w;

*currentImg = (*currentFFT).re;

}

Ipp32f* startPt = extImg + (2 * cImgStep) + 2; // looks better?

//Ipp32f* startPt = extImg; // shifted???

status = ippiConvert_32f8u_C1R(startPt, extImgStep, img, lineStep, imgRoi, ippRndNear);

}

[/cpp]

SergeyKostrov · ‎01-07-2013

Could you upload your source and resulting images ( in a raw-format or as text-images )?

cks2k2 · ‎01-08-2013

Hi Sergey, I uploaded the source code (some changes made i.e. using 5x5 kernel) and the test/result images. beach.bmp is the test image (1 channel). beach_blur.bmp is the result - the borders seem incorrect. I also tried using ippiFilterGauss with 5x5 kernel and the result is quite different from blurring using FFT.

SergeyKostrov · ‎01-08-2013

Hi, Thanks for these uploads! >>I uploaded the source code (some changes made i.e. using 5x5 kernel) and the test/result images. >>beach.bmp is the test image (1 channel). >>beach_blur.bmp is the result - the borders seem incorrect. To be honest, I don't see any artifacts, or defects, in the resulting image after filtering. I checked attributes for both bmp-files and I see that sizes are 500x333. So, both numbers are not power of 2 and I would verify a zero-padding steps #1 and #3: (M+U-1)x(N+V-1). What values for M, N, U and V did you use during your processing?

cks2k2 · ‎01-08-2013

Hi, for M and N I just use the image width and height (500, 333). For U and V I use kernel width and height (5, 5). With DFT isn't it not necessary to have numbers which are power of 2?

SergeyKostrov · ‎01-09-2013

>>...With DFT isn't it not necessary to have numbers which are power of 2? I checked IPP headers and docs and I didn't find any constraints.