Intel® Integrated Performance Primitives

convolution types


I'm trying to translate a MATLAB image convolution (conv2) to C++ using the IPP libraries. My issue is that the only options ippiConv_ provides are 'valid' and 'full'. conv2 in MATLAB provides 'valid', 'full', and 'same', and the code I'm translating uses the 'same' option. I have the 'valid' option working, but the borders of the image don't match MATLAB's output (obviously). Is there a way to match the 'same' functionality? Also, if I try a 'full' convolution using ippiConv, the buffer size I get back looks incorrect. If I hard-code the size to something big and then run the convolution, it crashes, and I'm not sure why. Can someone help?

Sample code below.

  Ipp8u* pBuffer        = nullptr;
  IppStatus status      = ippStsNoErr;
  IppEnum algType       = (IppEnum)(ippAlgAuto | ippiROIValid);
  int32_t buffSizeBytes = 0;

  status = ippiConvGetBufferSize(srcRoiSize, tplRoiSize, ipp32f, 1, algType, &buffSizeBytes);

  pBuffer = ippsMalloc_8u(buffSizeBytes);

  status = ippiConv_32f_C1R(
    srcImg[0], srcImg.Cols() * sizeof(float), srcRoiSize,
    tpl[0], tpl.Cols() * sizeof(float), tplRoiSize,
    outImg[0], outImg.Cols() * sizeof(float),
    algType, pBuffer);

  if (pBuffer != nullptr) ippsFree(pBuffer);
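
For reference, the three conv2 shapes named in the question differ only in the size of the result, and the destination image has to be allocated at that size before the convolution is called. The sketch below is plain C++ with no IPP dependency; `Size2D`, `ConvShape`, and `convOutputSize` are hypothetical names introduced here for illustration, and it assumes the source is at least as large as the kernel in both dimensions.

```cpp
#include <cassert>

// Hypothetical helper types for illustration; not part of IPP.
struct Size2D { int width; int height; };
enum class ConvShape { Full, Same, Valid };

// Output size of a 2D convolution of `src` with kernel `tpl`.
// Assumes src is at least as large as tpl in both dimensions.
Size2D convOutputSize(Size2D src, Size2D tpl, ConvShape shape) {
    assert(tpl.width > 0 && tpl.height > 0);
    assert(src.width >= tpl.width && src.height >= tpl.height);
    switch (shape) {
        case ConvShape::Full:   // every position where the two overlap at all
            return {src.width + tpl.width - 1, src.height + tpl.height - 1};
        case ConvShape::Same:   // central part, same size as the source
            return src;
        case ConvShape::Valid:  // only positions where the kernel fits entirely
            return {src.width - tpl.width + 1, src.height - tpl.height + 1};
    }
    return src;  // unreachable; keeps compilers quiet
}
```

For a 640x480 source and a 5x3 kernel this gives 644x482 for 'full', 640x480 for 'same', and 636x478 for 'valid'.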




The "same" option is intentionally not implemented, because it is not symmetrical. If you need the "same" case, there are 2 options:

1) Use the "full" option and then extract the "same" ROI. It's rather easy: for "full", dstSize = src1Size + src2Size - 1, so to get a dstSize equal to src1Size you just need to shift the pointer by the corresponding number of rows and columns, ((src2Size-1)/2), and use the "old" step (dstStep) with the src1 width and height. ippiConv is internally based on a 2D FFT (for suitable image sizes), so performance will be the same.

2) If your convolution kernel (src2) is rather small compared with src1, you can use the ippiFilterBorder function, which gives you the "same" dstSize. One additional step is required for IPP versions 9.0 and higher: you need to flip src2 (the kernel) yourself if it is not symmetrical. (We think that most filter kernels are symmetrical, and that internal flipping is an extra operation that affects performance, requires an additional memory buffer, and in most cases is unnecessary.)
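
The kernel flip mentioned in option 2 can be sketched in plain C++ as a 180-degree rotation of the kernel. This is only an illustration: `flipKernel` is a hypothetical helper, the kernel is assumed to be stored row-major without padding, and the ippiFilterBorder call itself is not shown.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: rotate a row-major kRows x kCols kernel by 180 degrees.
// For a densely packed row-major array, reversing all elements flips both
// the rows and the columns, because element (r, c) moves to
// (kRows - 1 - r, kCols - 1 - c).
std::vector<float> flipKernel(const std::vector<float>& kernel, int kRows, int kCols) {
    assert(kRows > 0 && kCols > 0);
    assert(kernel.size() == static_cast<std::size_t>(kRows) * static_cast<std::size_t>(kCols));
    return std::vector<float>(kernel.rbegin(), kernel.rend());
}
```

A symmetrical kernel comes out unchanged, which is why the flip only matters for asymmetrical ones.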

regards, Igor


Igor, thanks for the quick response and guidance. 

1. Based on your response, I realized that my code was crashing on the 'full' convolution because my output image was not big enough.

2. I implemented your suggestions and was able to get 'same' output that matches what conv2 returns in MATLAB with the 'same' flag. See below for the additional copy step that extracts that portion of the image. Thanks again.


  // Copy the 'same' portion out of the 'full' image
  const int32_t cpSrcStep = tmpImg.Cols() * sizeof(tmpImg[0][0]);
  const int32_t cpDstStep = outImg.Cols() * sizeof(outImg[0][0]);
  const IppiSize roiSize  = { outImg.Cols(), outImg.Rows() };
  // Use separate row and column shifts so non-square kernels are handled too
  const uint32_t rowShift = tplRoiSize.height / 2;
  const uint32_t colShift = tplRoiSize.width / 2;
  ippiCopy_32f_C1R(&tmpImg[rowShift][colShift], cpSrcStep, outImg[0], cpDstStep, roiSize);
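
To check the crop offsets against MATLAB on small inputs, a naive reference in plain C++ can help. The sketch below does not use IPP; `Image`, `convFull`, and `cropSame` are hypothetical names, and the images are assumed to be non-empty and rectangular. It uses an offset of kernelSize / 2 (integer division) per dimension, which as far as I can tell is what conv2 'same' corresponds to; note that this differs from (kernelSize - 1) / 2 when the kernel size is even.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Image = std::vector<std::vector<float>>;  // hypothetical row-major image type

// Naive 'full' 2D convolution: result is (aRows + kRows - 1) x (aCols + kCols - 1).
Image convFull(const Image& a, const Image& k) {
    assert(!a.empty() && !a[0].empty() && !k.empty() && !k[0].empty());
    const std::size_t aRows = a.size(), aCols = a[0].size();
    const std::size_t kRows = k.size(), kCols = k[0].size();
    Image out(aRows + kRows - 1, std::vector<float>(aCols + kCols - 1, 0.0f));
    for (std::size_t i = 0; i < aRows; ++i)
        for (std::size_t j = 0; j < aCols; ++j)
            for (std::size_t m = 0; m < kRows; ++m)
                for (std::size_t n = 0; n < kCols; ++n)
                    out[i + m][j + n] += a[i][j] * k[m][n];
    return out;
}

// Extract the central rows x cols part of a 'full' result.
// The offsets are computed per dimension, so non-square kernels work.
Image cropSame(const Image& full, std::size_t rows, std::size_t cols,
               std::size_t kRows, std::size_t kCols) {
    const std::size_t rowShift = kRows / 2;
    const std::size_t colShift = kCols / 2;
    assert(rows + rowShift <= full.size());
    assert(!full.empty() && cols + colShift <= full[0].size());
    Image out(rows, std::vector<float>(cols, 0.0f));
    for (std::size_t r = 0; r < rows; ++r)
        for (std::size_t c = 0; c < cols; ++c)
            out[r][c] = full[r + rowShift][c + colShift];
    return out;
}
```

For example, convolving the row [1 2 3] with the kernel [1 1] gives the full result [1 3 5 3], and cropping with a column offset of 1 gives [3 5 3].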

