ippiFilter_32f_C1R - when used in static lib causes crash

Kuldip_R_ · ‎02-16-2016

We are seeing a consistent crash due to an invalid memory access when calling ippiFilter_32f_C1R from multiple threads, with IPP linked as a static library.

Pointers:

- The project statically links the IPP 7.0 library.
- ippiFilter_32f_C1R is used for sharpening images.
- There can be 100-5000 or more images, which are streamed, and this function is called from multiple threads to sharpen each image individually.
- ippStaticInit is called at start.
- ippGetNumThreads returns 8, as I have an 8-core CPU. The crash also happens on machines with more or fewer cores.
- The ROI provided takes care of the border assumption that the IPP filter function requires.

Observation
- There is a spike in kmp_launch_worker threads, waiting for instructions.
- Multiple calls (serial or threaded) into ippiFilter_32f_C1R result in more and more IPP threads being created and left waiting, and at a random point this ends in a crash inside IPP.
- Disabling parallelization fixes the issue: ippSetNumThreads(1).

Does my issue have something to do with Avoiding Nested Parallelization?
http://nf.nci.org.au/facilities/software/intel-ct/12.0.4.191/Documentation/en_US/ipp/ipp_userguide/ippugch6/Avoiding_Nested_Parallelization.htm
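
For context, a common way to avoid nested parallelization is to keep the library call single-threaded (here that would mean calling ippSetNumThreads(1) once at startup) and parallelize across images in the application instead. The sketch below only illustrates that pattern under stated assumptions: filterImageSingleThreaded is a hypothetical stand-in that applies a 3x3 box average, not an IPP call, and filterAll and workerCount are made-up names.

```cpp
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical stand-in for a single-threaded per-image filter (NOT an IPP function).
// Copies the border unchanged and writes a 3x3 box average to every interior pixel.
void filterImageSingleThreaded(const std::vector<float>& src, std::vector<float>& dst,
                               int width, int height) {
    dst = src;
    for (int y = 1; y + 1 < height; ++y) {
        for (int x = 1; x + 1 < width; ++x) {
            float sum = 0.0f;
            for (int dy = -1; dy <= 1; ++dy)
                for (int dx = -1; dx <= 1; ++dx)
                    sum += src[(y + dy) * width + (x + dx)];
            dst[y * width + x] = sum / 9.0f;
        }
    }
}

// Application-level parallelism: a fixed number of worker threads, each handling
// images w, w + workerCount, w + 2 * workerCount, ... so no two threads share an output.
std::vector<std::vector<float>> filterAll(const std::vector<std::vector<float>>& images,
                                          int width, int height, unsigned workerCount) {
    if (workerCount == 0) workerCount = 1;  // always run at least one worker
    std::vector<std::vector<float>> results(images.size());
    std::vector<std::thread> workers;
    for (unsigned w = 0; w < workerCount; ++w) {
        workers.emplace_back([&images, &results, width, height, workerCount, w] {
            for (std::size_t i = w; i < images.size(); i += workerCount)
                filterImageSingleThreaded(images[i], results[i], width, height);
        });
    }
    for (auto& t : workers) t.join();  // wait for every image to be processed
    return results;
}
```

With this layout the total thread count stays at workerCount, instead of growing with every calling thread that spawns its own team of library worker threads.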

The images are 16-bit-per-pixel monochrome images, so the call sequence is as follows:

int imageSize =  m_Width* m_Height;

    Ipp32f *pSrc = new Ipp32f[imageSize];
    Ipp32f *pDst = new Ipp32f[imageSize];

    16bitData2IppImage(PixelData,  pSrc);

    const int kernelWidth = m_kernel->Width();
    const int kernelHeight = m_kernel->Height();

    const int kernelHalfWidth = kernelWidth/2;
    const int kernelHalfHeight = kernelHeight/2;

    int srcStep = format->m_Width*sizeof(Ipp32f);
    int dstStep = format->m_Width*sizeof(Ipp32f);

    IppiSize dstRoiSize =  {format->m_Width - 2*kernelHalfWidth, format->m_Height - 2*kernelHalfHeight};
    IppiSize kernelSize = {kernelWidth, kernelHeight};

    IppiPoint anchor = {kernelHalfWidth, kernelHalfHeight};
    int firstPixGap = format->m_Width*kernelHalfHeight + kernelHalfWidth;

    Ipp32f* pKernel = new Ipp32f[kernelWidth*kernelHeight];
    
    Matrix2IppKernel(m_kernel, pKernel); //function to copy internal matrix value to IPP float matrix.

    IppStatus stat = ippiFilter_32f_C1R(pSrc + firstPixGap, srcStep,
                                           pDst + firstPixGap, dstStep, dstRoiSize,
                                           pKernel, kernelSize, anchor);
    ASSERT(stat == ippStsNoErr);

    IppImage216bitData(data, pDst);
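
As a sanity check on the ROI arithmetic above, the following sketch recomputes the ROI size and first-pixel offset and verifies that the filter neighbourhood stays inside the buffer. It assumes an odd-sized kernel with a centred anchor, so that each output pixel reads up to half a kernel in every direction; RoiGeometry and computeRoiGeometry are hypothetical names, not IPP types or functions.

```cpp
#include <cstddef>

// Hypothetical helper type mirroring the quantities computed in the snippet above.
struct RoiGeometry {
    int roiWidth;           // width  - 2 * (kernelWidth  / 2)
    int roiHeight;          // height - 2 * (kernelHeight / 2)
    int firstPixGap;        // element offset of the first ROI pixel
    long long lastTouched;  // highest source element index read by the filter
    bool fitsInBuffer;      // true if every read stays within width * height elements
};

// Assumption: odd kernel, centred anchor, neighbourhood of +/- half a kernel.
RoiGeometry computeRoiGeometry(int width, int height, int kernelWidth, int kernelHeight) {
    const int halfW = kernelWidth / 2;
    const int halfH = kernelHeight / 2;

    RoiGeometry g{};
    g.roiWidth = width - 2 * halfW;
    g.roiHeight = height - 2 * halfH;
    g.firstPixGap = width * halfH + halfW;

    // The last ROI pixel is (halfW + roiWidth - 1, halfH + roiHeight - 1);
    // its neighbourhood extends a further halfW columns and halfH rows.
    const long long lastX = static_cast<long long>(halfW) + g.roiWidth - 1 + halfW;
    const long long lastY = static_cast<long long>(halfH) + g.roiHeight - 1 + halfH;
    g.lastTouched = lastY * width + lastX;

    const long long bufferSize = static_cast<long long>(width) * height;
    g.fitsInBuffer = g.roiWidth > 0 && g.roiHeight > 0 && g.lastTouched < bufferSize;
    return g;
}
```

For an 8x6 image and a 3x3 kernel this gives a 6x4 ROI, a first-pixel offset of 9, and a last touched index of 47, the final element of the 48-element buffer. The check only holds if the buffers were allocated with the same width and height used for the steps; the snippet sizes them from m_Width and m_Height but computes the steps from format->m_Width, so it may be worth confirming that these agree.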

Gennady_F_Intel · ‎02-16-2016

Version 7.0 is no longer supported. The current version of IPP is 9.0, and this API (ippiFilter_32f_C1R) has been deprecated in that version. You may check how to move from this API to the new one in this thread - https://software.intel.com/en-us/forums/intel-integrated-performance-primitives/topic/609864

Or just evaluate how this API works with IPP version 8.2 update 3 (the latest version which contains this API).

Kuldip_R_ · ‎02-17-2016

Moving to a newer version of IPP is in itself a massive task for us, considering how widely IPP is used in our project, so before we decide to do that:

Do you know whether the above is a known issue, and whether moving to a newer version and using the substitute function in place of the deprecated one will solve it?