Intel® Integrated Performance Primitives
Community support and discussions relating to developing high-performance vision, signal, security, and storage applications.
This community is designed for sharing of public information. Please do not share Intel or third-party confidential information here.

ipps/ippi mulpack

New Contributor I


I was checking some code using FFT and mulpack, and noticed a case where results are a bit off.

both ippi and ipps mulpack versions gave the same results.

Spec: i7-4790, Ipp 8.1 32bit. win 7 pro SP1 64bit.

Using VS2013 and stepping into the disassembly, the call stack shows that ippsh99-8.1dll is loaded.

Here is an example with ipps, based on mulpack.c from the IPP 9.0 update 2 documentation.

The example transforms the vector [0,8,0,0,0,0,0,0], and multiplies the results by itself.

        IppStatus status = ippStsNoErr;
        IppsFFTSpec_R_32f* pSpec = NULL;  
        Ipp8u *pMemInit = NULL, *pBuffer = NULL, *pSpecMem = NULL; /* Pointer to the work buffers */
        int sizeSpec = 0, sizeInit = 0, sizeBuf = 0;               /* size of FFT pSpec structure, Init and work buffers */
        status = ippsFFTGetSize_R_32f(3, IPP_FFT_DIV_INV_BY_N, ippAlgHintNone, &sizeSpec, &sizeInit, &sizeBuf);
        /* memory allocation */
        pSpecMem = (Ipp8u*)ippMalloc(sizeSpec);
        pBuffer = (Ipp8u*)ippMalloc(sizeBuf);
        pMemInit = (Ipp8u*)ippMalloc(sizeInit);
        status = ippsFFTInit_R_32f(&pSpec, 3, IPP_FFT_DIV_INV_BY_N, ippAlgHintAccurate, pSpecMem, pMemInit);
        Ipp32f src[8] = { 0.0f, 8.0f, 0.0f, 0.0f, 0.0f, 0.0f, 0.0f, 0.0f };
        Ipp32f fft_dst[8] = { 0.0f, 0.0f, 0.0f, 0.0f, 0.0f, 0.0f, 0.0f, 0.0f };        

        status = ippsFFTFwd_RToPack_32f(src, fft_dst, pSpec, pBuffer);
        Ipp32f dst[8] = { 0.0f, 0.0f, 0.0f, 0.0f, 0.0f, 0.0f, 0.0f, 0.0f };
        status = ippsMulPack_32f(fft_dst, fft_dst, dst, 8);

The results in the fft_dst and dst buffers are in packed format. the first two indices are R0 and R1, the real components of the first two elements.

fft_dst =  [8.00000000       5.65685415      -5.65685415      0.000000000      -8.00000000      -5.65685415      -5.65685415]

dst = [64.0000000  8.12035296e-007      -63.9999962      -64.0000000     -0.000000000  8.12035296e-007       63.9999962       64.0000000]

Here R1 (the second index) should be zero, but instead it is a very small number.

So fft_dst[1] = 5.65685415 + j ( -5.65685415 )

This gives fft_dst[1]*fft_dst[1] = 5.65685415 * 5.65685415 + 2 j* ( -5.65685415) + j*j*( -5.65685415 )*( -5.65685415 )

Rearranging real and imaginary parts:

fft_dst[1]*fft_dst[1] = {5.65685415 * 5.65685415 - ( -5.65685415 )*( -5.65685415 ) } + { 2 j* ( -5.65685415)}

The real part should be zero, but it is R1 = 8.12035296e-007

Old IPL (Intels image processibng library) gave the correct result.

0 Kudos
3 Replies

Hi B.k, 

Thanks for the report. We will investigate it and get back to you if any news. 

Best Regards,



Hi B K.

We investigated the results and thought, it should be normal situation for floating point – it’s just rounding error = 8.12e-7/8.0 = ~0.1e-7 – less than the weight of the least significant bit in Ipp32f mantissa related to input signal amplitude.

I guess that IPL doesn’t provide such rounding error because uses internally FPU that performs all intermediate calculations in higher precision (53 or 64 bit mantissa). 

Best Regards,




New Contributor I
Hi Ying, Thank you fo the answer. I came up with the question since we were moving from Ipp 7 to 8.1. This move included tests to the functions we use, and the one for mulpack failed. We will implement a tolerance for the error. the results for the code I posted above with Ipp7 were: dst = [64.0000000 0.000000000 -63.9999962 -64.0000000 -0.000000000 0.000000000 63.9999962 64.0000000] That is better. I guess the 63.999 results are also because of roundoff errors. When testing with Ipp 7 the program loaded ippsg9-7.0.dll. With Ipp 8.1 it was ippsh9-8.1 ( I had a typo above). The different dlls use different instructions for the calculations, so differing results are to be expected. I know floating point erorors are a known and old problem ("What Every Computer Scientist Should Know About Floating-Point Arithmetic"). Is there any article or discussion on intels website about what kind of roundoff errors one can expect with IPP? Maybe algorithms to minimize the errors?