Turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page

b_k_

New Contributor I

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

03-15-2016
12:55 AM

91 Views

ipps/ippi mulpack

Hello,

I was checking some code using FFT and mulpack, and noticed a case where results are a bit off.

both ippi and ipps mulpack versions gave the same results.

Spec: i7-4790, Ipp 8.1 32bit. win 7 pro SP1 64bit.

Using VS2013 and stepping into the disassembly, the call stack shows that ippsh99-8.1dll is loaded.

Here is an example with ipps, based on mulpack.c from the IPP 9.0 update 2 documentation.

The example transforms the vector [0,8,0,0,0,0,0,0], and multiplies the results by itself.

IppStatus status = ippStsNoErr; IppsFFTSpec_R_32f* pSpec = NULL; Ipp8u *pMemInit = NULL, *pBuffer = NULL, *pSpecMem = NULL; /* Pointer to the work buffers */ int sizeSpec = 0, sizeInit = 0, sizeBuf = 0; /* size of FFT pSpec structure, Init and work buffers */ status = ippsFFTGetSize_R_32f(3, IPP_FFT_DIV_INV_BY_N, ippAlgHintNone, &sizeSpec, &sizeInit, &sizeBuf); /* memory allocation */ pSpecMem = (Ipp8u*)ippMalloc(sizeSpec); pBuffer = (Ipp8u*)ippMalloc(sizeBuf); pMemInit = (Ipp8u*)ippMalloc(sizeInit); status = ippsFFTInit_R_32f(&pSpec, 3, IPP_FFT_DIV_INV_BY_N, ippAlgHintAccurate, pSpecMem, pMemInit); Ipp32f src[8] = { 0.0f, 8.0f, 0.0f, 0.0f, 0.0f, 0.0f, 0.0f, 0.0f }; Ipp32f fft_dst[8] = { 0.0f, 0.0f, 0.0f, 0.0f, 0.0f, 0.0f, 0.0f, 0.0f }; status = ippsFFTFwd_RToPack_32f(src, fft_dst, pSpec, pBuffer); ippFree(pMemInit); ippFree(pSpec); ippFree(pBuffer); Ipp32f dst[8] = { 0.0f, 0.0f, 0.0f, 0.0f, 0.0f, 0.0f, 0.0f, 0.0f }; status = ippsMulPack_32f(fft_dst, fft_dst, dst, 8);

The results in the fft_dst and dst buffers are in packed format. the first two indices are R0 and R1, the real components of the first two elements.

fft_dst = [8.00000000 5.65685415 -5.65685415 0.000000000 -8.00000000 -5.65685415 -5.65685415]

dst = [64.0000000 8.12035296e-007 -63.9999962 -64.0000000 -0.000000000 8.12035296e-007 63.9999962 64.0000000]

Here R1 (the second index) should be zero, but instead it is a very small number.

So fft_dst[1] = 5.65685415 + j ( -5.65685415 )

This gives fft_dst[1]*fft_dst[1] = 5.65685415 * 5.65685415 + 2 j* ( -5.65685415) + j*j*( -5.65685415 )*( -5.65685415 )

Rearranging real and imaginary parts:

fft_dst[1]*fft_dst[1] = {5.65685415 * 5.65685415 - ( -5.65685415 )*( -5.65685415 ) } + { 2 j* ( -5.65685415)}

The real part should be zero, but it is R1 = 8.12035296e-007

Old IPL (Intels image processibng library) gave the correct result.

Link Copied

3 Replies

Ying_H_Intel

Employee

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

03-17-2016
01:11 AM

91 Views

Hi B.k,

Thanks for the report. We will investigate it and get back to you if any news.

Best Regards,

Ying

Ying_H_Intel

Employee

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

03-20-2016
08:29 PM

91 Views

Hi B K.

We investigated the results and thought, it should be normal situation for floating point – it’s just rounding error = 8.12e-7/8.0 = ~0.1e-7 – less than the weight of the least significant bit in Ipp32f mantissa related to input signal amplitude.

I guess that IPL doesn’t provide such rounding error because uses internally FPU that performs all intermediate calculations in higher precision (53 or 64 bit mantissa).

Best Regards,

Ying

b_k_

New Contributor I

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

03-21-2016
02:34 AM

91 Views

Topic Options

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page

For more complete information about compiler optimizations, see our Optimization Notice.