<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Hi B K. in Intel® Integrated Performance Primitives</title>
    <link>https://community.intel.com/t5/Intel-Integrated-Performance/ipps-ippi-mulpack/m-p/1110846#M25415</link>
    <description>&lt;P&gt;Hi B K.&lt;/P&gt;

&lt;P&gt;&lt;SPAN style="font-size: 1em; line-height: 1.5;"&gt;We investigated the results and believe this is normal floating-point behavior. It is just rounding error: 8.12e-7/8.0 = ~1.0e-7, which is less than the weight of the least significant bit of the Ipp32f mantissa relative to the input signal amplitude.&lt;/SPAN&gt;&lt;/P&gt;

&lt;P&gt;&lt;SPAN style="font-size: 1em; line-height: 1.5;"&gt;I guess that IPL doesn’t show this rounding error because it internally uses the FPU, which performs all intermediate calculations at higher precision (53- or 64-bit mantissa).&lt;/SPAN&gt;&lt;/P&gt;

&lt;P&gt;Best Regards,&lt;/P&gt;

&lt;P&gt;Ying&lt;/P&gt;</description>
    <pubDate>Mon, 21 Mar 2016 03:29:29 GMT</pubDate>
    <dc:creator>Ying_H_Intel</dc:creator>
    <dc:date>2016-03-21T03:29:29Z</dc:date>
    <item>
      <title>ipps/ippi mulpack</title>
      <link>https://community.intel.com/t5/Intel-Integrated-Performance/ipps-ippi-mulpack/m-p/1110844#M25413</link>
      <description>&lt;P&gt;Hello,&lt;/P&gt;

&lt;P&gt;I was checking some code using FFT and mulpack, and noticed a case where results are a bit off.&lt;/P&gt;

&lt;P&gt;Both the ippi and ipps MulPack versions gave the same results.&lt;/P&gt;

&lt;P&gt;Spec: i7-4790, IPP 8.1 32-bit, Windows 7 Pro SP1 64-bit.&lt;/P&gt;

&lt;P&gt;Using VS2013 and stepping into the disassembly, the call stack shows that ippsh99-8.1dll is loaded.&lt;/P&gt;

&lt;P&gt;Here is an example with ipps, based on &lt;A href="https://software.intel.com/en-us/node/610240"&gt;mulpack.c&lt;/A&gt; from the IPP 9.0 update 2 documentation.&lt;/P&gt;

&lt;P&gt;The example transforms the vector [0,8,0,0,0,0,0,0] and multiplies the result by itself.&lt;/P&gt;

&lt;PRE class="brush:cpp;"&gt;        IppStatus status = ippStsNoErr;
        IppsFFTSpec_R_32f* pSpec = NULL;  
        Ipp8u *pMemInit = NULL, *pBuffer = NULL, *pSpecMem = NULL; /* Pointer to the work buffers */
        int sizeSpec = 0, sizeInit = 0, sizeBuf = 0;               /* size of FFT pSpec structure, Init and work buffers */
        /* Use the same hint as in ippsFFTInit_R_32f below so the queried sizes match the init call */
        status = ippsFFTGetSize_R_32f(3, IPP_FFT_DIV_INV_BY_N, ippAlgHintAccurate, &amp;amp;sizeSpec, &amp;amp;sizeInit, &amp;amp;sizeBuf);
        /* memory allocation */
        pSpecMem = (Ipp8u*)ippMalloc(sizeSpec);
        pBuffer = (Ipp8u*)ippMalloc(sizeBuf);
        pMemInit = (Ipp8u*)ippMalloc(sizeInit);
        status = ippsFFTInit_R_32f(&amp;amp;pSpec, 3, IPP_FFT_DIV_INV_BY_N, ippAlgHintAccurate, pSpecMem, pMemInit);
        
        Ipp32f src[8] = { 0.0f, 8.0f, 0.0f, 0.0f, 0.0f, 0.0f, 0.0f, 0.0f };
        Ipp32f fft_dst[8] = { 0.0f, 0.0f, 0.0f, 0.0f, 0.0f, 0.0f, 0.0f, 0.0f };        

        status = ippsFFTFwd_RToPack_32f(src, fft_dst, pSpec, pBuffer);
        ippFree(pMemInit);
        ippFree(pSpecMem); /* free the block returned by ippMalloc; pSpec points into it */
        ippFree(pBuffer);
        
        Ipp32f dst[8] = { 0.0f, 0.0f, 0.0f, 0.0f, 0.0f, 0.0f, 0.0f, 0.0f };
        
        status = ippsMulPack_32f(fft_dst, fft_dst, dst, 8);&lt;/PRE&gt;

&lt;P&gt;The results in the fft_dst and dst buffers are in packed format. The first two entries are R0 and R1, the real components of the first two elements.&lt;/P&gt;

&lt;P&gt;fft_dst =&amp;nbsp; [8.00000000&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 5.65685415&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; -5.65685415&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.000000000&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; -8.00000000&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; -5.65685415&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; -5.65685415]&lt;/P&gt;

&lt;P&gt;dst = [64.0000000&amp;nbsp; 8.12035296e-007&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; -63.9999962&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; -64.0000000&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; -0.000000000&amp;nbsp; 8.12035296e-007&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 63.9999962&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 64.0000000]&lt;/P&gt;

&lt;P&gt;Here R1 (the second index) should be zero, but instead it is a very small number.&lt;/P&gt;

&lt;P&gt;So fft_dst[1] = 5.65685415 + j ( -5.65685415 )&lt;/P&gt;

&lt;P&gt;This gives fft_dst[1]*fft_dst[1] = 5.65685415 * 5.65685415 + 2 j * 5.65685415 * ( -5.65685415 ) + j*j*( -5.65685415 )*( -5.65685415 )&lt;/P&gt;

&lt;P&gt;Rearranging real and imaginary parts:&lt;/P&gt;

&lt;P&gt;fft_dst[1]*fft_dst[1] = { 5.65685415 * 5.65685415 - ( -5.65685415 )*( -5.65685415 ) } + { 2 j * 5.65685415 * ( -5.65685415 ) }&lt;/P&gt;

&lt;P&gt;The real part should be zero, but instead R1 = 8.12035296e-007.&lt;/P&gt;

&lt;P&gt;The old IPL (Intel's Image Processing Library) gave the correct result.&lt;/P&gt;</description>
      <pubDate>Tue, 15 Mar 2016 07:55:18 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Integrated-Performance/ipps-ippi-mulpack/m-p/1110844#M25413</guid>
      <dc:creator>b_k_</dc:creator>
      <dc:date>2016-03-15T07:55:18Z</dc:date>
    </item>
    <item>
      <title>Hi B.k, </title>
      <link>https://community.intel.com/t5/Intel-Integrated-Performance/ipps-ippi-mulpack/m-p/1110845#M25414</link>
      <description>&lt;P&gt;Hi B.k,&amp;nbsp;&lt;/P&gt;

&lt;P&gt;Thanks for the report. We will investigate it and get back to you with any news.&lt;/P&gt;

&lt;P&gt;Best Regards,&lt;/P&gt;

&lt;P&gt;Ying&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 17 Mar 2016 08:11:46 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Integrated-Performance/ipps-ippi-mulpack/m-p/1110845#M25414</guid>
      <dc:creator>Ying_H_Intel</dc:creator>
      <dc:date>2016-03-17T08:11:46Z</dc:date>
    </item>
    <item>
      <title>Hi B K.</title>
      <link>https://community.intel.com/t5/Intel-Integrated-Performance/ipps-ippi-mulpack/m-p/1110846#M25415</link>
      <description>&lt;P&gt;Hi B K.&lt;/P&gt;

&lt;P&gt;&lt;SPAN style="font-size: 1em; line-height: 1.5;"&gt;We investigated the results and believe this is normal floating-point behavior. It is just rounding error: 8.12e-7/8.0 = ~1.0e-7, which is less than the weight of the least significant bit of the Ipp32f mantissa relative to the input signal amplitude.&lt;/SPAN&gt;&lt;/P&gt;

&lt;P&gt;&lt;SPAN style="font-size: 1em; line-height: 1.5;"&gt;I guess that IPL doesn’t show this rounding error because it internally uses the FPU, which performs all intermediate calculations at higher precision (53- or 64-bit mantissa).&lt;/SPAN&gt;&lt;/P&gt;

&lt;P&gt;Best Regards,&lt;/P&gt;

&lt;P&gt;Ying&lt;/P&gt;</description>
      <pubDate>Mon, 21 Mar 2016 03:29:29 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Integrated-Performance/ipps-ippi-mulpack/m-p/1110846#M25415</guid>
      <dc:creator>Ying_H_Intel</dc:creator>
      <dc:date>2016-03-21T03:29:29Z</dc:date>
    </item>
    <item>
      <title>Hi Ying,</title>
      <link>https://community.intel.com/t5/Intel-Integrated-Performance/ipps-ippi-mulpack/m-p/1110847#M25416</link>
      <description>Hi Ying,
Thank you for the answer.
The question came up because we are moving from IPP 7 to 8.1.
The move included tests of the functions we use, and the one for MulPack failed.
We will implement a tolerance for the error.
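A tolerance check along these lines would be one way to do it. This is only a sketch: nearly_equal and pack_matches are hypothetical helper names (not IPP functions), and the factor of 4 machine epsilons used in the note below is an assumption chosen for this example, not a bound taken from the IPP documentation.

```c
/* Hypothetical helpers for comparing MulPack output with reference values.
   Not part of IPP. The tolerance is taken relative to the largest expected
   magnitude (scale), because an element that should be exactly zero has no
   magnitude of its own to be relative to. */

/* Value of FLT_EPSILON (2^-23) for IEEE 754 single precision. */
#define F32_EPS 1.1920929e-7f

/* Returns 1 if |actual - expected| is at most rel_tol * scale, else 0. */
int nearly_equal(float actual, float expected, float scale, float rel_tol)
{
    float diff = actual - expected;
    float mag = (diff >= 0.0f) ? diff : -diff;
    /* If diff is NaN the comparison below is false, so NaN never passes. */
    return (rel_tol * scale >= mag) ? 1 : 0;
}

/* Returns 1 if every element of the two packed arrays agrees within the
   tolerance, else 0. len must not be negative. */
int pack_matches(const float *actual, const float *expected, int len,
                 float scale, float rel_tol)
{
    int i;
    for (i = 0; i != len; ++i) {
        if (!nearly_equal(actual[i], expected[i], scale, rel_tol)) {
            return 0;
        }
    }
    return 1;
}
```

With the IPP 8.1 dst values posted above, the expected vector [64, 0, -64, -64, 0, 0, 64, 64], scale = 64 and rel_tol = 4 * F32_EPS, the check passes: the allowed deviation is about 3.05e-5, while the largest deviation (at the 63.9999962 entries) is about 3.8e-6.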
The results for the code I posted above with IPP 7 were:
dst = [64.0000000      0.000000000      -63.9999962      -64.0000000     -0.000000000      0.000000000       63.9999962       64.0000000]
That is better. I guess the 63.999 results are also due to roundoff errors.
When testing with IPP 7, the program loaded ippsg9-7.0.dll.
With IPP 8.1 it was ippsh9-8.1 (I had a typo above).
The different DLLs use different instructions for the calculations, so differing results are to be expected.
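
One possible reason for the difference, which is an assumption and not something confirmed in this thread, is that the newer code path uses fused multiply-add (FMA). With FMA, a*a - b*b is rounded only once, so the rounding error of one product shows up in the result instead of cancelling. The sketch below emulates that in double precision rather than issuing a real FMA instruction; square_real_separate and square_real_fused are hypothetical names, not IPP functions.

```c
/* Real part of (a + j*b)^2, i.e. a*a - b*b, computed two ways.
   Hypothetical illustration only; this is not how IPP is known to be
   implemented. */

/* Each product is rounded to float before the subtraction.
   The volatile stores keep the compiler from fusing or widening the math. */
float square_real_separate(float a, float b)
{
    volatile float aa = a * a;
    volatile float bb = b * b;
    return aa - bb;
}

/* Emulates a fused multiply-add: b*b is rounded to float, but a*a is kept
   exact and rounding happens after the subtraction. The product of two
   floats is exact in double (24 bits x 24 bits fits in 53 bits). When a*a
   and b*b are close, as here, the double subtraction is exact as well. */
float square_real_fused(float a, float b)
{
    volatile float bb = b * b;
    double exact_aa = (double)a * (double)a;
    return (float)(exact_aa - (double)bb);
}
```

For a = 5.65685415f and b = -5.65685415f, square_real_separate returns exactly 0, while square_real_fused returns about 8.12e-7, which is the R1 value reported above.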

I know floating-point errors are an old and well-known problem ("What Every Computer Scientist Should Know About Floating-Point Arithmetic").
Is there any article or discussion on Intel's website about what kind of roundoff errors one can expect with IPP?
Or maybe algorithms to minimize the errors?</description>
      <pubDate>Mon, 21 Mar 2016 09:34:51 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Integrated-Performance/ipps-ippi-mulpack/m-p/1110847#M25416</guid>
      <dc:creator>b_k_</dc:creator>
      <dc:date>2016-03-21T09:34:51Z</dc:date>
    </item>
  </channel>
</rss>

