<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic ippsAddProduct low precision in Intel® Integrated Performance Primitives</title>
    <link>https://community.intel.com/t5/Intel-Integrated-Performance/ippsAddProduct-low-precision/m-p/811706#M3944</link>
    <description>Thanks for the numerical info! &lt;BR /&gt;&lt;BR /&gt;Still, with identical inputs having zero imaginary parts, the full-double c=a*b+c with the result converted to single, and full-single c=a*b+c both give the same result. &lt;BR /&gt;&lt;BR /&gt;So I'm not sure how full-single ippsAddProduct_32fc can give a result different from full-single c=a*b+c.&lt;BR /&gt;</description>
    <pubDate>Wed, 19 Oct 2011 22:50:35 GMT</pubDate>
    <dc:creator>janw80</dc:creator>
    <dc:date>2011-10-19T22:50:35Z</dc:date>
    <item>
      <title>ippsAddProduct low precision</title>
      <link>https://community.intel.com/t5/Intel-Integrated-Performance/ippsAddProduct-low-precision/m-p/811704#M3942</link>
      <description>&lt;SPAN class="sectionBodyText"&gt;Hi! With Intel IPP 6.1.2.051 on a Xeon E5430 (EM64T, SSE4.1), the result of ippsAddProduct_32fc has, for some reason, a relative error of the order of ±1e-7 when compared to Matlab, to ippsMul_32fc followed by ippsAdd_32fc_I, and to direct C arithmetic, both 32-bit and 64-bit with the result cast to 32-bit at the end. All the comparison answers are identical; only the ippsAddProduct answer differs.&lt;BR /&gt;&lt;BR /&gt;Is that a bug in IPP? Or is this to be expected (is the SSE4.1 DPPS instruction known to do something odd on Xeon)?&lt;BR /&gt;&lt;/SPAN&gt;&lt;BR /&gt;Reference (cast): 39359.343750000000000 + i*0.0&lt;BR /&gt;Reference (float): 39359.343750000000000 + i*0.0&lt;BR /&gt;ippsAddProduct: 39359.347656250000000 + i*0.000000000000000&lt;BR /&gt;ippsMul, ippsAdd_I: 39359.343750000000000 + i*0.000000000000000&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN class="sectionBodyText"&gt;The code is:&lt;/SPAN&gt;&lt;BR /&gt;&lt;BR /&gt;
&lt;PRE&gt;[cpp]#include &lt;ipps.h&gt;
#include &lt;stdio.h&gt;
#define ACCU_RE  3.653725000000000e+04
#define A_B_RE   3.360162734985352e+01
#define A_B_IM   4.114639663696289e+01
int main(int argc, char** argv) {
   const int N = 16;
   Ipp32fc v;

   Ipp32fc* a = ippsMalloc_32fc(N);
   Ipp32fc* b = ippsMalloc_32fc(N);
   Ipp32fc* accu_fma = ippsMalloc_32fc(N);
   Ipp32fc* accu_muladd = ippsMalloc_32fc(N);
   Ipp32fc* tmp = ippsMalloc_32fc(N);
   v.re = ACCU_RE;
   v.im = 0;
   ippsSet_32fc(v, accu_fma, N);
   ippsSet_32fc(v, accu_muladd, N);
   v.re = A_B_RE;
   v.im = A_B_IM;
   ippsSet_32fc(v, a, N);

   v.re = A_B_RE;
   v.im = -A_B_IM;
   ippsSet_32fc(v, b, N);

   double ref = ACCU_RE + (A_B_RE*A_B_RE + A_B_IM*A_B_IM);
   float  refF = float(ACCU_RE) + (float(A_B_RE)*float(A_B_RE) + float(A_B_IM)*float(A_B_IM));
   ippsAddProduct_32fc(a, b, accu_fma, N);
   ippsMul_32fc(a, b, tmp, N);
   ippsAdd_32fc_I(tmp, accu_muladd, N);

   printf("Reference (cast):   %6.15f + i*0.0\n", float(ref));
   printf("Reference (float):  %6.15f + i*0.0\n", refF);
   printf("ippsAddProduct:     %6.15f + i*%6.15f\n", accu_fma[0].re, accu_fma[0].im);
   printf("ippsMul, ippsAdd_I: %6.15f + i*%6.15f\n", accu_muladd[0].re, accu_muladd[0].im);

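   // Release the buffers allocated with ippsMalloc_32fc.
   ippsFree(a);
   ippsFree(b);
   ippsFree(accu_fma);
   ippsFree(accu_muladd);
   ippsFree(tmp);
   return 0;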
}
[/cpp]&lt;/PRE&gt; &lt;BR /&gt;&lt;BR /&gt;</description>
      <pubDate>Sun, 16 Oct 2011 13:34:56 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Integrated-Performance/ippsAddProduct-low-precision/m-p/811704#M3942</guid>
      <dc:creator>janw80</dc:creator>
      <dc:date>2011-10-16T13:34:56Z</dc:date>
    </item>
    <item>
      <title>ippsAddProduct low precision</title>
      <link>https://community.intel.com/t5/Intel-Integrated-Performance/ippsAddProduct-low-precision/m-p/811705#M3943</link>
      <description>&lt;P&gt;Hello, &lt;/P&gt;&lt;P&gt;Single-precision floating-point data has only a 23-bit mantissa, which gives a relative precision of about 1e-7.&lt;/P&gt;&lt;P&gt;&lt;A href="http://en.wikipedia.org/wiki/IEEE_754-1985" target="_blank"&gt;http://en.wikipedia.org/wiki/IEEE_754-1985&lt;/A&gt;&lt;/P&gt;&lt;P&gt;So a discrepancy of around 1e-7 is to be expected. If you need higher precision, you can use double precision for the computation. &lt;/P&gt;&lt;P&gt;Thanks,&lt;BR /&gt;Chao&lt;/P&gt;</description>
      <pubDate>Wed, 19 Oct 2011 07:26:24 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Integrated-Performance/ippsAddProduct-low-precision/m-p/811705#M3943</guid>
      <dc:creator>Chao_Y_Intel</dc:creator>
      <dc:date>2011-10-19T07:26:24Z</dc:date>
    </item>
    <item>
      <title>ippsAddProduct low precision</title>
      <link>https://community.intel.com/t5/Intel-Integrated-Performance/ippsAddProduct-low-precision/m-p/811706#M3944</link>
      <description>Thanks for the numerical info! &lt;BR /&gt;&lt;BR /&gt;Still, with identical inputs having zero imaginary parts, the full-double c=a*b+c with the result converted to single, and full-single c=a*b+c both give the same result. &lt;BR /&gt;&lt;BR /&gt;So I'm not sure how full-single ippsAddProduct_32fc can give a result different from full-single c=a*b+c.&lt;BR /&gt;</description>
      <pubDate>Wed, 19 Oct 2011 22:50:35 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Integrated-Performance/ippsAddProduct-low-precision/m-p/811706#M3944</guid>
      <dc:creator>janw80</dc:creator>
      <dc:date>2011-10-19T22:50:35Z</dc:date>
    </item>
    <item>
      <title>ippsAddProduct low precision</title>
      <link>https://community.intel.com/t5/Intel-Integrated-Performance/ippsAddProduct-low-precision/m-p/811707#M3945</link>
      <description>That is possible when the binary representations of the results in IEEE 754 for single- and double-precision floats are identical.&lt;BR /&gt;&lt;BR /&gt;Here is an opposite case, where three different integers all map to the same single-precision value:&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN style="text-decoration: underline;"&gt;16968003&lt;/SPAN&gt;(Base10) = 0x4B8174A2(Base16) =&amp;gt; 0 10010111 00000010111010010100010(Base2\IEEE754)&lt;BR /&gt;&lt;SPAN style="text-decoration: underline;"&gt;16968004&lt;/SPAN&gt;(Base10) = 0x4B8174A2(Base16) =&amp;gt; 0 10010111 00000010111010010100010(Base2\IEEE754)&lt;BR /&gt;&lt;SPAN style="text-decoration: underline;"&gt;16968005&lt;/SPAN&gt;(Base10) = 0x4B8174A2(Base16) =&amp;gt; 0 10010111 00000010111010010100010(Base2\IEEE754)&lt;BR /&gt;&lt;BR /&gt;Precision is lost because the source numbers are greater than 2^24, beyond which a single-precision float can no longer represent every integer.</description>
      <pubDate>Fri, 04 Nov 2011 01:48:52 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Integrated-Performance/ippsAddProduct-low-precision/m-p/811707#M3945</guid>
      <dc:creator>SergeyKostrov</dc:creator>
      <dc:date>2011-11-04T01:48:52Z</dc:date>
    </item>
  </channel>
</rss>

