Hi! With Intel IPP 6.1.2.051 on a Xeon E5430 (EM64T, SSE4.1), the result of ippsAddProduct_32fc differs, for some reason, by a relative error on the order of ±1e-7 from Matlab, from ippsMul_32fc followed by ippsAdd_32fc_I, and from direct C arithmetic in both 32-bit and 64-bit (the latter cast to 32-bit at the end). All the comparison answers are identical; only the ippsAddProduct answer is different.
Is that a bug in IPP? Or is this to be expected (SSE4.1 DPPS instruction known to do something odd on Xeon?)
Reference (cast): 39359.343750000000000 + i*0.0
Reference (float): 39359.343750000000000 + i*0.0
ippsAddProduct: 39359.347656250000000 + i*0.000000000000000
ippsMul, ippsAdd_I: 39359.343750000000000 + i*0.000000000000000
The code is:
[cpp]
// The header names were lost in the forum formatting; <ipp.h> and <stdio.h> are assumed.
#include <ipp.h>
#include <stdio.h>

#define ACCU_RE 3.653725000000000e+04
#define A_B_RE  3.360162734985352e+01
#define A_B_IM  4.114639663696289e+01

int main(int argc, char** argv)
{
    const int N = 16;
    Ipp32fc v;
    Ipp32fc* a           = ippsMalloc_32fc(N);
    Ipp32fc* b           = ippsMalloc_32fc(N);
    Ipp32fc* accu_fma    = ippsMalloc_32fc(N);
    Ipp32fc* accu_muladd = ippsMalloc_32fc(N);
    Ipp32fc* tmp         = ippsMalloc_32fc(N);

    // Both accumulators start at ACCU_RE + i*0
    v.re = ACCU_RE; v.im = 0;
    ippsSet_32fc(v, accu_fma, N);
    ippsSet_32fc(v, accu_muladd, N);

    // b is the complex conjugate of a, so a*b is purely real
    v.re = A_B_RE; v.im = A_B_IM;
    ippsSet_32fc(v, a, N);
    v.re = A_B_RE; v.im = -A_B_IM;
    ippsSet_32fc(v, b, N);

    // References: double arithmetic cast to float at the end, and float arithmetic throughout
    double ref  = ACCU_RE + (A_B_RE*A_B_RE + A_B_IM*A_B_IM);
    float  refF = float(ACCU_RE) + (float(A_B_RE)*float(A_B_RE) + float(A_B_IM)*float(A_B_IM));

    // Fused call versus separate multiply and in-place add
    ippsAddProduct_32fc(a, b, accu_fma, N);
    ippsMul_32fc(a, b, tmp, N);
    ippsAdd_32fc_I(tmp, accu_muladd, N);

    printf("Reference (cast): %6.15f + i*0.0\n", float(ref));
    printf("Reference (float): %6.15f + i*0.0\n", refF);
    printf("ippsAddProduct: %6.15f + i*%6.15f\n", accu_fma[0].re, accu_fma[0].im);
    printf("ippsMul, ippsAdd_I: %6.15f + i*%6.15f\n", accu_muladd[0].re, accu_muladd[0].im);
    return 0;
}
[/cpp]
3 Replies
Hello,
Single-precision floating-point data has only a 23-bit stored fraction (24 significant bits), which gives a relative precision of about 1e-7.
http://en.wikipedia.org/wiki/IEEE_754-1985
So a relative discrepancy of around 1e-7 is to be expected. If you need higher precision, you can use double precision for the computation.
Thanks,
Chao
Thanks for the numerical info!
Still, with identical inputs having zero imaginary parts, the full-double c=a*b+c with the result converted to single, and full-single c=a*b+c both give the same result.
So I'm not sure how full-single ippsAddProduct_32fc can give a result different from full-single c=a*b+c.
That is possible when the binary representations of the results in IEEE 754 single- and double-precision floats are identical.
Here is an opposite case:
16968003(Base10) = 0x4B8174A2(Base16) => 0 10010111 00000010111010010100010(Base2\IEEE754)
16968004(Base10) = 0x4B8174A2(Base16) => 0 10010111 00000010111010010100010(Base2\IEEE754)
16968005(Base10) = 0x4B8174A2(Base16) => 0 10010111 00000010111010010100010(Base2\IEEE754)
A precision loss happened because source numbers are greater than 2^24.