Intel® Integrated Performance Primitives
Deliberate problems developing high-performance vision, signal, security, and storage applications.
6765 Discussions

unexpected results with intel primitive (when dividing by 1)

Josephnyu
Novice
1,536 Views

I'm using Intel Integrated Performance Primitives (windows, v2019 (2019.0.5.1052), 64bit) and I came across unexpected results.

When I run the following code

    std::cout << std::setprecision(20);
    const int N = 8;
    const std::vector<double> one(N, 1.0);
    std::vector<double> d(N,0.014058305571221497);
    
    std::cout << "before division by 1 : " << d[0] << std::endl;
    ippsDiv_64f_I(one.data(), d.data(), N);
    std::cout << "after division by 1 : " << d[0] << std::endl;

I get the following output :

before division by 1 : 0.014058305571221497293 
after division by 1 : 0.014058305571221500763

1/ How dividing by 1 can change the value ?

2/ For any value of N between 1 and 7, dividing by 1 has no effect. Only for N>7 I get this. How can it be that the result of a division of doubles depend on the size of my array ?

Could it be a bug ?

Note : I tested with v2022 and I got the same results

7 Replies
Ruqiu_C_Intel
Moderator
1,498 Views

Hello Josephnyu,

Thank you for posting your concern here.

We have reproduced it, and will investigate further internally for the accuracy.

 

Regards,

Ruqiu

Chao_Y_Intel
Moderator
1,424 Views

Hi Josephnyu,

In the double-precision floating-point format, the fractional part is 52 bits: https://en.wikipedia.org/wiki/Double-precision_floating-point_format

The precision of this format is around 15 decimal digits: 2^52 ~ 10^15.

IPP functions are optimized using various instructions, which may result in minor rounding differences at the least significant bits. In the test code provided, the difference is around the 10^-15 level. This looks an acceptable numerical discrepancy.

For larger values of N, such as N >= 8, vectorized instructions (AVX512/AVX2) can be utilized. For smaller values of N, scalar instructions are used. so the final result may have differences within its precision range.

 

Thanks,

Chao

0 Kudos
Josephnyu
Novice
1,416 Views

Thanks @Chao_Y_Intel 

 

The difference between the result that I get and the expected results is 2ULP. So the "rounding" impacts more than 1 bit.
Deciding if the discrepancy is acceptable or not depends highly on the usage. If I divide 1milion times by 1 can I get very far from my starting point ?


Does intel provide any guarantee about the error of the results ? If yes can you please share tem.
I'd like also confirmation that there is no bug and that's those results are compliant with the specs of IPP.

0 Kudos
Gennady_F_Intel
Moderator
1,368 Views

you could try to check the behavior with  ippsDiv_64f_A53 function which guarantees 53 correctly rounded bits of significand, including the implied bit, with the maximum guaranteed error within 1 ulp.

The API of this function is slightly different compare with ippsDiv_64f_I.

 

0 Kudos
Josephnyu
Novice
1,356 Views

Thanks for the suggestion @Gennady_F_Intel.

 

I tested  ippsDiv_64f_A50 and  ippsDiv_64f_A53 on this test case and they returned the expected results.
Perf of ippsDiv_64f_A50 is similar ippsDiv_64f_I.

However perf of ippsDiv_64f_A53 is worst by ~8%.

 

Can you tell how many ulps are gauranteed by ippsDiv_64f_I ?

0 Kudos
Chao_Y_Intel
Moderator
1,389 Views

thanks,  Josephyu.  We are checking it with the function owner, to make sure the result is expected. 

0 Kudos
Ruqiu_C_Intel
Moderator
596 Views

Hi Josephnyu,


Thank you for your patience.

As IPP uses the following 'optimized' algorithm to calculate (~in parallel) a0/b0, a1/b1, …, a7/b7. Looking only at a0/b0, the computation is:

 

                                                 1

              a0/b0 ~= ((( ------------------------ * (b2*b3))*b1)*a0)

                                   ((b0*b1)*(b2*b3))

Each pair of parentheses indicates a rounding-to-nearest operation; we have 7 floating-point operations - 6 multiplications and one division; but the error in (b2*b3) is the same at the numerator and denominator, so it cancels out. The maximum ulp error is 6 ulps. 



Regards,

Ruqiu


0 Kudos
Reply