Intel® Integrated Performance Primitives
Deliberate problems developing high-performance vision, signal, security, and storage applications.
6815 Discussions

unexpected results with intel primitive (when dividing by 1)

Josephnyu
Novice
1 959 Visites

I'm using Intel Integrated Performance Primitives (windows, v2019 (2019.0.5.1052), 64bit) and I came across unexpected results.

When I run the following code

    std::cout << std::setprecision(20);
    const int N = 8;
    const std::vector<double> one(N, 1.0);
    std::vector<double> d(N,0.014058305571221497);
    
    std::cout << "before division by 1 : " << d[0] << std::endl;
    ippsDiv_64f_I(one.data(), d.data(), N);
    std::cout << "after division by 1 : " << d[0] << std::endl;

I get the following output :

before division by 1 : 0.014058305571221497293 
after division by 1 : 0.014058305571221500763

1/ How dividing by 1 can change the value ?

2/ For any value of N between 1 and 7, dividing by 1 has no effect. Only for N>7 I get this. How can it be that the result of a division of doubles depend on the size of my array ?

Could it be a bug ?

Note : I tested with v2022 and I got the same results

7 Réponses
Ruqiu_C_Intel
Modérateur
1 921 Visites

Hello Josephnyu,

Thank you for posting your concern here.

We have reproduced it, and will investigate further internally for the accuracy.

 

Regards,

Ruqiu

Chao_Y_Intel
Modérateur
1 847 Visites

Hi Josephnyu,

In the double-precision floating-point format, the fractional part is 52 bits: https://en.wikipedia.org/wiki/Double-precision_floating-point_format

The precision of this format is around 15 decimal digits: 2^52 ~ 10^15.

IPP functions are optimized using various instructions, which may result in minor rounding differences at the least significant bits. In the test code provided, the difference is around the 10^-15 level. This looks an acceptable numerical discrepancy.

For larger values of N, such as N >= 8, vectorized instructions (AVX512/AVX2) can be utilized. For smaller values of N, scalar instructions are used. so the final result may have differences within its precision range.

 

Thanks,

Chao

0 Compliments
Josephnyu
Novice
1 839 Visites

Thanks @Chao_Y_Intel 

 

The difference between the result that I get and the expected results is 2ULP. So the "rounding" impacts more than 1 bit.
Deciding if the discrepancy is acceptable or not depends highly on the usage. If I divide 1milion times by 1 can I get very far from my starting point ?


Does intel provide any guarantee about the error of the results ? If yes can you please share tem.
I'd like also confirmation that there is no bug and that's those results are compliant with the specs of IPP.

0 Compliments
Gennady_F_Intel
Modérateur
1 791 Visites

you could try to check the behavior with  ippsDiv_64f_A53 function which guarantees 53 correctly rounded bits of significand, including the implied bit, with the maximum guaranteed error within 1 ulp.

The API of this function is slightly different compare with ippsDiv_64f_I.

 

0 Compliments
Josephnyu
Novice
1 779 Visites

Thanks for the suggestion @Gennady_F_Intel.

 

I tested  ippsDiv_64f_A50 and  ippsDiv_64f_A53 on this test case and they returned the expected results.
Perf of ippsDiv_64f_A50 is similar ippsDiv_64f_I.

However perf of ippsDiv_64f_A53 is worst by ~8%.

 

Can you tell how many ulps are gauranteed by ippsDiv_64f_I ?

0 Compliments
Chao_Y_Intel
Modérateur
1 812 Visites

thanks,  Josephyu.  We are checking it with the function owner, to make sure the result is expected. 

0 Compliments
Ruqiu_C_Intel
Modérateur
1 019 Visites

Hi Josephnyu,


Thank you for your patience.

As IPP uses the following 'optimized' algorithm to calculate (~in parallel) a0/b0, a1/b1, …, a7/b7. Looking only at a0/b0, the computation is:

 

                                                 1

              a0/b0 ~= ((( ------------------------ * (b2*b3))*b1)*a0)

                                   ((b0*b1)*(b2*b3))

Each pair of parentheses indicates a rounding-to-nearest operation; we have 7 floating-point operations - 6 multiplications and one division; but the error in (b2*b3) is the same at the numerator and denominator, so it cancels out. The maximum ulp error is 6 ulps. 



Regards,

Ruqiu


0 Compliments
Répondre