ippsVectorSlope_32f gives bad results

jeffc111 · 09-28-2011

ippsVectorSlope_32f(pResult, 8192, 0, 1/48000f);

pResult[6000] contains 0.124999493 - that's bad.

ippsVectorSlope_32f(pResult, 8192, 0, 1);

ippsDivC_32f_I(48000f, pResult, 8192);

pResult[6000] contains 0.125 - that's good.

Any ideas?

Gennady_F_Intel · 09-28-2011

This is the result of rounding in single-precision floating point. Use ippsVectorSlope_64f and you will get 0.125.

jeffc111 · 09-29-2011

Perhaps VectorSlope is not computing the value as documented ( result = offset + n*slope), but is avoiding multiplication by iteratively adding the slope value? That would DEFINITELY lead to floating point precision errors.

Further evidence that rounding and floating point precision are not the problem here:

If I compute 6000/48000 in floating point, the value is correct.

float result = 6000/48000f;

result == .125f

If I replace vectorSlope with the following, the value is correct.

float slope = 1/48000f;

for (int i = 0; i < 8192; i++ )

{

pResult[i] = i * slope;

}

pResult[6000] == .125f;

And, as I said, when performing VectorSlope with integers, THEN dividing by 48000, the value is correct.

In fact, if I perform VectorSlope with integers, then multiply by 1/48000f, the value is correct.

I've tried this in C with the Floating Point Model set to "precise", "strict", and "fast", just for kicks, and they all give 0.125.
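The two strategies discussed in this post (the documented formula versus repeated addition of the slope) can be compared without IPP. The following is a minimal sketch in plain C, assuming IEEE 754 single precision. The function names ramp_by_multiply and ramp_by_accumulate are hypothetical, and neither is claimed to be what ippsVectorSlope_32f does internally.

```c
#include <assert.h>
#include <stddef.h>

/* Documented formula: dst[n] = offset + n * slope.
   Each element is computed independently, so rounding error
   does not accumulate from one element to the next. */
void ramp_by_multiply(float *dst, size_t len, float offset, float slope)
{
    assert(dst != NULL);
    for (size_t n = 0; n < len; n++)
        dst[n] = offset + (float)n * slope;
}

/* Hypothetical alternative: keep a running sum and add the slope
   once per element. Every addition rounds, and that rounding error
   is carried forward into all later elements. */
void ramp_by_accumulate(float *dst, size_t len, float offset, float slope)
{
    assert(dst != NULL);
    float acc = offset;
    for (size_t n = 0; n < len; n++) {
        dst[n] = acc;
        acc += slope;
    }
}
```

With offset 0 and slope 1/48000.0f, ramp_by_multiply gives exactly 0.125f at index 6000. The accumulated version may have drifted by several units in the last place by that index; how far depends on the compiler and floating-point settings.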

jeffc111 · 10-11-2011

Any more info on this?

I can imagine that, as an optimization, VectorSlope loses precision that is preserved when VectorSlope and DivC are run as separate steps. However, that seems unlikely, or at the very least worth noting.

jeffc111 · 02-08-2012

When you say it's a result of rounding, does that mean the rounding mode is different for IPP vs. the C code I've tested against? Is this configurable?

SergeyKostrov · 02-08-2012

Quoting jeffc111

...
Perhaps VectorSlope is not computing the value as documented ( result = offset + n*slope), but is avoiding multiplication by iteratively adding the slope value?
...

It looks like Intel's software developers could have done that in order to get the best possible performance.

igorastakhov · 02-09-2012

I don't think a difference in one low-order bit of the mantissa is a bad result. Yes, IPP uses an optimized algorithm for the best achievable performance, so if you need better accuracy, use the 64f data type or another approach. IPP provides a tradeoff between performance and accuracy, weighted toward performance. In this particular case the difference in one low-order mantissa bit comes from unrolling to the SSE register width. If the result is not satisfactory for you, you can always switch to VectorSlope_64f with conversion to 32f at the final stage; you'll lose ~2x in performance.

Regards,
Igor
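To illustrate the two points in this reply, here is a rough sketch in plain C. ramp_unrolled_f32 is only a guess at what "unrolling on the SSE register width" might look like (four independent running sums, each stepped by 4*slope); it is not taken from IPP's source. ramp_via_double mimics the suggested workaround of building the ramp in double precision and converting to 32f at the end. Both names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define LANES 4  /* four floats fit in one 128-bit SSE register */

/* ASSUMPTION: a scalar model of a 4-wide unrolled ramp. Each lane keeps
   its own running value and advances by LANES * slope per iteration. */
void ramp_unrolled_f32(float *dst, size_t len, float offset, float slope)
{
    assert(dst != NULL);
    float lane[LANES];
    const float step = (float)LANES * slope;

    for (size_t k = 0; k < LANES; k++)
        lane[k] = offset + (float)k * slope;

    for (size_t n = 0; n < len; n += LANES) {
        for (size_t k = 0; k < LANES && n + k < len; k++) {
            dst[n + k] = lane[k];
            lane[k] += step;  /* rounding error carries into later elements */
        }
    }
}

/* Workaround sketch: accumulate in double, round to float only on store. */
void ramp_via_double(float *dst, size_t len, double offset, double slope)
{
    assert(dst != NULL);
    double acc = offset;
    for (size_t n = 0; n < len; n++) {
        dst[n] = (float)acc;
        acc += slope;
    }
}
```

For offset 0 and slope 1.0/48000.0, the error accumulated in double over 8192 elements stays far below half a single-precision ULP, so ramp_via_double stores exactly 0.125f at index 6000.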

SergeyKostrov · 02-10-2012

Quoting jeffc111

ippsVectorSlope_32f(pResult, 8192, 0, 1/48000f);
pResult[6000] contains 0.124999493 - That's bad.
ippsVectorSlope_32f(pResult, 8192, 0, 1);
ippsDivC_32f_I(48000f, pResult, 8192);
pResult[6000] contains 0.125 - That's good.
Any ideas?

Please take a look at the results of my investigation:

1. This is how these numbers look in Base10, Base16 and Base2 (binary). The binary fields are Sign, Exponent, Mantissa:

0.125000000 (Base10) = 0x3E000000 (Base16) = 0 01111100 00000000000000000000000 (Base2)
0.124999493 (Base10) = 0x3DFFFFBC (Base16) = 0 01111011 11111111111111110111100 (Base2)

2. Absolute Error is ( 0.124999493 - 0.125000000 ) = -0.000000507
Epsilon for a Single-Precision FP number is 1.192092890e-07

If I divide |0.000000507| by 1.192092890e-07 it gives ~4.25.

( Exact value is: 4.2530242756501970244952975099113 )

So the absolute error is ~4.25 times greater than Epsilon for a Single-Precision FP number!

I think the case 'ippsVectorSlope_32f(pResult, 8192, 0, 1/48000f) => 0.124999493' has to be investigated
by an Intel software engineer.

Note:
A more accurate representation of 0.124999493 according to the IEEE 754 standard is 1.249994933605194091796875E-1
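The bit patterns can be checked, and the gap counted in units in the last place (ULPs), with a few lines of C. This is a sketch that assumes float is IEEE 754 binary32; the helper names float_bits, bits_to_float and ulp_distance are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Return the raw 32-bit pattern of a float (assumes IEEE 754 binary32). */
uint32_t float_bits(float x)
{
    uint32_t u;
    assert(sizeof(float) == sizeof(uint32_t));
    memcpy(&u, &x, sizeof u);
    return u;
}

/* Build a float from a raw 32-bit pattern. */
float bits_to_float(uint32_t u)
{
    float x;
    assert(sizeof(float) == sizeof(uint32_t));
    memcpy(&x, &u, sizeof x);
    return x;
}

/* Number of representable floats between a and b.
   Valid only when both are finite and have the same sign,
   because then the bit patterns are ordered like the values. */
uint32_t ulp_distance(float a, float b)
{
    uint32_t ua = float_bits(a);
    uint32_t ub = float_bits(b);
    return ua > ub ? ua - ub : ub - ua;
}
```

Here float_bits(0.125f) is 0x3E000000, and ulp_distance(0.125f, bits_to_float(0x3DFFFFBC)) is 0x44 = 68, so the reported value sits 68 representable floats below 0.125.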