Intel® Integrated Performance Primitives
Deliberate problems developing high-performance vision, signal, security, and storage applications.

ippsVectorSlope_32f gives bad results

jeffc111
Beginner
958 Views

ippsVectorSlope_32f(pResult, 8192, 0, 1/48000.0f);
pResult[6000] contains 0.124999493 - That's bad.
ippsVectorSlope_32f(pResult, 8192, 0, 1);
ippsDivC_32f_I(48000.0f, pResult, 8192);
pResult[6000] contains 0.125 - that's good.
Any ideas?
0 Kudos
7 Replies
Gennady_F_Intel
Moderator
958 Views
This is the result of rounding in single-precision float. Use ippsVectorSlope_64f and you will get 0.125.
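For readers who want to see why the double-precision route lands on 0.125, here is a minimal plain-C sketch of the idea: build the ramp in double and narrow to float only once, at the end. `ramp_64f` and `narrow_64f_to_32f` are hypothetical helpers standing in for ippsVectorSlope_64f and ippsConvert_64f32f; they are not IPP code and say nothing about how IPP is implemented internally.

```c
#include <assert.h>
#include <stddef.h>

/* Plain-C stand-in for a double-precision ramp: dst[n] = offset + n * slope.
   ramp_64f is a hypothetical helper, not an IPP function. */
static void ramp_64f(double *dst, size_t len, double offset, double slope)
{
    assert(dst != NULL);
    for (size_t n = 0; n < len; ++n)
        dst[n] = offset + (double)n * slope;
}

/* Narrow to float once, at the very end, so each sample is rounded to
   single precision only one time. Also a hypothetical helper. */
static void narrow_64f_to_32f(const double *src, float *dst, size_t len)
{
    assert(src != NULL && dst != NULL);
    for (size_t n = 0; n < len; ++n)
        dst[n] = (float)src[n];
}
```

With a length of 8192, offset 0.0 and slope 1.0 / 48000.0, element 6000 of the narrowed output compares equal to 0.125f, because the double-precision error is far smaller than half a single-precision ULP.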
0 Kudos
jeffc111
Beginner
958 Views
Perhaps VectorSlope is not computing the value as documented ( result = offset + n*slope), but is avoiding multiplication by iteratively adding the slope value? That would DEFINITELY lead to floating point precision errors.
Further evidence that rounding and floating point precision are not the problem here:

If I compute 6000/48000 in floating point, the value is correct.
float result = 6000.0f / 48000.0f;
result == .125f
If I replace VectorSlope with the following, the value is correct.
float slope = 1 / 48000.0f;
for (int i = 0; i < 8192; i++)
{
    pResult[i] = i * slope;
}
pResult[6000] == .125f;
And, as I said, when performing VectorSlope with integers, THEN dividing by 48000, the value is correct.
In fact, if I perform Vector Slope with integers, then Multiply by 1/48000f, the value is correct.
I've tried this in C with Floating Point Model set to "precise", "strict", and "fast", just for kicks, and they all give 0.125
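The hypothesis in this post (multiplication per sample versus repeatedly adding the slope) can be tried out in isolation. The sketch below is plain C with two hypothetical helpers, `ramp_direct_32f` and `ramp_running_sum_32f`; neither is IPP code, and the running-sum variant is only a guess at what an optimized ramp might do.

```c
#include <assert.h>
#include <stddef.h>

/* Direct form, as documented: dst[n] = offset + n * slope.
   Each sample is rounded on its own, so errors do not build up. */
static void ramp_direct_32f(float *dst, size_t len, float offset, float slope)
{
    assert(dst != NULL);
    for (size_t n = 0; n < len; ++n)
        dst[n] = offset + (float)n * slope;
}

/* Running-sum form: dst[n] = dst[n-1] + slope.
   Every addition rounds, and those rounding errors can accumulate. */
static void ramp_running_sum_32f(float *dst, size_t len, float offset, float slope)
{
    assert(dst != NULL);
    float acc = offset;
    for (size_t n = 0; n < len; ++n) {
        dst[n] = acc;
        acc += slope;
    }
}
```

Filling two 8192-element buffers with offset 0 and slope 1 / 48000.0f and comparing element 6000 of each against 0.125f shows how far each form drifts on a given compiler and floating-point model.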
0 Kudos
jeffc111
Beginner
958 Views
Any more info on this?
I can imagine that, as an optimization, VectorSlope might lose precision that is preserved when VectorSlope and DivC are performed as separate steps. However, this seems unlikely, or at the very least worth noting.
0 Kudos
jeffc111
Beginner
958 Views
When you say it's a result of rounding, does that mean the rounding mode is different for IPP vs. the C code I've tested against? Is this configurable?
0 Kudos
SergeyKostrov
Valued Contributor II
958 Views
Quoting jeffc111
...
Perhaps VectorSlope is not computing the value as documented ( result = offset + n*slope), but is avoiding multiplication by iteratively adding the slope value?
...


It looks like Intel's software developers could have done that in order to achieve the best possible performance.

0 Kudos
igorastakhov
New Contributor II
958 Views

I don't think that a difference in one low-order bit of the mantissa is a bad result. Yes, IPP uses an optimized algorithm for the best achievable performance, so if you need better accuracy, use the 64f data type or another approach. IPP provides a tradeoff between performance and accuracy, weighted toward performance. In this particular case the difference in one low-order mantissa bit is caused by unrolling to the SSE register width. If the result is not satisfactory for you, you can always switch to VectorSlope_64f with conversion to 32f at the final stage; you'll lose ~2x in performance.

Regards,
Igor
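To make the "unrolling on the SSE register width" remark concrete, here is one way such a ramp could be structured. This is purely an illustrative guess, not Intel's implementation: `ramp_lanes_32f` and `LANES` are hypothetical names, and the four lanes are modelled with a plain array instead of SSE intrinsics. Each lane starts at its own sample and then advances by four slopes per step, so rounding errors accumulate over roughly n/4 additions.

```c
#include <assert.h>
#include <stddef.h>

#define LANES 4  /* one 128-bit SSE register holds four floats */

/* Hypothetical lane-wise ramp: lane k starts at offset + k*slope and every
   lane then advances by LANES*slope per step. A guess at the shape of an
   unrolled implementation, not actual IPP code. */
static void ramp_lanes_32f(float *dst, size_t len, float offset, float slope)
{
    assert(dst != NULL);
    float lane[LANES];
    const float step = (float)LANES * slope;

    for (size_t k = 0; k < LANES; ++k)
        lane[k] = offset + (float)k * slope;

    for (size_t n = 0; n < len; n += LANES) {
        for (size_t k = 0; k < LANES && n + k < len; ++k) {
            dst[n + k] = lane[k];
            lane[k] += step;   /* one rounding per step, per lane */
        }
    }
}
```

Comparing element 6000 from this routine against the direct `offset + n * slope` form gives a feel for how much error a lane-wise running sum can introduce, under the assumptions stated above.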

0 Kudos
SergeyKostrov
Valued Contributor II
958 Views
Quoting jeffc111
ippsVectorSlope_32f(pResult, 8192, 0, 1/48000.0f);
pResult[6000] contains 0.124999493 - That's bad.
ippsVectorSlope_32f(pResult, 8192, 0, 1);
ippsDivC_32f_I(48000.0f, pResult, 8192);
pResult[6000] contains 0.125 - That's good.
Any ideas?


Please take a look at results of my investigation:

1. This is how these numbers look in Base 10, Base 16 and Base 2 (binary):

Value (Base 10)   Hex (Base 16)   Sign   Exponent   Mantissa (Base 2)
0.125000000       0x3E000000      0      01111100   00000000000000000000000
0.124999493       0x3DFFFFBC      0      01111011   11111111111111110111100

2. Absolute Error is ( 0.124999493 - 0.125000000 ) = -0.000000507
Epsilon for a Single-Precision FP number is 1.192092890e-07

If I divide |0.000000507| by 1.192092890e-07 it gives ~4.25.

( Exact value is: 4.2530242756501970244952975099113 )

So, the absolute error is ~4.25 times greater than Epsilon for a Single-Precision FP number!

I think the case 'ippsVectorSlope_32f(pResult, 8192, 0, 1/48000f) => 0.124999493' has to be investigated
by an Intel software engineer.

Note:
More accurate representation for 0.124999493 according to IEEE 754 standard is 1.249994933605194091796875E-1
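The size of the error can also be expressed in units in the last place (ULPs) by comparing the raw bit patterns. The sketch below uses hypothetical helper names (`float_bits`, `float_from_bits`, `ulp_distance`) and assumes IEEE 754 single precision with 32-bit floats; it is only meant to illustrate the measurement, not any IPP facility.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Return the raw IEEE 754 bit pattern of a float. */
static uint32_t float_bits(float x)
{
    uint32_t u;
    assert(sizeof u == sizeof x);
    memcpy(&u, &x, sizeof u);
    return u;
}

/* Build a float from a raw bit pattern. */
static float float_from_bits(uint32_t u)
{
    float x;
    memcpy(&x, &u, sizeof x);
    return x;
}

/* ULP distance between two finite floats of the same sign: for such
   values the bit patterns are ordered the same way as the values. */
static uint32_t ulp_distance(float a, float b)
{
    uint32_t ua = float_bits(a), ub = float_bits(b);
    return ua > ub ? ua - ub : ub - ua;
}
```

Under those assumptions, `ulp_distance(0.125f, float_from_bits(0x3DFFFFBCu))` evaluates to 68, since 0.125f is stored as 0x3E000000 and 0x3E000000 - 0x3DFFFFBC = 0x44. Note that Epsilon is the spacing of floats at 1.0; just below 0.125 the spacing is 2^-27, which is why the ULP count is larger than the ~4.25 ratio above.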

0 Kudos