
IEEE 754 Standard Compliance: CPU vs GPU or, a War between Intel and NVIDIA

SergeyKostrov
Valued Contributor II
9,808 Views
[ Abstract ]

In 2011 NVIDIA published an article about the IEEE 754 compliance of floating-point arithmetic on GPUs and compared it with floating-point arithmetic on CPUs. The article is very good, but it contains many errors and "crafted" test cases meant to demonstrate that CPUs have issues and GPUs do not, and some of its technical information is obsolete. Although the article was last updated in 2015, none of the errors I have found has been fixed. My review of the article will be submitted later on.
SergeyKostrov
Valued Contributor II
2,321 Views
[ Intel C++ compiler v12.1.7 ( u371 ) 32-bit - Release ]

 ...
 Verification 1.1 of IEEE-754 Standard for SP ( 24-bit ) arithmetic:
 a1= 1.0000001 b1=-1.0000002 r1= 0.000000000000000000000
 Verification 2.1 of IEEE-754 Standard for SP ( 24-bit ) arithmetic:
 a2= 1.0000001 b2=-1.0000002 r2= 0.000000000000000000000
 ...

 Note: Floating Point Model option: /fp:fast=2
SergeyKostrov
Valued Contributor II
2,321 Views
[ MinGW C++ compiler v5.1.0 32-bit - Debug ]

 ...
 Verification 1.1 of IEEE-754 Standard for SP ( 24-bit ) arithmetic:
 a1= 1.0000001 b1=-1.0000002 r1= 0.000000000000014210855
 Verification 2.1 of IEEE-754 Standard for SP ( 24-bit ) arithmetic:
 a2= 1.0000001 b2=-1.0000002 r2= 0.000000000000014210855
 ...

 Note: Floating Point Model option: Default
SergeyKostrov
Valued Contributor II
2,321 Views
[ MinGW C++ compiler v5.1.0 32-bit - Release ]

 ...
 Verification 1.1 of IEEE-754 Standard for SP ( 24-bit ) arithmetic:
 a1= 1.0000001 b1=-1.0000002 r1= 0.000000000000000000000
 Verification 2.1 of IEEE-754 Standard for SP ( 24-bit ) arithmetic:
 a2= 1.0000001 b2=-1.0000002 r2= 0.000000000000000000000
 ...

 Note: Floating Point Model option: -ffast-math
SergeyKostrov
Valued Contributor II
2,321 Views
[ Watcom C++ compiler v2.0.0 32-bit - Debug ]

 ...
 Verification 1.1 of IEEE-754 Standard for SP ( 24-bit ) arithmetic:
 a1= 1.0000001 b1=-1.0000002 r1= 0.000000000000014210855
 Verification 2.1 of IEEE-754 Standard for SP ( 24-bit ) arithmetic:
 a2= 1.0000001 b2=-1.0000002 r2= 0.000000000000014210855
 ...

 Note: Floating Point Model options: -fpi87 ( inline 80x87 ) and -fp5 ( optimize f-p for Pentium )
SergeyKostrov
Valued Contributor II
2,321 Views
[ Watcom C++ compiler v2.0.0 32-bit - Release ]

 ...
 Verification 1.1 of IEEE-754 Standard for SP ( 24-bit ) arithmetic:
 a1= 1.0000001 b1=-1.0000002 r1= 0.000000000000014210855
 Verification 2.1 of IEEE-754 Standard for SP ( 24-bit ) arithmetic:
 a2= 1.0000001 b2=-1.0000002 r2= 0.000000000000014210855
 ...

 Note: Floating Point Model options: -fpi87 ( inline 80x87 ) and -fp5 ( optimize f-p for Pentium )
SergeyKostrov
Valued Contributor II
2,321 Views
[ FPU x87 related inaccuracy ]

 ...
 ...while x87 operations often use an additional internal 80-bit precision format...
 ...

Here is my comment: The phrase '...often use...' is incorrect because the FPU always uses the long double extended-precision format internally. Take a look at the Intel SDM for more technical details.
SergeyKostrov
Valued Contributor II
2,321 Views
[ Turbo C++ compiler v3.0.0 16-bit - Debug ]

 ...
 Verification 1.1 of IEEE-754 Standard for SP ( 24-bit ) arithmetic:
 a1= 1.0000001 b1=-1.0000002 r1= 0.000000000000014210855
 Verification 2.1 of IEEE-754 Standard for SP ( 24-bit ) arithmetic:
 a2= 1.0000001 b2=-1.0000002 r2= 0.000000000000014210855
 ...

 Note: Floating Point Model option: Default
SergeyKostrov
Valued Contributor II
2,321 Views
[ Turbo C++ compiler v3.0.0 16-bit - Release ]

 ...
 Verification 1.1 of IEEE-754 Standard for SP ( 24-bit ) arithmetic:
 a1= 1.0000001 b1=-1.0000002 r1= 0.000000000000014210855
 Verification 2.1 of IEEE-754 Standard for SP ( 24-bit ) arithmetic:
 a2= 1.0000001 b2=-1.0000002 r2= 0.000000000000014210855
 ...

 Note: Floating Point Model option: Default
SergeyKostrov
Valued Contributor II
2,321 Views
[ Conclusion ]

My conclusion is very short: a software developer from NVIDIA wrote a very good article about issues related to FP computing, but I feel that he wanted to put as many "black spots" as possible on the wonderful world of CPU-bound programming.
zalia64
New Contributor I
2,321 Views

A few notes.

(1) Evaluating 'Cos( 1.0 + 10*2*PI )' differs from evaluating 'Cos( 63.28301853071795858696 )', because the CPU MUST first compute 10*2*PI, then add it to 1.0, and that operation suffers a rounding error of 2^(-52)*(62+1). So your table of Cos() errors actually measured the error in the addition.

(2) Visual C 6 used the x87 floating-point unit (FPU) for arithmetic operations. The FPU calculates with 80-bit internal precision. Visual C 2013 calculates everything using the XMM registers and SIMD instructions, with (at best) a 52-bit mantissa. Non-vector operations such as a=b+c are executed as 'scalar SIMD operations'.

This seemingly trivial difference (FPU versus SIMD) makes a HUGE difference in accuracy. To compare different compilers, one should state, first and foremost, which mode is used: FPU or SIMD.

(3) The Nvidia documentation mentioned is more of a cipher than a document. The reader should try this (chapter 2.3):

For x = 1.0008, the correct mathematical result is x^2 - 1 = 1.60064 × 10^-4. The closest number using only four digits after the decimal point is 1.6006 × 10^-4. In this case rn( x^2 - 1 ) = 1.6006 × 10^-4, which corresponds to the fused multiply-add operation rn( x × x + ( -1 ) ). The alternative is to compute separate multiply and add steps. For the multiply, x^2 = 1.00160064, so rn( x^2 ) = 1.0016. The final result is rn( rn( x^2 ) - 1 ) = 1.6000 × 10^-4.

I stopped reading there.
SergeyKostrov
Valued Contributor II
2,321 Views
>>...So your table of Cos() - errors actually measured the error in addition...

No, it does not. It is a matter of scientific respect to verify someone's result before making a comment. Did you verify these examples with, for example, MATLAB or the Windows calculator?
SergeyKostrov
Valued Contributor II
2,321 Views
[ Computer System used for the investigation ]

 ** Dell Dimension 4400 **
 Intel Pentium 4 ( 1.60 GHz / 1 core )
 1GB RAM
 Seagate 20GB HDD ( * )
 Seagate 3TB HDD ( ** )
 EVGA GeForce 6200 Video Card 512MB DDR2 AGP 8x Video Card
 Windows XP Professional 32-bit SP3
 Size of L2 Cache = 256KB
 Size of L1 Cache = 8KB
 Display resolution: 1440 x 990

 ( * ) Seagate Barracuda 20GB IDE Hard Disk Drive ST320011A 3.5" 7200 Rpm 2MB Cache IDE Ultra ATA100 / ATA-iV/6
 Average Rotational Latency : 4.17 ms
 Average Seek Times Read : 9.0ms
 Average Seek Times Write : 10.0ms
 Maximum Internal Transfer Rate : 69.4MB/sec
 Average External Transfer Rate : 100MB/sec ( Read and Write )
 Maximum External Transfer Rate : 150MB/sec ( Read )
 Note: Barracuda ATA IV Family

 ( ** ) Seagate Barracuda 3TB IDE Hard Disk Drive ST3000DM001 3.5" 7200 Rpm 64MB Cache SATA III ( 6GB/sec )
 Average Rotational Latency : 4.16 ms
 Average Seek Times Read : 8.5ms
 Average Seek Times Write : 9.5ms
 Maximum Internal Transfer Rate : 268MB/sec
 Average External Transfer Rate : 156MB/sec ( Read and Write )
 Maximum External Transfer Rate : 210MB/sec ( Read )
SergeyKostrov
Valued Contributor II
2,321 Views
[ Epsilon values of Floating Point Data Types ]

Every C++ compiler defines three epsilon values in the float.h header file, and this is what they look like:
SergeyKostrov
Valued Contributor II
2,321 Views
[ Microsoft C++ compiler ( VS2005 PE ) 32-bit - Release ]

 ...
 _RTFLT_EPSILON  = 0.00000011920928955078125000000000000
 _RTDBL_EPSILON  = 0.00000000000000022204460492503131000
 _RTLDBL_EPSILON = 0.00000000000000000010842021724855044
 ...
SergeyKostrov
Valued Contributor II
2,321 Views
[ Borland C++ compiler v5.5.1 32-bit - Release ]

 ...
 _RTFLT_EPSILON  = 0.00000011920928955078126080000000000
 _RTDBL_EPSILON  = 0.00000000000000022204460492503129600
 _RTLDBL_EPSILON = 0.00000000000000000010842021724855044
 ...
SergeyKostrov
Valued Contributor II
2,321 Views
[ Intel C++ compiler v12.1.7 ( u371 ) 32-bit - Release ]

 ...
 _RTFLT_EPSILON  = 0.00000011920928955078125000000000000
 _RTDBL_EPSILON  = 0.00000000000000022204460492503131000
 _RTLDBL_EPSILON = 0.00000000000000000010842021724855044
 ...
SergeyKostrov
Valued Contributor II
2,321 Views
[ MinGW C++ compiler v5.1.0 32-bit - Release ]

 ...
 _RTFLT_EPSILON  = 0.00000011920928955078125000000000000
 _RTDBL_EPSILON  = 0.00000000000000022204460492503131000
 _RTLDBL_EPSILON = 0.00000000000000000010842021724855044
 ...
SergeyKostrov
Valued Contributor II
2,321 Views
[ Watcom C++ compiler v2.0.0 32-bit - Release ]

 ...
 _RTFLT_EPSILON  = 0.00000011920929000000001197000000000
 _RTDBL_EPSILON  = 0.00000000000000022204460492503130808
 _RTLDBL_EPSILON = 0.00000000000000000010842021724855044
 ...
SergeyKostrov
Valued Contributor II
2,321 Views
[ These are results from different trigonometric functions ]

The test value is 45 degrees. The true mathematical value of TAN( 45 deg ) is 1.0, and different functions calculated it with different accuracy. The Borland and Watcom C++ compilers calculated TAN( 45 deg ) as exactly 1.0 in 3 out of 4 tests. All the other, modern C++ compilers calculated TAN( 45 deg ) as exactly 1.0 in 2 out of 4 tests.
SergeyKostrov
Valued Contributor II
2,295 Views
[ Microsoft C++ compiler ( VS2005 PE ) 32-bit - Release ]

 ...
 CRT                   CrtSin(  45.00 ) =  0.7071067690849304200000      Completed in    0 ticks
 Normalized TS    msF.SinNTS7(  45.00 ) =  0.7071064710617065400000      Completed in    0 ticks
 Normalized TS    msF.SinNTS9(  45.00 ) =  0.7071068286895752000000      Completed in    0 ticks
 Normalized TS   msF.SinNTS11(  45.00 ) =  0.7071067690849304200000      Completed in    0 ticks
 TMathSet Sine Methods - RTfloat - Passed
 CRT                   CrtSin(  45.00 ) =  0.7071067811865474600000      Completed in    0 ticks
 Normalized TS    msD.SinNTS7(  45.00 ) =  0.7071064695751780900000      Completed in    0 ticks
 Normalized TS    msD.SinNTS9(  45.00 ) =  0.7071067829368671300000      Completed in    0 ticks
 Normalized TS   msD.SinNTS11(  45.00 ) =  0.7071067811796194500000      Completed in    0 ticks
 TMathSet Sine Methods - RTdouble - Passed
 CRT                   CrtCos(  45.00 ) =  0.7071067690849304200000      Completed in    0 ticks
 Normalized TS    msF.CosNTS7(  45.00 ) =  0.7071031928062439000000      Completed in    0 ticks
 Normalized TS    msF.CosNTS9(  45.00 ) =  0.7071068286895752000000      Completed in    0 ticks
 Normalized TS   msF.CosNTS11(  45.00 ) =  0.7071067094802856400000      Completed in    0 ticks
 TMathSet Cosine Methods - RTfloat - Passed
 CRT                   CrtCos(  45.00 ) =  0.7071067811865475700000      Completed in    0 ticks
 Normalized TS    msD.CosNTS7(  45.00 ) =  0.7071032148228456600000      Completed in    0 ticks
 Normalized TS    msD.CosNTS9(  45.00 ) =  0.7071068056832943100000      Completed in    0 ticks
 Normalized TS   msD.CosNTS11(  45.00 ) =  0.7071067810719247100000      Completed in    0 ticks
 TMathSet Cosine Methods - RTdouble - Passed
 CRT                   CrtTan(  45.00 ) =  1.0000000000000000000000      Completed in    0 ticks
 Normalized TS    msF.TanNTS7(  45.00 ) =  1.0000046491622925000000      Completed in    0 ticks
 Normalized TS    msF.TanNTS9(  45.00 ) =  1.0000000000000000000000      Completed in    0 ticks
 Normalized TS   msF.TanNTS11(  45.00 ) =  1.0000001192092896000000      Completed in    0 ticks
 TMathSet Tangent Methods - RTfloat - Passed
 CRT                   CrtTan(  45.00 ) =  0.9999999999999998900000      Completed in    0 ticks
 Normalized TS    msD.TanNTS7(  45.00 ) =  1.0000046029381060000000      Completed in    0 ticks
 Normalized TS    msD.TanNTS9(  45.00 ) =  0.9999999678316953100000      Completed in    0 ticks
 Normalized TS   msD.TanNTS11(  45.00 ) =  1.0000000001523033000000      Completed in    0 ticks
 TMathSet Tangent Methods - RTdouble - Passed
 ...

 

SergeyKostrov
Valued Contributor II
2,295 Views
[ Borland C++ compiler v5.5.1 32-bit - Release ]

 ...
 CRT                   CrtSin(  45.00 ) =  0.7071067690849304576000      Completed in    0 ticks
 Normalized TS    msF.SinNTS7(  45.00 ) =  0.7071064710617065472000      Completed in    0 ticks
 Normalized TS    msF.SinNTS9(  45.00 ) =  0.7071067690849304576000      Completed in    0 ticks
 Normalized TS   msF.SinNTS11(  45.00 ) =  0.7071067690849304576000      Completed in    0 ticks
 TMathSet Sine Methods - RTfloat - Passed
 CRT                   CrtSin(  45.00 ) =  0.7071067811865475072000      Completed in    0 ticks
 Normalized TS    msD.SinNTS7(  45.00 ) =  0.7071064695751781376000      Completed in    0 ticks
 Normalized TS    msD.SinNTS9(  45.00 ) =  0.7071067829368671232000      Completed in    0 ticks
 Normalized TS   msD.SinNTS11(  45.00 ) =  0.7071067811796194304000      Completed in    0 ticks
 TMathSet Sine Methods - RTdouble - Passed
 CRT                   CrtCos(  45.00 ) =  0.7071067690849304576000      Completed in    0 ticks
 Normalized TS    msF.CosNTS7(  45.00 ) =  0.7071031928062439424000      Completed in    0 ticks
 Normalized TS    msF.CosNTS9(  45.00 ) =  0.7071067690849304576000      Completed in    0 ticks
 Normalized TS   msF.CosNTS11(  45.00 ) =  0.7071067690849304576000      Completed in    0 ticks
 TMathSet Cosine Methods - RTfloat - Passed
 CRT                   CrtCos(  45.00 ) =  0.7071067811865476096000      Completed in    0 ticks
 Normalized TS    msD.CosNTS7(  45.00 ) =  0.7071032148228456448000      Completed in    0 ticks
 Normalized TS    msD.CosNTS9(  45.00 ) =  0.7071068056832943104000      Completed in    0 ticks
 Normalized TS   msD.CosNTS11(  45.00 ) =  0.7071067810719247360000      Completed in    0 ticks
 TMathSet Cosine Methods - RTdouble - Passed
 CRT                   CrtTan(  45.00 ) =  1.0000000000000000000000      Completed in    0 ticks
 Normalized TS    msF.TanNTS7(  45.00 ) =  1.0000046491622924288000      Completed in    0 ticks
 Normalized TS    msF.TanNTS9(  45.00 ) =  1.0000000000000000000000      Completed in    0 ticks
 Normalized TS   msF.TanNTS11(  45.00 ) =  1.0000000000000000000000      Completed in    0 ticks
 TMathSet Tangent Methods - RTfloat - Passed
 CRT                   CrtTan(  45.00 ) =  0.9999999999999997952000      Completed in    0 ticks
 Normalized TS    msD.TanNTS7(  45.00 ) =  1.0000046029381060608000      Completed in    0 ticks
 Normalized TS    msD.TanNTS9(  45.00 ) =  0.9999999678316953600000      Completed in    0 ticks
 Normalized TS   msD.TanNTS11(  45.00 ) =  1.0000000001523032064000      Completed in    0 ticks
 TMathSet Tangent Methods - RTdouble - Passed
 ...

 
