Solved: How to manage rounding by IVF ??

Hugo_Breuillard · ‎05-15-2012

Hello,

I've been using for a while Compaq Visual Fortran to run my code (on WinXP 32bits), but with a new computer I had to switch to IVF (on Win7 64 bits).

It seems that I have a rounding difference between CVF and IVF, since if I do a simple multipilcation between to REAL numbers, for example :

-1.388539*57.29578

I get from CVF : -79.55740
and from IVF : -79.55741

this rounding error is propagating through a lot of iterations and gives me big differences in results.

Can someone help me to fix this (I want IVF to give the same results as CVF, if possible) ? There should be some rounding parameter to set up, bu I'm pretty new to IVF unfortunately.

Thank you,
Hugo

Steven_L_Intel1 · ‎05-16-2012

By default, the Intel compiler uses SSE instructions for all floating point, which means you do all operations in the same precision that you declared. CVF does not use SSE instructions, it uses the old X87 floating point instructions which do all arithmetic in double precision, rounding down to single when a value is stored. For expressions with multiple operations, values can stay in double precision for a while and this can lead to differences when compared to doing these same operations in the declared precision.

/arch:ia32 tells the compiler to not use any SSE instructions, thus forcing it to use X87 instructions as well. This does not guarantee the exact same results, as anything that changes the order of operations can change results, but it gives you a better chance of reproducing CVF results.

My guess is that when you displayed the values, you got rounding and that one or both of the internal values were perhaps slightly less. Displaying the value in hex format (Z) is the only way to know exactly what you have.

View solution in original post

Steven_L_Intel1 · ‎05-15-2012

Most likely, the issue is that CVF used x87 instructions that give you "extra precision" above the datatype you declared. I note that the values you show have seven significant digits, which is at the border of single-precision.

However, I tried those particular numbers in both CVF and IVF and get the exact same result - but both values are different from what you show. Here's the code I used:

[plain]real :: x,y,z real(8) xx,yy,zz x = -1.388539 y = 57.29578 z = x * y print *, z xx = -1.388539D0 yy = 57.29578D0 zz = xx * yy print *, zz end[/plain] Both compilers give me:

-79.55743
-79.5574250654200

Even so, I would still bet on the "extra precision" being your issue. What you really should do is do such computations in double precision, if a difference at the seventh digit matters to you. In Visual Studio this is Fortran > Code Generation > Enable enhanced instruction sets > No enhanced instructions. From the command line that would be /arch:ia32.

Hugo_Breuillard · ‎05-16-2012

I see the extra precision thing, but how can I get -79.55740 then ?

It seems weird since the double precision itself gives -79.5574250654200, so I think I don't really get it.

Anyway, I tried with the code you just gave me and indeed I get the same results as you get. The thing is I use old and huge libraries to run my code so I cannot change the values to double precision inside these directly, and at the current stage, the values I obtain are indeed strange.

I'm not sure I understand the /arch:ia32 option : does it change every real number into double precision during calculation ?

I will try to check the values I obtain with other compilers to check if it is different, and I will keep you informed.

Thanks,

Steven_L_Intel1 · ‎05-16-2012

By default, the Intel compiler uses SSE instructions for all floating point, which means you do all operations in the same precision that you declared. CVF does not use SSE instructions, it uses the old X87 floating point instructions which do all arithmetic in double precision, rounding down to single when a value is stored. For expressions with multiple operations, values can stay in double precision for a while and this can lead to differences when compared to doing these same operations in the declared precision.

/arch:ia32 tells the compiler to not use any SSE instructions, thus forcing it to use X87 instructions as well. This does not guarantee the exact same results, as anything that changes the order of operations can change results, but it gives you a better chance of reproducing CVF results.

My guess is that when you displayed the values, you got rounding and that one or both of the internal values were perhaps slightly less. Displaying the value in hex format (Z) is the only way to know exactly what you have.

Hugo_Breuillard · ‎05-23-2012

Ok I get it now !

I tried by setting /arch:ia32 to all libraries and program and it worked thanks !

But I still have one question (sorry about that I'm quite curious) :

- do you trust more in results given by the double precision using SSE (-79.55743) or by the x87 instructions using single precision (-79.55740) ? because they are quite different and I don't really know which one I should trust more.

Anyway thank you very much for your help, very appreciated !

TimP · ‎05-23-2012

The difference you quote (apparently 1 ULP, or just in the low order bit), is to be expected if you compare a full double precision result against one which is partly single precision. If the differences are meaningful to you, you will need the full double precision, but the libraries might also need to be promoted before you could rely on those low order bits.

SergeyKostrov · ‎05-23-2012

Quoting TimP (Intel)

The difference you quote (apparently 1 ULP, or just in the low order bit)...

Please take a look at binary representations:

>> -1.388539 * 57.29578

Windows Calculator ( WC ): -79.55742506542

WCbin : 0xC053E3ACDA2E7CC6 = 1 10000000101 0011111000111010110011011010001011100111110011000110

Note: Value(base16) = SignExponent Mantissa

>> CVF: -79.55740

Most accurate representation = -7.95574000000000012278178473935E1

CVFbin: 0xC053E3AC710CB296 = 1 10000000101 0011111000111010110001110001000011001011001010010110

>> IVF: -79.55741

Most accurate representation = -7.9557410000000004401954356581E1

IVFbin : 0xC053E3AC9AFE1DA8 = 1 10000000101 0011111000111010110010011010111111100001110110101000

Let's compare all three binary representations:

WCbin : 0xC053E3ACDA2E7CC6 = 1 10000000101 0011111000111010110011011010001011100111110011000110
CVFbin: 0xC053E3AC710CB296 = 1 10000000101 0011111000111010110001110001000011001011001010010110
IVFbin : 0xC053E3AC9AFE1DA8 = 1 10000000101 0011111000111010110010011010111111100001110110101000

>> Do you trust more in results given by the double precision using SSE (-79.55743) or
>> by the x87 instructions using single precision (-79.55740)?
>>
>> SSE: -79.55743

Most accurate representation = -7.95574299999999965393726597540E1 Acceptable

>> x87: -79.55740

Most accurate representation = -7.95574000000000012278178473935E1 Not Acceptable

There is a significant difference between these two values.

jimdempseyatthecove · ‎05-24-2012

[fortran]real :: x,y,z real(8) xx,yy,zz x = -1.388539 y = 57.29578 z = x * y print *, z xx = -1.388539D0 yy = 57.29578D0 zz = xx * yy print *, zz end
[/fortran] using compiler generated literals:

-79.55743
-79.5574250654200

*** In release mode the compiler is making the computed values literals. As to if the compiler (at compile time) is using SSE/FPU is out of your control.

Recoding to force generated code to generate products.

[fortran]subroutine fooToo(x,y,xx,yy) real :: x,y,z real(8) xx,yy,zz z = x * y print *, z zz = xx * yy print *, zz end subroutine fooToo program foo real :: x,y real(8) xx,yy x = -1.388539 y = 57.29578 xx = -1.388539D0 yy = 57.29578D0 call fooToo(x,y,xx,yy) end program foo [/fortran] Compile the above with full optimizations and with inlining disabled.
Optimization set to force SSE2

SSE2:

-79.55743
-79.5574250654200

Optimization set to force FPU

-79.55743
-79.5574250654200

How you got your numbers, I cannot say.

Jim Dempsey

SergeyKostrov · ‎05-24-2012

Quoting jimdempseyatthecove

SSE2:

-79.55743
-79.5574250654200

Optimization set to force FPU

-79.55743
-79.5574250654200

How you got your numbers, I cannot say.

Jim Dempsey

I'd like to confirm Jim's numbers:

[cpp]Here is a proof that '-79.55743' is acceptable, or trusted in your terms, rounded value: For x87 FPU: x87 - _RTFPU_PC_24 - float : -79.5574264526367190 Rounded: -79.55743 (A) x87 - _RTFPU_PC_24 - double : -79.5574250654200060 Rounded: -79.55743 x87 - _RTFPU_PC_53 - double : -79.5574250654200060 Rounded: -79.55743 x87 - _RTFPU_PC_64 - double : -79.5574250654200060 Rounded: -79.55743 x87 - _RTFPU_PC_24 - long double: -79.5574250654200060 Rounded: -79.55743 x87 - _RTFPU_PC_53 - long double: -79.5574250654200060 Rounded: -79.55743 x87 - _RTFPU_PC_64 - long double: -79.5574250654200060 Rounded: -79.55743 For SSE2: SSE2 - Single-Precision : -79.5574264526367190 Rounded: -79.55743 (B) SSE2 - Double-Precision : -79.5574250654200060 Rounded: -79.55743 Take into account that Single-Precision results for x87 and SSE2 are the same ( see A and B ). [/cpp]

Best regards,
Sergey

SergeyKostrov · ‎05-24-2012

Here is my Test-Case:

[cpp] ... RTuint uiControlWordx87 = 0; RTfloat fResult = -1.388539f * 57.29578f; RTdouble dResult = -1.388539L * 57.29578L; RTlongdouble ldResult = -1.388539L * 57.29578L; // For x87 FPU: uiControlWordx87 = CrtControl87( _RTFPU_PC_24, _RTFPU_MCW_PC ); CrtPrintf( RTU("x87 - _RTFPU_PC_24 - float : %3.16ftRounded: %3.5fn"), fResult, fResult ); CrtPrintf( RTU("n") ); uiControlWordx87 = CrtControl87( _RTFPU_PC_24, _RTFPU_MCW_PC ); CrtPrintf( RTU("x87 - _RTFPU_PC_24 - double : %3.16ftRounded: %3.5fn"), dResult, dResult ); uiControlWordx87 = CrtControl87( _RTFPU_PC_53, _RTFPU_MCW_PC ); CrtPrintf( RTU("x87 - _RTFPU_PC_53 - double : %3.16ftRounded: %3.5fn"), dResult, dResult ); uiControlWordx87 = CrtControl87( _RTFPU_PC_64, _RTFPU_MCW_PC ); CrtPrintf( RTU("x87 - _RTFPU_PC_64 - double : %3.16ftRounded: %3.5fn"), dResult, dResult ); CrtPrintf( RTU("n") ); uiControlWordx87 = CrtControl87( _RTFPU_PC_24, _RTFPU_MCW_PC ); CrtPrintf( RTU("x87 - _RTFPU_PC_24 - long double: %3.16ftRounded: %3.5fn"), ldResult, ldResult ); uiControlWordx87 = CrtControl87( _RTFPU_PC_53, _RTFPU_MCW_PC ); CrtPrintf( RTU("x87 - _RTFPU_PC_53 - long double: %3.16ftRounded: %3.5fn"), ldResult, ldResult ); uiControlWordx87 = CrtControl87( _RTFPU_PC_64, _RTFPU_MCW_PC ); CrtPrintf( RTU("x87 - _RTFPU_PC_64 - long double: %3.16ftRounded: %3.5fn"), ldResult, ldResult ); CrtPrintf( RTU("n") ); // For SSE2: // Single-Precision __m128 a1 = { -1.388539f, 0.0f, 0.0f, 0.0f }; __m128 b1 = { 57.29578f, 0.0f, 0.0f, 0.0f }; __m128 r1 = _mm_mul_ps( a1 , b1 ); CrtPrintf( RTU("SSE2 - Single-Precision : %3.16ftRounded: %3.5fn"), r1.m128_f32[0], r1.m128_f32[0] ); // Double-Precision __m128d a2 = { -1.388539L, 0.0L }; __m128d b2 = { 57.29578L, 0.0L }; __m128d r2 = _mm_mul_pd( a2 , b2 ); CrtPrintf( RTU("SSE2 - Double-Precision : %3.16ftRounded: %3.5fn"), r2.m128d_f64[0], r2.m128d_f64[0] ); ... [/cpp]