bug

glockner · ‎03-29-2005

Hi,
I have found a problem with a basic addition on double precision numbers with compiler 8.1.25 on Linux: the result depends on whether the program is optimized or not. It works well with other compilers (COMPAQ on Windows, SGI f90, HP-UX f90, ifort on IA64, f90 on Tru64).

The program is very simple and does not need any comments.
If you put 1.D0 in a variable named V1, 1.D0 again in V2, and compute V1-V2+1.D-20, you get 2 different results depending on whether you optimize or not:

V1-V2+1.D-20 = 9.999999999999999E-021 (correct ; -O0)
V1-V2+1.D-20 = 0.000000000000000E+000 (optimized)

If you replace 1.D-20 by 1.D-14 :

V1-V2+1.D-14 = 1.000000000000000E-014 (correct ; -O0)
V1-V2+1.D-14 = 9.992007221626409E-015 (optimized)

With 1.D-20, I understand that V2+1.D-20 is the first operation made, but I think that a compiler should handle this kind of double precision operation correctly.
With 1.D-14, I cannot 'explain' the result: whatever order the compiler evaluates the operations in, the result should be 1.D-14.

Moreover, if you put 1.D-20 or 1.D-14 in a variable V3, it works !
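
One possible way to see where these two numbers come from: if the constant is combined with V2 before V1 is subtracted, plain IEEE double precision arithmetic gives exactly the optimized values. The sketch below uses Python (whose float is an IEEE double) instead of Fortran so that the parentheses fix the grouping. The function names are hypothetical, and the regrouping itself is only an assumption about what the optimizer might do; nothing in this thread confirms it.

```python
# Python floats are IEEE 754 doubles, the same format as DOUBLE PRECISION here.

def left_to_right(v1, v2, c):
    """(v1 - v2) + c: the subtraction is exact (0.0), so c survives untouched."""
    return (v1 - v2) + c


def constant_first(v1, v2, c):
    """v1 - (v2 - c): hypothetical regrouping where c is first absorbed
    into a number close to 1 and loses most of its digits."""
    return v1 - (v2 - c)


for c in (1e-20, 1e-14):
    print(c, left_to_right(1.0, 1.0, c), constant_first(1.0, 1.0, c))
```

With c = 1e-20 the second grouping returns zero, and with c = 1e-14 it returns 45 * 2**-52, which is the 9.992007221626409E-015 shown above.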

Here is the complete program :

program test
double precision V1,V2,V3,V4

V1=1.D0
V2=1.D0
V3=1.D-20
V4=1.D-14
print*,'V1 =',V1
print*,'V2 =',V2
print*,'V3 =',V3
print*,'V4 =',V4
print*,'V1-V2+V3 =',(V1-V2+V3)
print*,'V1-V2+1.D-20 =',V1-V2+1.D-20
print*,'1.D0-1.D0+1.D-20 =',1.D0-1.D0+1.D-20
print*,'V1-V2+1.D-14 =',V1-V2+1.D-14
print*,'(V1-V2)+V4 =',V1-V2+V4
end

Without optimization (-O0) it works:
V1 = 1.00000000000000
V2 = 1.00000000000000
V3 = 9.999999999999999E-021
V4 = 1.000000000000000E-014
V1-V2+V3 = 9.999999999999999E-021
V1-V2+1.D-20 = 9.999999999999999E-021
1.D0-1.D0+1.D-20 = 9.999999999999999E-021
V1-V2+1.D-14 = 1.000000000000000E-014
(V1-V2)+V4 = 1.000000000000000E-014

But with optimization (-O, -O2) it does not work:
V1 = 1.00000000000000
V2 = 1.00000000000000
V3 = 9.999999999999999E-021
V4 = 1.000000000000000E-014
V1-V2+V3 = 9.999999999999999E-021
V1-V2+1.D-20 = 0.000000000000000E+000
1.D0-1.D0+1.D-20 = 9.999999999999999E-021
V1-V2+1.D-14 = 9.992007221626409E-015
(V1-V2)+V4 = 1.000000000000000E-014

The differences between the two are:

V1-V2+1.D-20 = 9.999999999999999E-021 (correct ; -O0)
V1-V2+1.D-20 = 0.000000000000000E+000

and
V1-V2+1.D-14 = 1.000000000000000E-014 (correct ; -O0)
V1-V2+1.D-14 = 9.992007221626409E-015

Any comments ?
Thanks
Stéphane Glockner

PS: the program was compiled on SuSE 9.2 (Linux kernel 2.6) and Red Hat 9.0 (Linux kernel 2.4). Same results with ifc 7.0. On IA64 it's OK.

Steven_L_Intel1 · ‎03-29-2005

I don't think this is a bug. You are seeing the effects of temporary expression values being kept in the extended precision registers and not rounded to declared precision.

glockner · ‎03-29-2005

I am not sure I understand clearly. Do you mean that this is due to the 'print' statement?

In the original program (more complex), the expression V1-V2+1.D-20 is the denominator of a division and produces NaN (because of division by 0.D0 instead of 1.D-20).

Try writing:
V4=1.D0/(V1-V2+1.D-20)
print*,V4

The result is Infinity

With -O0, the result is 1.D20

That is not normal, is it?

Moreover, if you compile with the -fpe0 option you get the following error, as expected:

error :
forrtl: error (73): floating divide by zero
Image PC Routine Line Source
a.out 0804A586 Unknown Unknown Unknown
a.out 0804A1FD Unknown Unknown Unknown
Unknown 4009B500 Unknown Unknown Unknown
a.out 0804A091 Unknown Unknown Unknown

Thanks
Stéphane

Steven_L_Intel1 · ‎03-29-2005

The compiler has a number of choices it can make about how to evaluate an expression. When optimizing on IA-32, it may choose to leave an intermediate expression in the extended-precision FP registers longer, giving slightly different results than you would see if it always stored back to the variable. Other architectures don't have these extended precision registers.

I have not analyzed your program in detail to see exactly what is happening, but everything you have said so far points to this being the explanation.

Please also be aware that these values you are using, such as 1.0D-20, cannot be exactly represented in binary floating point. Depending on which way rounding goes, you may see different results.
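
As a small illustration of the representation point, the sketch below inspects the double that actually gets stored for the literal 1e-20. It uses Python's standard `decimal` and `fractions` modules, on the assumption that a Python float is the same IEEE double as a Fortran DOUBLE PRECISION value on these platforms; the variable names are illustrative only.

```python
from decimal import Decimal
from fractions import Fraction

stored = 1e-20                      # nearest IEEE double to the decimal literal
exact = Fraction(1, 10**20)         # the mathematical value 10**-20

print(Decimal(stored))              # exact decimal expansion of the stored double
print(Fraction(stored) == exact)    # False: 10**-20 is not a binary fraction
# Relative representation error, bounded by half a unit in the last place.
print(float(abs(Fraction(stored) - exact) / exact))
```

The stored value sits slightly below 10**-20, which is consistent with V3 being printed as 9.999999999999999E-021 in the outputs above.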

glockner · ‎03-29-2005

OK, I understand better, but the results of the Intel compiler seem strange to me. I would even say unusual, and dangerous :)).
It is really disturbing to have to pay attention to this kind of operation, all the more since until now it has worked well with several other compilers (in addition to the compilers cited previously, I have tried cygwin g77, linux g77 and g95).
I understand that an operation on a variable is handled differently than one that uses the content of the variable (1.D-20) directly. But isn't that precisely what I am calling, perhaps improperly, a bug?

I would appreciate it if you could look more closely at the following simple program. I have added a test with 1.D-14, which is well within the range of double precision.

program test
double precision V1,V2,V3,V4

V1=1.D0
V2=1.D0
V3=1.D-14
V4=1.D0/(V1-V2+1.D-14)
print*,V4 !=> 100079991719344.
!!!! V2+1.D-14 is made first, as if operations
!!!! are made from right to left. But if you put
!!!! 1.D-14+V1-V2 the result is the same. I thought
!!!! that compilers evaluate from left to right.
V4=1.D0/(V1-V2+V3)
print*,V4 !=> 100000000000000. ok

V1=1.D0
V2=1.D0
V3=1.D-20
V4=1.D0/(V1-V2+1.D-20)
print*,V4 !=> Infinity
V4=1.D0/(V1-V2+V3)
print*,V4 !=> 1.D20

end
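
The value 100079991719344 in the first comment can be accounted for in plain double precision. The spacing between doubles just above 1 is 2**-52, and 1.D-14 is about 45.04 of those steps, so absorbing it into a number close to 1 rounds it to exactly 45 steps. The sketch below (Python, IEEE doubles) checks this arithmetic; that the compiler groups the expression this way is an assumption, and the variable names are illustrative.

```python
import sys

eps = sys.float_info.epsilon          # 2**-52, spacing of doubles in [1, 2)
print(1e-14 / eps)                    # roughly 45.04 steps

# Hypothetical regrouping: the constant meets V2 before V1 is subtracted.
denominator = 1.0 - (1.0 - 1e-14)
print(denominator == 45 * eps)        # the 1e-14 was rounded to 45 steps
print(1.0 / denominator)              # compare with the first V4 above
print(1.0 / ((1.0 - 1.0) + 1e-14))    # left-to-right grouping, about 1e14
```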

You said that 1.0D-20 cannot be exactly represented in binary floating point. OK. For instance, if PI=4.D0*ATAN(1.D0) and you compute PI+1.D-20, it will not change anything. But if you compute 1.D-20+1.D-20, you should get 2.D-20 exactly, no?
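
Both statements can be checked in IEEE double precision. The sketch below does so in Python (whose float is an IEEE double). Adding a value to itself only changes the binary exponent, so no rounding occurs, whereas 1e-20 is far below half the spacing of doubles near pi and is lost completely.

```python
import math

pi = 4.0 * math.atan(1.0)

# 1e-20 is far smaller than the gap between neighbouring doubles near pi.
print(pi + 1e-20 == pi)

# Doubling is exact in binary floating point, so the sum equals the
# double nearest to 2e-20.
print(1e-20 + 1e-20 == 2e-20)
```

Both comparisons print True.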

Thanks for your help
Stéphane Glockner

Steven_L_Intel1 · ‎03-30-2005

If you believe there is a bug, please report it to Intel Premier Support at http://premier.intel.com/ This forum is not a formal support channel.