Roundoff during precision promotion of result

Dhoorjaty_P_ · ‎07-20-2016

I am using IVF version 16 on a 64-bit system. The following code, compiled with default options ("ifort <file>"), produces the answer shown further below:

      program roundoff
         implicit none
         integer, parameter :: sp = kind(1.0)
         integer, parameter :: dp = kind(1.d0)
         real(sp)           :: as, bs
         real(dp)           :: ad, bd, cd, dd
         character(len=100) :: f

         as = 1.E0
         bs = 1.E-8
         ad = dble(as)
         bd = dble(bs)
         cd = as + bs
         dd = ad + bd

         f = "(a,3x,f14.10)"
         write(10,f) "dble(as + bs) =        ", cd
         write(10,f) "dble(as) + dble(bs)  = ", dd
      end program roundoff

Output:

dble(as + bs) =             1.0000000000
dble(as) + dble(bs)  =      1.0000000100

Is there a way to avoid round-off in the result for "cd", short of defining the default real KIND (/real_size:64) for all variables?

Thanks, Pradeep.

mecej4 · ‎07-21-2016

The following is an explanation of how things worked in your program. On line-13, the expression on the right hand side is first evaluated. Then, the value is converted to the type of the variable on the left, if such conversion is necessary. The expression on the right is of type REAL. During the evaluation of the expression, the rounding error occurs. The type of the variable on the left has no influence on this evaluation.

With this rule in mind ("an expression has a type"), you can devise a means of avoiding the rounding error and accomplish your goals.

John_Campbell · ‎07-21-2016

You have to recognisee the precision rules of Fortran.

dble (as + bs) = dble (single+single) = dble (single), so the problem can only be overcome by:

dble(as) + dble(bs), or

as + dble(bs)

You may also like to consider the following example that shows the rules about single and double precision constants, which has been in force since Fortran 90 (or 77?). In earlier Fortrans, "ad" would have been given the value 0.1d0.

   real*8 ad, bd
   ad = 0.1
   bd = 0.1d0
   write (*,*) ad,bd
   end

It is the rule, but not always a helpful one.

TimP · ‎07-21-2016

The original f66 rules against expressions involving mixed precision were relaxed in all conceivable inconsistent ways in various compilers, including those which targeted 8087 and the like, where all expressions were promoted implicitly to double or double extended precision. ifort still carries options to change the treatment of constants for consistency with predecessor compilers, but Steve Lionel has recommended against those non-standard treatments. Current compilers generally adhere to Fortran standards, as you saw.

I think the idea of trying to promote single precision constants by options such as /real-size has annoyances as well (e.g. is 0.1e-8_real32 distinguished from 0.1e-8)?

Note that the x87 treatment (supported by ifort only for ia32 mode) may still vary with default setup between 32- and 64-bit Windows, and between Windows and linux.

Dhoorjaty_P_ · ‎07-21-2016

Thanks, Tim, John, mecej4.

Reading further on this, my understanding is that the standard places no requirements on compilers as to the accuracy of the answer they produce for "cd", except that the evaluation of the RHS expression can proceed independently of the context (i.e. the type of the LHS variable).

FWIW, gfortran on the same machine produces the identical values for cd and dd.

mecej4 · ‎07-21-2016

You may find it instructive to examine the machine instructions generated for line-13 of #1. Use the /Fa /c /Od options (/Od disables optimization; without it, for your example code the entire calculation can be performed at compile time, so there may be no machine instructions to see!).

...
        movss     xmm0, DWORD PTR [rbp]                         ;13.4
        movss     xmm1, DWORD PTR [4+rbp]                       ;13.4
        addss     xmm0, xmm1                                    ;13.12
        cvtss2sd  xmm0, xmm0                                    ;13.4

The instructions show the two 32-bit operands being loaded into xmm0 and xmm1, followed by a single-precision addition. Since the value in xmm1 is less in magnitude than that in xmm0 X ε_m, the addss leaves xmm0 unchanged. The subsequent cvtss2sd converts from REAL to DOUBLE PRECISION by adding zero bits at the low order end. After each of the instructions shown above, xmm0 just contains 1.0.

jimdempseyatthecove · ‎07-21-2016

The fact that the lhs is dp and the rhs is sp, in Fortran, has no bearing on if calculations should be promoted. The programmer may indeed require sp results on the rhs, then, subsequent to expression evaluation, to be converted to dp.

The Fortran standards requires (to some degree) the expressions are to be (equivalently) performed as told.

This said, the instruction set used, the older FPU (emulated 8087) as used on older compilers and older CPUs or SSE (AVX, AVX2, AVX512, ...) will produce different results for your example program. This is due to FPU implicit promotion of sp to dp (or tp) internally during expression evaluation. With the final result being (optionally) converted back to the lhs format. The SIMD instructions do not have a similar capability (with arguable exception with the FMA - Fused Multiply and Add).

You are welcome to compile targeting IA32 FPU (using compilers that still support this) at the expense of performance gained through vectorizaton.

Jim Dempsey