Solved: Definition a double precision without extending the precision of literals.

JAlexiou · ‎06-03-2019

I understand that a single precision literal assigned to a double precision is going to results in some undefined bits. I am curious if those undefined bits are set by the compiler or at runtime based some memory location. I made sure the "Extend precision of sing;e-precision constants" was turned off in the code below.

Example code (compiled without /fpconstant as x86 with 19.0.1.104)

    program Console1
    implicit none
    ! Variables
    double precision :: x = 1.00000004
    double precision :: y = 1.00000008
    double precision :: z = 1.00000016
    
    integer :: i,n

    print *,x,y,z
  
    do i=1,8
        x = x * 2
        y = y * 2
        z = z * 2
        
        print *, x, y, z
    end do

    ! /fpconstant = off
    !
    !x                       y                       z    
    !1.00000000000000        1.00000011920929        1.00000011920929
    !2.00000000000000        2.00000023841858        2.00000023841858
    !4.00000000000000        4.00000047683716        4.00000047683716
    !8.00000000000000        8.00000095367432        8.00000095367432
    !16.0000000000000        16.0000019073486        16.0000019073486
    !32.0000000000000        32.0000038146973        32.0000038146973
    !64.0000000000000        64.0000076293945        64.0000076293945
    !128.000000000000        128.000015258789        128.000015258789
    !256.000000000000        256.000030517578        256.000030517578
        
    end program Console1

what is interesting to me is that a) the values of y and z match (so what I type beyond the single float precision does not matter) and that b) each run produces the exact same results. So where does 1.00000011920929 (hex: 3ff0000020000002) come from, how is deviation from ideal of 1.000000040000000 (hex: 3ff000000abcc771) decided by the compiler?

FortranFan · ‎06-03-2019

As suggested in Quote #2, the Fortran standard leaves it up to the compiler on the assignment of a literal constant of the default floating-point (REAL) kind to a variable of a different kind that may correspond to higher precision. You can expend considerable effort trying to understand what the compiler does "under the hood" but, as you may know and so might other readers, the recommended approach per the Fortran standard is to work with defined KINDs but stay away from 'magic' numbers such as 4 or 8. You can use the underscore suffix with literal constants. Or another standard-supported option will be to use the intrinsic function REAL for a verbose but perhaps more explicit and consistent conversion in the assignment.

   use, intrinsic :: iso_fortran_env, only : real_kinds

   implicit none

   integer, parameter :: P6 = selected_real_kind( p=6 )
   integer, parameter :: P12 = selected_real_kind( p=12 )
   real(kind=P6), parameter :: Y_P6 = 1.00000008_p6
   real(kind=P6), parameter :: Z_P6 = 1.00000016_p6
   character(len=*), parameter :: fmtg = "(*(g0,1x))"
   character(len=*), parameter :: fmth = "(g0,1x,z0)"

   real(kind=P12) :: y, z

   print fmtg, "REAL KINDS supported by this compiler: ", real_kinds
   print fmtg, "KIND corresponding to selected_real_kind( p=6 ): ", P6
   print fmtg, "and the precision it offers: ", precision(0.0_P6)
   print fmtg, "KIND corresponding to selected_real_kind( p=12 ): ", P12
   print fmtg, "and the precision it offers: ", precision(0.0_P12)

   print fmtg, "Y_P6 = ", Y_P6
   print fmth, "Y_P6(hex) = ", Y_P6
   print fmtg, "Z_P6 = ", Z_P6
   print fmth, "Z_P6(hex) = ", Z_P6

   y = Y_P6
   z = Z_P6
   print fmtg, "y = ", y
   print fmth, "y(hex) = ", y
   print fmtg, "z = ", z
   print fmth, "z(hex) = ", z

   y = 1.0_p12                   !<-- Approach 1
   z = real( 1.0, kind=kind(z) ) !<-- Approach 2
   print fmtg, "y = ", y
   print fmth, "y(hex) = ", y
   print fmtg, "z = ", z
   print fmth, "z(hex) = ", z

   stop

end

Upon execution,

REAL KINDS supported by this compiler:  4 8 16
KIND corresponding to selected_real_kind( p=6 ):  4
and the precision it offers:  6
KIND corresponding to selected_real_kind( p=12 ):  8
and the precision it offers:  15
Y_P6 =  1.000000
Y_P6(hex) =  3F800001
Z_P6 =  1.000000
Z_P6(hex) =  3F800001
y =  1.000000119209290
y(hex) =  3FF0000020000000
z =  1.000000119209290
z(hex) =  3FF0000020000000
y =  1.000000000000000
y(hex) =  3FF0000000000000
z =  1.000000000000000
z(hex) =  3FF0000000000000

View solution in original post

Steve_Lionel · ‎06-03-2019

Many decimal fractions don't have exact binary floating values, including the ones in your program. The single precision decimal values are converted to the closest single binary value, and then that is assigned to the double. Once the decimal is converted to binary, nothing "remembers" the nice extra decimal digits you provided.

Some interesting reading:

http://software.intel.com/en-us/forums/topic/275071#comment-1548438

http://software.intel.com/en-us/forums/topic/275071#comment-1548441

http://software.intel.com/en-us/forums/topic/275071#comment-1548445

The recommended way of handling this is something like this:

integer, parameter :: DP = SELECTED_REAL_KIND (15)
real(DP) :: x = 1.00000004_DP
real(DP) :: y = 1.00000008_DP
real(DP) :: z = 1.00000016_DP

With that change, I get:

   1.00000004000000        1.00000008000000        1.00000016000000
   2.00000008000000        2.00000016000000        2.00000032000000
   4.00000016000000        4.00000032000000        4.00000064000000
   8.00000032000000        8.00000064000000        8.00000128000000
   16.0000006400000        16.0000012800000        16.0000025600000
   32.0000012800000        32.0000025600000        32.0000051200000
   64.0000025600000        64.0000051200000        64.0000102400000
   128.000005120000        128.000010240000        128.000020480000
   256.000010240000        256.000020480000        256.000040960000

IanH · ‎06-03-2019

I understand that a single precision literal assigned to a double precision is going to results in some undefined bits.

No - the bits are not "undefined".

First, the single precision initialiser will have the "best" possible single precision value (as considered by the compiler) to the decimal literal that appears in the source code.

Then the double precision variable will have the best possible double precision value to that single precision value. Given the nature of the underlying implementation of double and single precision values (double precision extends the precision and range of single precision), the double and single numerical values will match exactly.

The best single precision value for 1.00000008 is 1.00000011920928955078125 (or 0X8.00001P-3 using EX format), so that is the also the numerical value given to the double precision constant. The best single precision value for 1.00000008 is the same - hence your observations.

Similarly the compiler selects 1 as the best single precision value for 1.00000004 - the bit pattern you quote would be the best double precision value for the decimal literal - but the literal in the source specifies a single precision value. If you want a double precision value, use a double precision literal.

FortranFan · ‎06-03-2019

As suggested in Quote #2, the Fortran standard leaves it up to the compiler on the assignment of a literal constant of the default floating-point (REAL) kind to a variable of a different kind that may correspond to higher precision. You can expend considerable effort trying to understand what the compiler does "under the hood" but, as you may know and so might other readers, the recommended approach per the Fortran standard is to work with defined KINDs but stay away from 'magic' numbers such as 4 or 8. You can use the underscore suffix with literal constants. Or another standard-supported option will be to use the intrinsic function REAL for a verbose but perhaps more explicit and consistent conversion in the assignment.

   use, intrinsic :: iso_fortran_env, only : real_kinds

   implicit none

   integer, parameter :: P6 = selected_real_kind( p=6 )
   integer, parameter :: P12 = selected_real_kind( p=12 )
   real(kind=P6), parameter :: Y_P6 = 1.00000008_p6
   real(kind=P6), parameter :: Z_P6 = 1.00000016_p6
   character(len=*), parameter :: fmtg = "(*(g0,1x))"
   character(len=*), parameter :: fmth = "(g0,1x,z0)"

   real(kind=P12) :: y, z

   print fmtg, "REAL KINDS supported by this compiler: ", real_kinds
   print fmtg, "KIND corresponding to selected_real_kind( p=6 ): ", P6
   print fmtg, "and the precision it offers: ", precision(0.0_P6)
   print fmtg, "KIND corresponding to selected_real_kind( p=12 ): ", P12
   print fmtg, "and the precision it offers: ", precision(0.0_P12)

   print fmtg, "Y_P6 = ", Y_P6
   print fmth, "Y_P6(hex) = ", Y_P6
   print fmtg, "Z_P6 = ", Z_P6
   print fmth, "Z_P6(hex) = ", Z_P6

   y = Y_P6
   z = Z_P6
   print fmtg, "y = ", y
   print fmth, "y(hex) = ", y
   print fmtg, "z = ", z
   print fmth, "z(hex) = ", z

   y = 1.0_p12                   !<-- Approach 1
   z = real( 1.0, kind=kind(z) ) !<-- Approach 2
   print fmtg, "y = ", y
   print fmth, "y(hex) = ", y
   print fmtg, "z = ", z
   print fmth, "z(hex) = ", z

   stop

end

Upon execution,

REAL KINDS supported by this compiler:  4 8 16
KIND corresponding to selected_real_kind( p=6 ):  4
and the precision it offers:  6
KIND corresponding to selected_real_kind( p=12 ):  8
and the precision it offers:  15
Y_P6 =  1.000000
Y_P6(hex) =  3F800001
Z_P6 =  1.000000
Z_P6(hex) =  3F800001
y =  1.000000119209290
y(hex) =  3FF0000020000000
z =  1.000000119209290
z(hex) =  3FF0000020000000
y =  1.000000000000000
y(hex) =  3FF0000000000000
z =  1.000000000000000
z(hex) =  3FF0000000000000