schulzey

New Contributor I

04-28-2021
05:44 PM

391 Views

Double to Single

I have read the many posts that cover the topic of getting rid of the extra digits when converting a single to a double, and I think I understand the issues, but at the risk of releasing a firestorm of "this topic has already been done to death" replies can I just ask this simple question?

If I have Single = 1.4726, when I set Double = Single, I get Double = 1.472599983215332. I understand from previous posts that the extra digits are because the double precision bit pattern can't exactly represent 1.472600000000000 and 1.472599983215332 is the closest approximation. What I don't understand is if I set the Double directly to 1.4726d0 then I get exactly 1.472600000000000, which sort of contradicts the previous sentence. Can someone please explain this?

DavidWhite

Black Belt

04-28-2021
05:53 PM

382 Views

schulzey

New Contributor I

04-28-2021
06:10 PM

375 Views

FortranFan

Honored Contributor I

04-28-2021
07:22 PM

363 Views

First, please note yours is a general Fortran inquiry for which you may want to also consider the Fortran Discourse for wider Fortran community feedback: https://fortran-lang.discourse.group/

Secondly, can you please share where the value '1.4726' comes from? Is it a calculation/simulation/experimental result stored in a file or database that is read in? If so, please see the simple-minded code below that "mimics" such an action:

```
integer, parameter :: SP = selected_real_kind( p=6 )
integer, parameter :: DP = selected_real_kind( p=12 )
character(len=*), parameter :: fmtg = "(*(g0))"
character(len=*), parameter :: fmth = "(g0,z0)"
character(len=:), allocatable :: val
real(SP) :: val_sp
real(DP) :: val_dp
val = "1.4726" !<-- Assume this represents a file read or database fetch
read( val, fmt=* ) val_sp
print fmtg, "value: ", val_sp
print fmth, "value (single, hex): ", val_sp
read( val, fmt=* ) val_dp
print fmtg, "value: ", val_dp
print fmth, "value (double, hex): ", val_dp
end
```

Compiling and running this program with the Intel Fortran compiler gives:

```
C:\Temp>ifort /standard-semantics p.f90
Intel(R) Fortran Intel(R) 64 Compiler Classic for applications running on Intel(R) 64, Version 2021.2.0 Build 20210228_000000
Copyright (C) 1985-2021 Intel Corporation. All rights reserved.
Microsoft (R) Incremental Linker Version 14.28.29337.0
Copyright (C) Microsoft Corporation. All rights reserved.
-out:p.exe
-subsystem:console
p.obj
C:\Temp>p.exe
value: 1.472600
value (single, hex): 3FBC7E28
value: 1.472600000000000
value (double, hex): 3FF78FC504816F00
C:\Temp>
```

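where you can see the difference in bit representation even though the "apparent" value seems to be the same. You can then see that the type conversion you seek cannot reproduce the double-precision bit pattern: the extra low-order bits of the double are simply not present in the single.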

Thus you are better off addressing the definition of your variables of different precision - "single" vs "double" - at the level of your "data" and thereby avoid type conversions.

schulzey

New Contributor I

04-28-2021
08:03 PM

357 Views

FortranFan

Honored Contributor I

04-28-2021
08:14 PM

352 Views

DavidWhite

Black Belt

04-28-2021
08:15 PM

352 Views

Ulrich_M_

New Contributor I

04-29-2021
05:19 AM

277 Views

*does* set the extra significant digits to zero--it's just the base 2 (or equivalently base 16 = hexadecimal) digits that are set to zero, and not the base 10 ones. Setting the base 2 digits to zero in the conversion, rather than the base 10 ones, is the best thing to do, since numbers are internally represented in a base 2 system, so one obtains a (slightly) better approximation this way.

The only reason to favor base 10 arises *if you know* that the true number had all extra digits zero in base 10. I cannot think of many applications where this would be the case.
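
To make the "base 2 digits are set to zero" point visible, here is a minimal sketch that prints the raw bit patterns in hexadecimal. It is not from the original posts: the program and variable names are made up for illustration, and it assumes the compiler accepts the `z0` edit descriptor for reals (as the ifort example earlier in this thread does) and that the two kinds map to IEEE 754 single and double.

```fortran
program promote_bits
   implicit none
   integer, parameter :: sp = selected_real_kind( p=6 )
   integer, parameter :: dp = selected_real_kind( p=12 )
   real(sp) :: s
   real(dp) :: d_from_s, d_literal

   s = 1.4726_sp
   d_from_s  = real( s, kind=dp )   ! widening conversion; the value itself is unchanged
   d_literal = 1.4726_dp            ! nearest double to the decimal 1.4726

   ! z0 prints the stored bit pattern in hexadecimal
   print "(a,z0)", "single             : ", s
   print "(a,z0)", "double from single : ", d_from_s
   print "(a,z0)", "double literal     : ", d_literal
end program promote_bits
```

With the bit patterns already shown above (3FBC7E28 and 3FF78FC504816F00), the converted value would be expected to come out as 3FF78FC500000000 on an IEEE 754 system: the same leading hex digits as the double literal, followed by zeros where the single had no information.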

schulzey

New Contributor I

04-28-2021
08:30 PM

346 Views

DavidWhite

Black Belt

04-28-2021
08:33 PM

343 Views

That's changing the story now. Not sure you said it was already a binary file.

schulzey

New Contributor I

04-28-2021
08:42 PM

338 Views

Single=1.4726

write(STR,*) Single

read(STR,*) Double

It seems to work. Can you see any issues with this approach?
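
For anyone who wants to try this, a self-contained sketch of the round trip through text might look like the following. The program name, the 32-character buffer and the variable names are assumptions for illustration, not part of the original snippet. Note that list-directed output lets the compiler choose how many digits to write, so the result can differ between compilers.

```fortran
program roundtrip_via_text
   implicit none
   integer, parameter :: sp = selected_real_kind( p=6 )
   integer, parameter :: dp = selected_real_kind( p=12 )
   character(len=32) :: str
   real(sp) :: s
   real(dp) :: d_direct, d_text

   s = 1.4726_sp
   d_direct = real( s, kind=dp )   ! binary widening, value preserved exactly

   write( str, * ) s               ! list-directed: digit count is chosen by the compiler
   read( str, * ) d_text           ! reparse those decimal digits as a double

   print "(a,es23.16)", "direct conversion: ", d_direct
   print "(a,es23.16)", "via text         : ", d_text
   print "(a,l1)", "text route narrows back to the same single: ", &
         real( d_text, kind=sp ) == s
end program roundtrip_via_text
```

The text route gives the double nearest to whatever decimal string was written. That matches the "intended" number only if the original data really had that few decimal digits.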

Arjen_Markus

Valued Contributor III

04-28-2021
11:53 PM

314 Views

Yes, the point is that you may *think* the number is 1.4726, but the bit pattern will say otherwise. So, when converted to a decimal representation with only 4 decimals, it may look exactly like what you hope it is, but written out with 5 decimals, it could easily turn into 1.47259 instead of 1.47260. The point is, whatever the bit pattern in the file, the number it represents is *at most* the single-precision floating-point number closest to 1.47260000...

The only way out is to use a decimal representation of the number instead of a binary one, because a decimal representation (such as BCD) works the way we humans generally do arithmetic: in a decimal system. The situation is no different in binary representation than in decimal representation: in neither case can you exactly represent 1/3 with a finite number of "decimals". It is unfortunate, however, that the set of rational numbers that can be represented exactly in finite precision is much smaller in binary representation than it is in decimal representation.

mecej4

Black Belt

04-29-2021
01:05 AM

299 Views

The OP has a conceptual problem, not recognizing that reals and doubles are represented and manipulated in a base 2 system (IEEE standard).

The following program may help overcome the mental block.

```
program hello
real*8 a,b
a = 0.1
b = 10*a - 1.0
print *,b
end program Hello
```

The printed answer need not be zero, depending on the computer and compiler used. I tried Gfortran on a cloud service, and it gave

```
$gfortran -std=gnu *.f95 -o main
$main
1.4901161193847656E-008
```

Some hand-held calculators implemented decimal arithmetic. There have been decimal arithmetic packages proposed for Fortran. The following can be used to test whether a processor (calculator or computer) is decimal or not.

```
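program dectest
implicit none
real*8 a,b
a = 4d0
b = 3d0
! Result is never exactly zero: roughly -2.2E-16 in IEEE binary double, a power of 10 in decimal arithmetic
print *,3*(a/b-1)-1
end program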
```

On a calculator, "(4/3-1)*3-1" or "4 Enter 3 / 1 - 3 * 1 -"

cryptogram

Beginner

04-29-2021
04:50 AM

286 Views

In fact, the old Microsoft 3.31 Fortran compiler from many years ago did all of its math using subroutine libraries. You could choose from

1) Normal math library, assumed that math coprocessor installed.

2) Normal math library, would work with coprocessor if available, but would still work if not.

3) Decimal math library.

avinashs

New Contributor I

04-29-2021
12:26 PM

256 Views

I share a similar concern as @schulzey regarding Fortran code. To guarantee an exact representation of a number in double precision requires the programmer to append d0 or _dp if dp has been defined to be real(kind = 8). In other languages such as C/C++ or VB this is not required, where the default is double precision to begin with.

One example from chemical engineering applications is the case of atomic weights. The accepted atomic weight of C in our calculations is 12.01115. However, I have to assign it in Fortran as

AWC = 12.01115d0

to guarantee that it is represented to the same significant figures as above whereas in C++ I can simply code AWC = 12.01115

Similarly, if AWC is read from an ASCII file automatically generated by Excel where it is stored as 12.0115, then extra digits may appear after 8 significant digits as numerical noise ex. 12.0111500003254. This noise later propagates through thousands of calculations in large codes.

I have been recently using the IVF option /real-size:64 and that seems to not require the d0 or _dp. Further, when writing to files, I use write(*,'(g0)') AWC to ensure exact representation.

Steve_Lionel

Black Belt Retired Employee

04-29-2021
01:22 PM

244 Views

@avinashs wrote:

I share a similar concern as @schulzey regarding Fortran code. To guarantee an exact representation of a number in double precision requires the programmer to append d0 or _dp if dp has been defined to be real(kind = 8). In other languages such as C/C++ or VB this is not required, where the default is double precision to begin with.

This is not true - even specifying the kind doesn't "guarantee an exact representation". Most decimal fractions are not exactly representable in binary floating point. You're only kidding yourself if you believe that double precision solves everything.
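
A short sketch of this, assuming IEEE 754 double precision (the program and variable names are made up for illustration): even with the kind suffix on every literal, simple decimal fractions are stored only approximately.

```fortran
program decimal_fractions
   implicit none
   integer, parameter :: dp = selected_real_kind( p=12 )
   real(dp) :: a, b, c

   a = 0.1_dp
   b = 0.2_dp
   c = 0.3_dp

   ! Print more digits than a double can reliably carry to expose the rounding
   print "(a,f22.19)", "0.1_dp = ", a
   print "(a,f22.19)", "0.3_dp = ", c
   ! Each literal is rounded separately, so the sum need not equal 0.3_dp
   print "(a,l1)", "0.1 + 0.2 == 0.3 ? ", ( a + b ) == c
end program decimal_fractions
```

On IEEE 754 hardware the comparison is expected to print F, because 0.1, 0.2 and 0.3 are each rounded to a nearby binary fraction and the rounded sum lands on a different double than the rounded 0.3.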

FortranFan

Honored Contributor I

04-29-2021
01:58 PM

238 Views

```
@avinashs wrote:
.. if AWC is read from an ASCII file automatically generated by Excel where it is stored as 12.0115, then extra digits may appear after 8 significant digits as numerical noise ex. 12.0111500003254. This noise later propagates through thousands of calculations in large codes. ..
```

What you posted does not appear to be accurate. Can you please show what you mean while keeping the following in mind?

```
integer, parameter :: SP = selected_real_kind( p=6 )
integer, parameter :: DP = selected_real_kind( p=12 )
character(len=*), parameter :: fmtg = "(*(g0))"
character(len=*), parameter :: fmth = "(g0,z0)"
integer :: lun
real(SP) :: val_sp
real(DP) :: val_dp
open( newunit=lun, file="atomic_mass.txt" )
read( lun, fmt=* ) val_sp
print fmtg, "value: ", val_sp
print fmth, "value (single, hex): ", val_sp
rewind( lun )
read( lun, fmt=* ) val_dp
print fmtg, "value: ", val_dp
print fmth, "value (double, hex): ", val_dp
end
```

```
C:\Temp>type atomic_mass.txt
12.01115
C:\Temp>ifort /standard-semantics p.f90
Intel(R) Fortran Intel(R) 64 Compiler Classic for applications running on Intel(R) 64, Version 2021.2.0 Build 20210228_000000
Copyright (C) 1985-2021 Intel Corporation. All rights reserved.
Microsoft (R) Incremental Linker Version 14.28.29337.0
Copyright (C) Microsoft Corporation. All rights reserved.
-out:p.exe
-subsystem:console
p.obj
C:\Temp>p.exe
value: 12.01115
value (single, hex): 41402DAC
value: 12.01115000000000
value (double, hex): 402805B573EAB368
C:\Temp>
```

mecej4

Black Belt

04-30-2021
03:51 AM

212 Views

Avinash, your statement "To guarantee an exact representation of a number in double precision requires the programmer to append d0" hints at the existence of a misunderstanding of binary floating point arithmetic. It may "require", but it is by no means sufficient. Many simple decimal numbers such as (1/10) = 0.1 do not have an exact representation in binary, just as 1/3 does not have an exact and short representation in decimal. Please try the following programs.

```
program tenth
real a,b
a = 0.1
b = 0.01
print *,a*a-b
end
```

and the double precision version

```
program tenth
real*8 a,b
a = 0.1d0
b = 0.01d0
print *,a*a-b
end
```

jimdempseyatthecove

Black Belt

04-30-2021
05:00 AM

201 Views

>>If I have Single = 1.4726, when I set Double = Single, I get Double = 1.472599983215332.

FWIW The printout (or debugger view) of the SP variable is a decimal approximation of the (binary) internally stored variable. IOW the value in SP is approximately 1.4726

When copied from SP to DP, the 8-bit SP exponent is converted (re-biased) to the 11-bit DP exponent and the 23-bit SP mantissa (holding the approximate value mantissa of 1.4726) is copied to the 52-bit DP mantissa (0-filled in remainder).

The binary values are exactly the same approximation of 1.4726 as was held in the SP variable,... however when printed, you now see the difference in the approximation as was held in the SP variable.

Fixing the DP value to 1.4726d0 = (~1.472599983215332 + ~0.000000016784668) ...

Then should you copy this (fixed) DP value back to an SP variable, the "fixed" value (~0.000000016784668) would get truncated (rounded off) and lost. IOW the ~0.000000016784668 is *the approximation of the error in the SP variable* and not representative of an error in the DP copy of the SP variable.
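
The error term described above can be checked with a small sketch like the one below. The program and variable names are illustrative only, and the kinds are assumed to map to IEEE 754 single and double.

```fortran
program sp_error_term
   implicit none
   integer, parameter :: sp = selected_real_kind( p=6 )
   integer, parameter :: dp = selected_real_kind( p=12 )
   real(sp) :: s
   real(dp) :: d_copy, d_fixed

   s = 1.4726_sp
   d_copy  = real( s, kind=dp )   ! exact copy of the SP approximation
   d_fixed = 1.4726_dp            ! the "fixed" DP value

   ! The difference is the representation error that was already present in s
   print "(a,es12.4)", "d_fixed - d_copy = ", d_fixed - d_copy
   ! Narrowing the fixed value back to SP rounds the correction away again
   print "(a,l1)", "real(d_fixed, sp) == s ? ", real( d_fixed, kind=sp ) == s
end program sp_error_term
```

The first line should show a difference of about 1.68E-08, matching the ~0.000000016784668 quoted above, and the second is expected to print T.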

This is a common mental block that (new) programmers experience in that they assume the value printed is an exact representation of the value of the stored variable (IOW an assumption that all variables have infinite decimal precision whereas the variables have finite binary precision). mecej4's 4/29 post was an alternate way of illustrating that the difference in fractional precision between SP and DP can be visually significant when the fraction cannot be exactly represented in the SP binary mantissa.

If you want exact representation to 6 decimal places (e.g. units of microns), then program in units of microns, not meters. But keep in mind that any generated fractional units could result in approximations that you would then have to determine how to handle.
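
One possible reading of this advice, sketched with integer counts of microns (the use of integers, the kind choice and all names are assumptions for illustration, not something stated above):

```fortran
program integer_microns
   implicit none
   integer, parameter :: i8 = selected_int_kind( 15 )
   integer(i8), parameter :: microns_per_meter = 1000000_i8
   integer(i8) :: length_um, total_um
   integer :: i

   length_um = 1472600_i8          ! 1.4726 m held exactly as a count of microns

   total_um = 0_i8
   do i = 1, 10
      total_um = total_um + length_um   ! integer addition, no rounding
   end do

   print "(a,i0,a)", "total = ", total_um, " microns"
   ! Convert to meters only for display: whole part, then six zero-padded digits
   print "(i0,a,i6.6,a)", total_um / microns_per_meter, ".", &
         mod( total_um, microns_per_meter ), " m"
end program integer_microns
```

Sums and differences stay exact this way. Division and anything else that produces fractions of a micron still has to be rounded by an explicit rule.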

Jim Dempsey

schulzey

New Contributor I

05-02-2021
11:02 PM

137 Views
