Hello:
I was tracking down a map projection error the other day and came across a very undesirable behavior of DBLE(). Let me explain.
When given different 'seed' values, DBLE() throws in different 'junk' digits. What are the rules behind the generation of these digits? Why does DBLE() overshoot with one seed value (2.7183) but undershoot with another seed value (3.1416)?
Is there any way to force DBLE(x) to produce the same value as x entered as a REAL*8 constant? For example,
DBLE(3.1416) = 3.14160000000000
Steven
---
PROGRAM TEST
IMPLICIT NONE
REAL(4) :: xS
REAL(8) :: xD
! Case 1: 3.1416 (the converted value undershoots the REAL*8 constant)
xS = 3.1416
PRINT *, 'REAL*4 (xS) = ', xS
xD = 3.1416D0
PRINT *, 'REAL*8 entered as a constant (xD) = ', xD
PRINT *, 'REAL*8 converted from REAL*4 using DBLE(xS) = ', DBLE(xS)
PRINT *, 'Difference between converted REAL*8 and REAL*8 constant = ', DBLE(xS)-xD
! Case 2: 2.7183 (the converted value overshoots the REAL*8 constant)
xS = 2.7183
PRINT *, 'REAL*4 (xS) = ', xS
xD = 2.7183D0
PRINT *, 'REAL*8 entered as a constant (xD) = ', xD
PRINT *, 'REAL*8 converted from REAL*4 using DBLE(xS) = ', DBLE(xS)
PRINT *, 'Difference between converted REAL*8 and REAL*8 constant = ', DBLE(xS)-xD
END
---
REAL*4 (xS) = 3.141600
REAL*8 entered as a constant (xD) = 3.14160000000000
REAL*8 converted from REAL*4 using DBLE(xS) = 3.14159989356995
Difference between converted REAL*8 and REAL*8 constant = -1.064300536590679E-007
REAL*4 (xS) = 2.718300
REAL*8 entered as a constant (xD) = 2.71830000000000
REAL*8 converted from REAL*4 using DBLE(xS) = 2.71830010414124
Difference between converted REAL*8 and REAL*8 constant = 1.041412351909798E-007
4 Replies
There are no "junk digits". DBLE converts the single-precision argument to double precision by adding binary zeroes to the fraction field. It doesn't know what decimal number you used originally. If you want a constant interpreted as REAL(8), maintaining the additional precision, add _8 (or D0) at the end.
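To make the "adding binary zeroes" point concrete, here is a minimal sketch that prints the raw bit patterns. It assumes an IEEE 754 platform and a compiler that provides the `iso_fortran_env` kind constants; the program and variable names are illustrative only.

```fortran
program dble_is_exact
  use, intrinsic :: iso_fortran_env, only: real32, real64, int32, int64
  implicit none
  real(real32) :: xs
  real(real64) :: widened, literal

  xs      = 3.1416      ! rounded to the nearest single-precision value
  widened = dble(xs)    ! same value; the fraction is padded with binary zeros
  literal = 3.1416d0    ! rounded to the nearest double-precision value

  ! transfer() reinterprets the bits as an integer; Z format prints them in hex.
  print '(a, z8.8)',   'single bits       : ', transfer(xs, 0_int32)
  print '(a, z16.16)', 'DBLE(single) bits : ', transfer(widened, 0_int64)
  print '(a, z16.16)', 'double constant   : ', transfer(literal, 0_int64)

  ! Widening loses nothing: narrowing again gives back the original value.
  print *, 'round trip exact? ', real(widened, real32) == xs
end program dble_is_exact
```

On an IEEE 754 machine the first two lines should show 40490FF9 and 400921FF20000000: the 23 fraction bits of the single are carried over unchanged and the remaining 29 fraction bits are zero. The double constant differs only in those low bits, which is the whole of the "junk" seen in decimal output.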
Steve:
Thank you for your reply.
What I want is not for a constant to be interpreted as REAL(8). Rather, I want the output of DBLE() to produce a clean stream of zeros, something like:
DBLE(3.1416) = 3.141600000 ...
Is this possible?
Steven
Your expectations indicate an inadequate understanding of binary floating point representations.
> I want the output of DBLE() to produce a clean stream of zeros..
That would be in violation of the Fortran language rules. Even if you could find a compiler that showed that kind of behavior, that compiler would be kaput.
Your "clean stream of zeros" in decimal floating point becomes a "filthy" (?) stream of 0 and 1 bits in binary. The 32-bit and 64-bit IEEE representations of 3.1416 are Z'40490FF9' and Z'400921FF2E48E8A7', neither of which contains a long stream of trailing zeros.
What is more, neither is an exact representation of 3.1416, because such a representation is impossible for this particular number.
If you wait a few years in the case of Fortran, or switch to another language, you may find Decimal Floating Point to be more widely implemented and closer to what you ask for. See the IEEE Standard for Floating-Point Arithmetic (IEEE 754).
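DBLE() itself cannot do this, but if a single-precision value is known to have come from a short decimal number, one possible workaround is to go through decimal text: write the value with a limited number of significant digits and read the text back as REAL(8). The sketch below is only that, a sketch. The helper name `widen_via_decimal` is hypothetical, and the choice of 6 significant digits is an assumption that holds only when the original decimal value had no more than 6 significant digits.

```fortran
program decimal_round_trip
  implicit none
  real(4) :: xs
  real(8) :: xd

  xs = 3.1416
  xd = widen_via_decimal(xs)

  print *, 'DBLE(xs)              = ', dble(xs)
  print *, 'widen_via_decimal(xs) = ', xd
  print *, 'equals 3.1416D0?        ', xd == 3.1416d0

contains

  ! Hypothetical helper: round x to 6 significant decimal digits as text,
  ! then parse that text as a double-precision number.
  function widen_via_decimal(x) result(d)
    real(4), intent(in) :: x
    real(8) :: d
    character(len=16) :: buf

    write (buf, '(es13.5)') x   ! e.g. "  3.14160E+00"
    read  (buf, *) d            ! nearest REAL(8) to that decimal string
  end function widen_via_decimal

end program decimal_round_trip
```

With a compiler whose decimal conversions are correctly rounded, the last line should report T. Note that this changes the value: the result is the double nearest to the decimal string, not the value that was actually stored in the REAL(4) variable, so it is only appropriate when the decimal form is the real source of truth.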
This was discussed to death late last year. See http://software.intel.com/en-us/forums/showthread.php?t=101169. As mecej4 says, there is no single- or double-precision binary floating-point number which is equal to 3.141600000...
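The reason no binary floating-point number equals 3.1416 can be worked out in a few lines. In this sketch, $m$ and $e$ are simply the integer significand and exponent of a generic finite binary floating-point value; they are notation introduced here, not something from the thread.

```latex
\[
  3.1416 \;=\; \frac{31416}{10000} \;=\; \frac{3927}{1250},
  \qquad 1250 = 2 \cdot 5^{4}, \qquad 3927 = 3 \cdot 7 \cdot 11 \cdot 17 .
\]
% Every finite binary floating-point value has the form
\[
  x = m \cdot 2^{e}, \qquad m, e \in \mathbb{Z},
\]
% so, written in lowest terms, its denominator is a power of 2.
% The reduced denominator of 3.1416 contains the factor 5^4, hence
\[
  \frac{3927}{1250} \neq m \cdot 2^{e} \quad \text{for all integers } m, e .
\]
```

The same argument applies to 2.7183 = 27183/10000, whose reduced denominator also contains a factor of 5. Both single and double precision therefore store the nearest representable neighbour, and the two neighbours differ, which is why DBLE(xS) and the REAL(8) constant disagree.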
