Story goes something like this...
A part of the code loads and stores all numbers as DOUBLE. It reads the number 201 and stores it as 0x40691FFFFFFFFFFF. Somewhere down the code path, this needs to be converted to INTEGER, as in J=DUM, where J is an integer and DUM is the double. The disassembly looks like this...
0000000180006B40 fld qword ptr [DUM (181577670h)]
0000000180006B46 fstp qword ptr [rbp+0AC60h]
0000000180006B4C cvttsd2si rax,mmword ptr [rbp+0AC60h]
0000000180006B55 mov qword ptr [J (181577668h)],rax
Unfortunately, after this is executed, J is 200, not 201, because cvttsd2si truncates. DUM should have been 0x4069200000000000.
What a bummer: now I get an array indexing error because J is 200 instead of 201.
So how do I make sure that all doubles are converted to the "nearest" integer rather than the "truncated" integer? Or is this even something I can force after the original 0x40691FFFFFFFFFFF was read in? The compiler version is 11.0.061, in case that makes a difference.
Or is there a compiler flag that would automatically insert NINT()s instead of INT()s?
However - the number 201 is exactly representable as a double. Is it REALLY 201 you're reading in or something else?
0x40691FFFFFFFFFFF is not 201.0; it is the largest double below 201.0, as close as you can get to 201 without being 201. Round before converting, e.g. for a non-negative index:
index = dint(rindex + 0.5_8)
where rindex approximates the correct index to within a few least-significant bits of the double-precision format.
When reading an integer number from a text file, read it into an INTEGER, then store it into the double-precision variable. As long as you do not divide it or combine it with a non-integer REAL*8 value, the stored number will remain exact.
Jim Dempsey
Of course, the thing just happens to be
(2+1/(10.D0**2.0))*(10.D0**2.0)
And the problem magically goes away when I re-write it as
2*(10.D0**2.0)+1*(10.D0**(2.0-2.0))
(The problem is that 10^(-2) has no finite binary expansion, so 1/(10.D0**2.0) is already rounded before it is scaled back up.)
Thanks Steve!
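The two forms of the expression can be compared with a short C sketch. It assumes IEEE-754 double arithmetic with no extended-precision intermediates (for example x86-64 with SSE2), and it writes 10.D0**2.0 as the constant 100.0. The names SCALE, fraction_then_scale and scale_each_digit are invented for the example.

```c
/* Stands in for 10.D0**2.0; 100 is exactly representable as a double. */
static const double SCALE = 100.0;

/* (2 + 1/10**2) * 10**2: the 1/100 term is rounded, the sum 2.01 is
   rounded again, and the final product lands just below 201. */
static double fraction_then_scale(void)
{
    double mantissa = 2.0 + 1.0 / SCALE;
    return mantissa * SCALE;
}

/* 2*10**2 + 1*10**(2-2): every intermediate value is an exact integer. */
static double scale_each_digit(void)
{
    return 2.0 * SCALE + 1.0 * 1.0;
}
```

Under the stated assumption the first function returns a value that truncates to 200, while the second returns exactly 201.0.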
That could work as well... I'll go through my code to see how easy it would be to determine ahead of time whether I'm reading an integer or a real. Thanks Jim!
You might find this better:
Read from the file into a text variable.
Scan the text variable to see whether it holds an integer.
When it is an integer, use an internal read from the text variable into an integer variable, then convert to a REAL variable.
When it is not an integer, use an internal read directly into a REAL variable.
The test for integer has to be smart enough to handle 123.00D+03 or -345.00 or 123.00D-1, etc.
(and maybe even treat 123.4D+1 as an integer).
Also, reading into text first lets you clean up the text for the case where the input was prepared with a text editor that inserted a TAB into the file. READ tends to choke on this (and a TAB is hard to see when you are trying to figure out the problem).
Jim Dempsey
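The outline above could look roughly like this in C. It is a sketch under assumptions rather than the routine from the thread: parse_field is a hypothetical name, strtol and strtod stand in for Fortran internal reads, and the integer test is deliberately simpler than the one asked for (a field such as 123.00D+03 goes down the real path, although it still converts exactly).

```c
#include <stdlib.h>
#include <string.h>

/* Parse one numeric field: returns 1 and stores the value in *out on
   success, 0 if the field is empty, too long, or not a number. */
static int parse_field(const char *text, double *out)
{
    char buf[64];
    size_t len = strlen(text);
    if (len == 0 || len >= sizeof buf) return 0;

    /* Clean-up pass: TAB becomes a space, and a Fortran D exponent
       becomes E so that strtod can read it. Copies the final NUL too. */
    for (size_t i = 0; i <= len; i++) {
        char c = text[i];
        if (c == '\t') c = ' ';
        else if (c == 'D' || c == 'd') c = 'E';
        buf[i] = c;
    }

    /* Integer path: accept only if nothing but blanks is left over. */
    char *end;
    long as_int = strtol(buf, &end, 10);
    if (end != buf) {
        while (*end == ' ') end++;
        if (*end == '\0') { *out = (double)as_int; return 1; }
    }

    /* Real path: same rule, the whole field must be consumed. */
    double as_real = strtod(buf, &end);
    if (end == buf) return 0;
    while (*end == ' ') end++;
    if (*end != '\0') return 0;
    *out = as_real;
    return 1;
}
```

A field such as a TAB followed by 201 comes back as exactly 201.0 through the integer path, and 1.5D+1 comes back as 15.0 through the real path. Overflow checking is left out to keep the sketch short.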
Tim, excellent catch. I'll be sure to fix that one.
Jim, if I test for integer/not integer, then I already have to keep most of my code, so why not use the existing one-line calc that combines the sign, real, decimal, and exponent parts? At least I know what to expect with this thing. I think I'll dig deeper into internal reads and TABs; one of the first things we do when debugging a customer's input is to yell at them if they used TABs, because we told them not to.
Hi.
Conversion from float or double to integer with rounding can be done with the functions lrintf and lrint. Unfortunately, these functions are missing in some compilers due to controversies over the C99 standard.
An implementation of lrint is given below. The function rounds a floating-point number to the nearest integer; if two integers are equally near, the even integer is returned. There is no check for overflow. The function can be tried on 32-bit Linux and 32-bit Windows -
static inline int lrint (double const x) { // round to nearest integer
    int n;
#if defined(__unix__) || defined(__GNUC__)
    // 32-bit Linux, GNU/AT&T syntax
    __asm ("fldl %1 \n fistpl %0 " : "=m"(n) : "m"(x) : "memory");
#else
    // 32-bit Windows, Intel/MASM syntax
    __asm fld qword ptr x;
    __asm fistp dword ptr n;
#endif
    return n;
}
-----
The following example shows how to use the lrint function -
double d = 1.6;
int a, b;
a = (int) d; // Truncation is slow, value of a will be 1
b = lrint(d); // Rounding is fast, value of b will be 2
-----
In 64-bit mode, or whenever the SSE2 instruction set is enabled, the missing functions can be implemented as follows -
#include <emmintrin.h>
static inline int lrintf (float const x) {
    return _mm_cvtss_si32(_mm_load_ss(&x));
}
static inline int lrint (double const x) {
    return _mm_cvtsd_si32(_mm_load_sd(&x));
}
-----
In C and C++, conversions from floating-point numbers to integers truncate towards zero rather than round. In x87 code without SSE, truncation takes many more cycles than rounding, because the rounding mode has to be changed and then restored, so rounding is the more efficient choice there. The SSE2-based versions above are faster.
~BR
While you may tell the customer not to use tabs, tabs may come in indirectly and outside the customer's control. When data comes out of an Excel spreadsheet it may or may not have tabs. Also, the Excel floating-point format may follow specifications that differ from the Fortran input specifications.
12345.67D00
may be a problem in Fortran (Fortran wants E regardless of whether you are reading into REAL*4 or REAL*8).
I am suggesting that as long as you are digging around in that section of code, you might as well fix it now. That is, unless you want additional billable hours later.
The other thing to consider: provided the user data is no longer on punched cards, it is likely to be delimited rather than laid out in fixed fields. Write your input routine to tokenize the input so that it is not sensitive to field width or separator (comma, tab, space, other?).
Jim
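A separator-insensitive tokenizer can be sketched in a few lines of C. The function name split_fields and the separator set (comma, tab, space, line endings) are assumptions for the example, not something specified in the thread.

```c
#include <string.h>

/* Split a line in place on commas, tabs, spaces and line endings.
   Stores pointers to the tokens in tokens[] and returns how many
   were found, up to max_tokens. The input buffer is modified. */
static size_t split_fields(char *line, char *tokens[], size_t max_tokens)
{
    const char *separators = ", \t\r\n";
    size_t count = 0;
    for (char *tok = strtok(line, separators);
         tok != NULL && count < max_tokens;
         tok = strtok(NULL, separators)) {
        tokens[count++] = tok;
    }
    return count;
}
```

Each token can then be converted on its own, so field width no longer matters. One design consequence to be aware of: strtok collapses runs of separators, so an empty field such as the middle one in 1,,3 is silently dropped. If empty fields carry meaning, the line has to be scanned character by character instead.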
