When compiled with -Darray (making x an array in main), the first number on lines 5, 7, and 9 of the output has more precision than it should (the double-precision x_dp should have the value of the single-precision dummy variable x_sp padded with zeroes after the promotion).
So is ifort keeping the result from random_number() in the extended-precision floating-point register and allowing the first x_dp=real(x_sp,dp) statement to pick up that value? Or is the subroutine add() being in-lined? I'm guessing in-lining because x_dp has the expected value promoted from x_sp if the main program and subroutine are compiled into two separate object files.
If the main program and subroutine are kept in the same file, but compiled without -Darray so that x is a scalar in main, then x_dp has the expected value of the promoted x_sp in subroutine add() when x_dp is printed the first time. Why does using x(i) rather than a plain scalar x result in different values being picked up by x_dp in the subroutine?
I am using ifort 10.1.015 on an IA-32 machine running RHEL 5.2 beta.
! compile with ifort -fpp -Darray -O2 -g test.f90
subroutine add(x_sp,s)
implicit none
integer,parameter::sp=selected_real_kind(6,37)
integer,parameter::dp=selected_real_kind(15,307)
real(sp),intent(in)::x_sp
real(dp),intent(inout)::s
real(dp)::x_dp
! type conversion
x_dp=real(x_sp,dp)
print"(1p,2e40.32)",x_dp,x_sp
! do type conversion again
x_dp=real(x_sp,dp)
print"(1p,2e40.32)",x_dp,x_sp
! next line doesn't really matter
s=s+x_dp
return
end subroutine add
program main
implicit none
integer,parameter::sp=selected_real_kind(6,37)
integer,parameter::dp=selected_real_kind(15,307)
integer,parameter::n=5
#ifdef array
real(sp),dimension(1:n)::x
#else
real(sp)::x
#endif
real(dp)::a=0.0_dp
integer::i
print*
do i=1,n,1
#ifdef array
call random_number(x(i))
call add(x(i),a)
#else
call random_number(x)
call add(x,a)
#endif
print*
end do
stop
end program main
You imply that you're starting with a false premise.
Yes, the binary double precision value must be the same as the single precision value from which it is promoted, with binary zeroes appended. You seem to think the same must be true of the converted decimal values displayed by print. That would be so only when the conversion from binary to decimal is exact, not for random values.
If the run-time library follows the IEEE standard, for the single precision display value it must derive the first 9 digits from the data, after which it may fill with 0 digits, or it may take additional digits from implicitly promoted precision. The first 17 digits of the double precision displayed value must be derived from the data, not by following the logic of your code and surmising that you intended them to match your single precision display.
When you have the compiler in-line your subroutine, it is more likely to replace all single precision values with double when you have a single scalar value, and more likely still when you don't ask for SSE code. You could suppress in-lining with -fno-inline-functions, or you could set one of the usual SSE options, such as -xW, so as to use different register formats for single and double.
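To make the binary-versus-decimal distinction concrete, here is a minimal sketch (not from the thread) that promotes a single-precision value and compares the result against the exact value it should hold. It assumes real(sp) is IEEE binary32, so that 0.1_sp is stored as 13421773 x 2**(-27); the program name and the choice of 0.1 are illustrative only.

```fortran
program promote_exact
  implicit none
  integer, parameter :: sp = selected_real_kind(6, 37)
  integer, parameter :: dp = selected_real_kind(15, 307)
  real(sp) :: x_sp
  real(dp) :: x_dp, exact

  x_sp = 0.1_sp            ! 0.1 is not exactly representable in binary
  x_dp = real(x_sp, dp)    ! promotion should only append zero bits

  ! Assumption: IEEE binary32 stores 0.1 as 13421773 * 2**(-27).
  ! That product is exactly representable in double precision.
  exact = 13421773.0_dp * 2.0_dp**(-27)

  ! 9 significant digits identify the single value,
  ! 17 significant digits identify the double value.
  print '(a,1p,e16.8)',  'x_sp  (9 digits)  = ', x_sp
  print '(a,1p,e24.16)', 'x_dp  (17 digits) = ', x_dp
  print '(a,1p,e24.16)', '0.1_dp            = ', 0.1_dp

  ! The promoted value equals the exact binary32 value, not 0.1_dp.
  print '(a,l1)', 'x_dp == exact binary32 value? ', x_dp == exact
  print '(a,l1)', 'x_dp == 0.1_dp?               ', x_dp == 0.1_dp
end program promote_exact
```

With a promotion that only pads zero bits, the first comparison should come out true and the second false. Digits printed beyond the 9th for x_sp are up to the run-time library, which is the point made above.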
My problem with the example program posted is that, with sufficient digits printed, I see that the promotion of a single-precision value to a double-precision value, as coded, results in unexpected changes in the original decimal digits as well as additional nonzero digits being appended to the decimal representation.
Let's take a look at the binary form of the numbers in question.
For example, the third time the subroutine in my original example is called, I expect the actual argument passed to the subroutine to have the decimal value
3.525161445140838623046875e-1
which is the exact representation of the binary pattern
0:01111101:*01101000111110011111111
The colon : separates the pattern into the sign, exponent, and mantissa. The asterisk * represents the leading one-bit that is not explicitly stored for the normalized mantissa.
After the promotion, the decimal representation should not have changed, but it did.
3.525161445140838623046875e-1 (single-precision, as above)
3.52516147308051586151123046875e-1 (double-precision)
Here are the binary patterns of these numbers:
0:0---1111101:*01101000111110011111111-----------------------------
0:01111111101:*0110100011111001111111100011000000000000000000000000
The dashes - represent the binary digits that are not present in the single-precision value but are added when the promotion is made.
The two one-bits in the padded part of the double-precision mantissa (highlighted in red in the original post) should be zero. Only zeroes are supposed to be padded to the end of the single-precision mantissa; where do these ones come from?
Here are the binary patterns for the fourth and fifth calls to the subroutine:
0:0---1111110:*01010101011101011101000-----------------------------
0:01111111110:*0101010101110101110011111011110000000000000000000000
0:0---1111110:*11101101000101011001110-----------------------------
0:01111111110:*1110110100010101100111000001010000000000000000000000
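For anyone who wants to reproduce this kind of bit-level comparison, the following is a rough sketch (not the code used to produce the patterns above). It assumes real(sp) and real(dp) are IEEE binary32 and binary64 and that the integer kinds chosen are 32 and 64 bits wide; the names i32, i64, bits_sp and bits_dp are made up for the illustration. It uses the value from the third call and checks that the 29 low mantissa bits of the promoted value are zero.

```fortran
program promote_bits
  implicit none
  integer, parameter :: sp  = selected_real_kind(6, 37)
  integer, parameter :: dp  = selected_real_kind(15, 307)
  integer, parameter :: i32 = selected_int_kind(9)    ! assumed to be 32 bits
  integer, parameter :: i64 = selected_int_kind(18)   ! assumed to be 64 bits
  real(sp)     :: x_sp
  real(dp)     :: x_dp
  integer(i32) :: bits_sp
  integer(i64) :: bits_dp

  ! Value seen in the third call; it is exactly representable in single.
  x_sp = 0.3525161445140838623046875_sp
  x_dp = real(x_sp, dp)

  ! Reinterpret the stored bits as integers without changing them.
  bits_sp = transfer(x_sp, bits_sp)
  bits_dp = transfer(x_dp, bits_dp)

  print '(a,b32.32)', 'single: ', bits_sp
  print '(a,b64.64)', 'double: ', bits_dp

  ! The double mantissa has 52 bits; the lowest 52 - 23 = 29 are padding.
  if (ibits(bits_dp, 0, 29) /= 0_i64) error stop 'nonzero padding bits'

  ! The upper 23 mantissa bits must match the single-precision mantissa.
  if (ibits(bits_dp, 29, 23) /= int(ibits(bits_sp, 0, 23), i64)) &
    error stop 'mantissa bits changed'

  print '(a)', 'promotion padded the mantissa with zero bits only'
end program promote_bits
```

Note that with a literal constant the compiler may fold the conversion at compile time, so this checks the expected pattern rather than reproducing the register behaviour discussed in this thread; feeding in a value from random_number() through a subroutine call, as in the original program, would be closer to the failing case.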
With ifort 10.1, it may be advisable always to use one of the options that generate code for CPUs with at least Pentium III compatibility.
With -Darray -O{2|3} -march={pentium|pentium2|notspecified} then I get the spurious one bits appended during the promotion. I have only tested the 32-bit compiler; the manual says that the 64-bit compiler assumes -march=pentium4.
The code I posted was simplified from a program to calculate statistical moments on the fly (instead of one-pass at the end) where the number of samples could run as large as 1e8 (Chan et al. Amer. Statistician 37:242, 1983), hence the accumulation of the sum and other code that I stripped out were done in double precision.
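For readers unfamiliar with on-the-fly moments, here is a minimal sketch of the general idea: a Welford-style one-sample-at-a-time update of the mean and variance, with single-precision samples promoted to double before accumulation. This is an illustration only, not the stripped-out code and not necessarily the exact formulation in the cited paper; the sample values and variable names are made up.

```fortran
program running_moments
  implicit none
  integer, parameter :: sp = selected_real_kind(6, 37)
  integer, parameter :: dp = selected_real_kind(15, 307)
  ! Hypothetical sample data; a real run would draw these one at a time.
  real(sp), parameter :: samples(5) = &
    (/ 2.0_sp, 4.0_sp, 4.0_sp, 5.0_sp, 10.0_sp /)
  real(dp) :: mean, m2, delta, x_dp
  integer  :: i, n

  n    = 0
  mean = 0.0_dp
  m2   = 0.0_dp      ! running sum of squared deviations from the mean

  do i = 1, size(samples)
    x_dp  = real(samples(i), dp)     ! promote before accumulating
    n     = n + 1
    delta = x_dp - mean
    mean  = mean + delta / real(n, dp)
    m2    = m2 + delta * (x_dp - mean)
  end do

  print '(a,f10.6)', 'mean            = ', mean
  print '(a,f10.6)', 'sample variance = ', m2 / real(n - 1, dp)
end program running_moments
```

For these five samples the mean is 5 and the sample variance is 9, which the printed values should match to the displayed precision.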
In practice I keep separate program units in separate object files and always compile with -march=pentium4, so it's not really a problem for me. I just thought those extra one bits were strange and wanted to understand what was going on.
Thanks, Tim.
I did test the 64-bit ifort; as you point out, it has no option to generate x87 code.
After rebooting, I have seen the discrepancies on the 2nd through 5th elements of the array.
If you want me to submit an issue to Intel Premium Support about this problem, please let me know.