Hi there,
I have a problem with mathematical operations performed on a whole array when I enable the Intel Processor Extensions (XP SP2, IVF10.0.027 with VS2003).
In my program I change array values not within a do loop but in one line:
real*4 array(100)
!... reading 100 values from file and save them in array
array=array/1000.
When I don't use the Intel Processor Extensions as an optimization, everything works fine. But if I do, the value differs in the real*8 "section". The debugger won't work with the optimizations, so I wrote the data to a file.
Code:
array(1)=6500.00
write(19,'(a,f27.16)') 'before: ',array(1)
array=array/1000.
write(19,'(a,f27.16)') 'after: ',array(1)
Textfile:
before: 6500.0000000000000000
after: 6.5000004768371582
I noticed the difference when I compared array(1) with another value that should be equal. They weren't equal, although they seemed to be (the real*4 value of array(1) displays as 6.5000000, as does the real*4 value I compared it with).
When I disable the optimizations, the textfile shows
after: 6.5000000000000000
My question is: Is that some sort of compiler bug or are operations like array = array / 1000 bad programming style? (I hope not, I use them a lot)
Thanks in advance,
Markus
write(19,'(a,f27.16)') 'before: ',array(1)
write(19,'(a,f27.16)') 'after: ',array(1)*0.001d0
The inequality of similar expressions is quite likely to occur when you compare expressions resulting from implicit promotion to double with those not involving promotion, i.e. there are many such situations in x87 code.
1. You are displaying a REAL(4) value to many more digits than is appropriate for its precision. You get seven correct decimal digits, which is what I'd expect from REAL(4).
2. When you enable the processor extensions, the array operation is vectorized and is done with SSE instructions. These operate in declared precision (REAL(4) here) rather than double or extended precision that you would get for arithmetic operations without the SSE instructions. While this extended precision can give you "better" results, it is also inconsistent, depending on when the compiler chooses to round an intermediate result to declared precision.
There's nothing wrong with the array operation you did. What is wrong is assuming that you will get more precision out of single-precision than is warranted. Perhaps you want to use double-precision instead.
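To make the precision point concrete, here is a minimal sketch (the program name and formats are my own, not from the thread) that uses the standard SPACING and NEAREST intrinsics to show how far apart neighbouring REAL(4) values are around 6.5. If I have the arithmetic right, the next REAL(4) value above 6.5 is 6.5 + 2**(-21), which is the "after" value in the original post when printed with 16 decimals.

```fortran
program single_precision_neighbours
  implicit none
  ! Kind 4 matches the real*4 used in the thread (Intel and GNU compilers).
  real(4) :: x
  real(8) :: d

  x = 6.5
  d = 6.5d0

  ! Gap between 6.5 and the next representable REAL(4) value.
  print '(a,es13.6)', 'spacing near 6.5, REAL(4): ', spacing(x)

  ! Same gap for REAL(8), many orders of magnitude smaller.
  print '(a,es13.6)', 'spacing near 6.5, REAL(8): ', spacing(d)

  ! Next REAL(4) value above 6.5, printed with more digits than the
  ! type carries. Digits beyond about the seventh are not meaningful.
  print '(a,f20.16)', 'next REAL(4) above 6.5:    ', nearest(x, 1.0)
end program single_precision_neighbours
```

A result that is one such step away from 6.5 is still correct to single precision, which is why printing it with f27.16 looks alarming but is not an error in itself.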
>> This is not a bug
Perhaps, perhaps not.
6500 is expressed exactly internally in floating point
1000 is expressed exactly internally in floating point
1/1000 is an infinitely repeating (approximate) binary fraction
If the computation performed 6500/1000 then it should be exact
If the computation performed 6500 * (1/1000) then it should be approximate.
The optimization code might be using the multiplication of the inverse of 1000 as opposed to using the division by 1000.
This is not a bug provided that the rules of expressions are known.
Jim Dempsey
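The division versus reciprocal-multiply idea can be tried directly. The following is only a sketch under assumptions: the variable names are hypothetical, VOLATILE is there just to discourage the compiler from folding the constants at compile time, and the bit patterns are printed through TRANSFER to a 32-bit integer. It does not claim to reproduce the exact instruction sequence the vectorizer emits. By hand, the single-precision value of 1/1000 is too large by a relative amount of roughly 4.7e-8, which is enough to push 6500 times that reciprocal up to the next REAL(4) value above 6.5.

```fortran
program div_vs_reciprocal
  use, intrinsic :: iso_fortran_env, only: int32
  implicit none
  ! VOLATILE only discourages compile-time evaluation of the expressions.
  real(4), volatile :: x, divisor, recip
  real(4) :: by_division, by_reciprocal

  x = 6500.0
  divisor = 1000.0

  ! Direct division: both operands are exact, and 6.5 is representable.
  by_division = x / divisor

  ! Reciprocal first: 1/1000 is rounded to REAL(4) before the multiply.
  recip = 1.0 / divisor
  by_reciprocal = x * recip

  ! Print each value with its raw bit pattern in hexadecimal.
  print '(a,f20.16,2x,z8.8)', 'x / 1000     : ', by_division, &
        transfer(by_division, 0_int32)
  print '(a,f20.16,2x,z8.8)', 'x * (1/1000) : ', by_reciprocal, &
        transfer(by_reciprocal, 0_int32)
end program div_vs_reciprocal
```

If the two hex patterns differ in the last digit, that is the same one-bit difference discussed in this thread.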
Thanks for your replies.
My description was not good and I investigated my problem yesterday and this morning a little bit more.
The problem is that a comparison is true which should be false. I read certain data from a file. After that I convert them to European SI units; in my case I have to divide the real*4 array endOfZone by 1000 (endOfZone=endOfZone/1000.).
Before the calculation starts, I check whether the user has changed the data in a way he should not. One comparison is
if (overallLength.le.endOfZone(4)) then
! Error Message
overallLength is real*4. endOfZone(4) has to be less than or equal to overallLength (as the if clause says). In my case overallLength is 11.50000 and endOfZone(4) is 11.50000, so the if clause should be false.
But the if clause is true when I
1) use Intel Processor Extensions On (/QaxK) AND
2) divide endOfZone as a whole array (endOfZone=endOfZone/1000.)
The if clause is false when I
1) don't use /QaxK (performance loss 80%) OR
2) use /fp:precise (performance loss 90%!!!) OR
3) use /Qprec-div- (no performance loss) OR
4) divide endOfZone in a do loop, endOfZone(i)=endOfZone(i)/1000.
This is still strange to me. Why is the if clause true when I divide the whole array? I did not mention in my prior post that I use the 32 bit compiler version.
Markus
Here is my code:
write(19,'(f19.12)') endOfZone(4)
endOfZone=endOfZone/1000.
write(19,'(f19.12)') endOfZone(4)
write(19,'(z8.8)') endOfZone(4)
data of fort.19 (it's the same with /Qprec-div- turned on or off)
11500.000000000000
11.500000953674
41380001
But with /Qprec-div- turned on, my if clause is false, as it should be.
So this is not a compiler bug, but things you have to get used to when you use compiler optimizations?
Markus
In this case, it seems trivial to write x*0.001 or x*0.001d0, maybe even anint(x*0.001) if that is the intent. Certain major applications have the inversions written into source wherever they have been tested successfully for performance and accuracy, so that options /Qprec-div /Qprec-sqrt should be set.
As I mentioned a few days ago, /fp:precise includes the effect of the switches /assume:protect_parens /Qprec-div /Qprec-sqrt /Qftz- as well as disabling some additional optimizations which have minor numerical effects.
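As a sketch of the x*0.001d0 suggestion: multiplying by a double-precision constant makes the product REAL(8), and it is rounded to REAL(4) only once, on assignment. Apart from 6500 and 11500, the array contents below are hypothetical, and the program name is my own. The expectation, which I have checked by hand only for the two values from the thread, is that the scaled results land on the exact single-precision values 6.5 and 11.5.

```fortran
program scale_in_double
  use, intrinsic :: iso_fortran_env, only: int32
  implicit none
  real(4) :: endOfZone(4)
  integer :: i

  ! 6500 and 11500 appear in the thread; the other two are made up.
  endOfZone = [2500.0, 6500.0, 9000.0, 11500.0]

  ! Mixed REAL(4) * REAL(8): the product is evaluated in REAL(8) and
  ! rounded to REAL(4) once, when it is stored back into the array.
  endOfZone = endOfZone * 0.001d0

  ! Show each element with its raw bit pattern in hexadecimal.
  print '(f19.12,2x,z8.8)', &
        (endOfZone(i), transfer(endOfZone(i), 0_int32), i = 1, size(endOfZone))
end program scale_in_double
```

Whether this costs anything measurable compared with the vectorized single-precision divide would have to be timed in the real application.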
Results that do not appear decimally correct also occur without optimizations. This example shows a case where the discrepancy between decimal math and binary math appears when using xmm floating point but not when using FPP floating point. You will find many cases where both are inconsistent.
The inconsistencies are due to truncation/rounding of infinitely repeating fractional values. Of particular interest is the fraction 1./10., which requires an infinite number of bits in binary to hold the correct value.
When testing for limits you should consider including an epsilon that is at least one significant bit (e.g. the position of the 1 in the 41380001).
Jim Dempsey
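A tolerance-based limit check along those lines might look like the sketch below. Several things here are assumptions rather than anything from the thread: the tolerance of four spacings, the scalar name endOfZone4 standing in for endOfZone(4), and the reading of the requirement as "error only when overallLength is clearly less than endOfZone(4)", which follows the prose description of the check.

```fortran
program tolerant_limit_check
  implicit none
  real(4) :: overallLength, endOfZone4, tol

  overallLength = 11.5
  ! One representable step above 11.5, like the value seen in fort.19.
  endOfZone4 = nearest(11.5, 1.0)

  ! Hypothetical tolerance: a few spacings at the magnitude compared.
  tol = 4.0 * spacing(max(abs(overallLength), abs(endOfZone4)))

  ! Strict test: a one-bit difference is enough to flag an error.
  print '(a,l1)', 'strict test flags error:   ', overallLength < endOfZone4

  ! Tolerant test: flags an error only if the gap exceeds the tolerance.
  print '(a,l1)', 'tolerant test flags error: ', &
        overallLength < endOfZone4 - tol
end program tolerant_limit_check
```

The right size for the tolerance depends on how much rounding the data has gone through before the check, so the factor of four should be treated as a placeholder.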