Intel® Fortran Compiler
Build applications that can scale for the future with optimized code designed for Intel® Xeon® and compatible processors.

/fp:strict vs /fp:precise or source

lklawrie
Beginner
3,319 Views
I have a set of code that calculates the volume of a polyhedron based on vertices of the space.
With fp:precise or source in V11 compiler I get incorrect values. I believe these were correct in earlier versions of the compiler.
I still get correct results with /fp:strict.

I believe when I went to the 11 compiler I just changed "compilers" -- no change in settings.

Should I report this one?

Linda
0 Kudos
25 Replies
Steven_L_Intel1
Employee
769 Views
It was certainly never your imagination! I looked to see if it is fixed in the next 11.0 update, but it is not - sorry. You do have a workaround until 11.1 is available.
0 Kudos
a_leonard
Novice
769 Views
I noticed something as I was trying to track down some differences in results of two models that I expected to be exactly the same. I was wondering if it was related to the problem reported in this thread.

To find where the difference in my results was coming from, I started printing intermediate values using hexadecimal edit descriptors. The first difference I see between the two models appears after the following code segment.

[plain]      AB(k) = 0.
      ZZ(k) = 0.
      do 546 i=1,i1
        AB(k) = AB(k) + AA(i,n)
        ZZ(k) = ZZ(k) + Z(i,n)
  546 enddo
      write(6,'(A,/4(8Z20/))') 'AA: ',(AA(i,n), i=1,i1)
      write(6,'(A,/4(8Z20/))') 'AB: ', AB(k)
[/plain]
The elements of the array AA that I print are all exactly the same, but AB(k) differs in the last bit or two.

Model 1:

AA:
3F87010CC22C0E5B 3C739983AEEF67E9 3C739983AEEF67E9 3C739983AEEF67E9
3C739983AEEF67E9 3C739983AEEF67E9 3C739983AEEF67E9 3C739983AEEF67E9
3C739983AEEF67E9 3C739983AEEF67E9 3C739983AEEF67E9 3C739983AEEF67E9
3C739983AEEF67E9 3C739983AEEF67E9 3C739983AEEF67E9 3C739983AEEF67E9

AB:
3F87010CC22C0EF0

Model 2:

AA:
3F87010CC22C0E5B 3C739983AEEF67E9 3C739983AEEF67E9 3C739983AEEF67E9
3C739983AEEF67E9 3C739983AEEF67E9 3C739983AEEF67E9 3C739983AEEF67E9
3C739983AEEF67E9 3C739983AEEF67E9 3C739983AEEF67E9 3C739983AEEF67E9
3C739983AEEF67E9 3C739983AEEF67E9 3C739983AEEF67E9 3C739983AEEF67E9

AB:
3F87010CC22C0EEE

When I use the compiler option /arch:IA32 for the file containing the above code fragment, the results of the two models are exactly the same, but different from above.

AB:
3F87010CC22C0EF1

I know the differences above are almost insignificant, but I don't want to spend time chasing my tail if there is some compiler bug. Somehow the same executable is giving me different results when adding the same numbers. As far as I can see, that can only happen if the terms are added together in a different order. Putting a print statement inside the loop must change the optimization, because then the results are identical.

Any idea when 11.1 will be released?
0 Kudos
Steven_L_Intel1
Employee
769 Views
I see no evidence of a compiler bug. I do see evidence of a program that depends on extra precision above and beyond what is declared for the datatypes.

11.1 will be out next week.
0 Kudos
TimP
Honored Contributor III
769 Views
Quoting - a.leonard

[plain]      AB(k) = 0.
      ZZ(k) = 0.
      do 546 i=1,i1
        AB(k) = AB(k) + AA(i,n)
        ZZ(k) = ZZ(k) + Z(i,n)
  546 enddo
[/plain]

Any of the options /arch:IA32, /fp:strict, /fp:source, /fp:precise, or /O1 should suppress sum-reduction vectorization in that loop and should give the same results, provided all the operands are double precision. The default optimization, which accumulates batched partial sums, usually gives slightly more accurate results as well as much greater performance. Unfortunately, the order of additions under this optimization will depend somewhat on data alignment, which may vary on 32-bit Windows depending on how you allocated the arrays.
0 Kudos
a_leonard
Novice
769 Views
Quoting - tim18
Any of the options /arch:IA32, /fp:strict, /fp:source, /fp:precise, or /O1 should suppress sum-reduction vectorization in that loop and should give the same results, provided all the operands are double precision. The default optimization, which accumulates batched partial sums, usually gives slightly more accurate results as well as much greater performance. Unfortunately, the order of additions under this optimization will depend somewhat on data alignment, which may vary on 32-bit Windows depending on how you allocated the arrays.

That's what I needed to know. One of my models allocates some extra memory, so the arrays I'm looking at must end up being aligned differently.
0 Kudos
Reply