Summation using XE 15

Tehuna__James · ‎12-07-2017

I am working on a legacy code. It is currently build using Intel Fortran 9.1. I was comparing the results using Intel Fortran XE 15 and was getting wrong results.

Part of the code which is causing the issue is pasted below:

        DO 1030 IM=1,NMOD
        AVG=0.0
        DO 1010 IT=1,NTIM
1010   AVG=AVG+GFB(IM)
        ATIM=NTIM
        AVG=AVG/ATIM
        DO 1020 IT=1,NTIM
1020   GNF(IM)=GNF(IM)+GFB(IM)-AVG
1030   CONTINUE

As you can see from the code That values within GNF should not really change. But with the XE 15 they do.

Here NMOD is 10;

The values of GNF and GFB before executing this part of code are:

       9.1                                    XE 15
IM       GNF               GFB           IM       GNF                GFB
1   -78411.73    -174.7768    1   -78411.74    -174.7768
2   -102429.1    859.1968    2   -102429.1     859.1973
3   4962.622 -29.82203 3   4962.624    -29.82202
4   116233.5 137.9382    4   116233.4 137.9379
5   64995.39    1337.103 5   64995.35    1337.104
6   -17947.88    -201.2107   6   -17947.88    -201.2108
7   -1605.297    8110.017    7   -1605.297    8110.014
8   -315.4255 3924.545    8   -315.4255    3924.543
9   59.31363    1423.356 9   59.31358    1423.356
10   704.9843 -2943.285 10   704.9835   -2943.287

While the values of GNF and GFB after executing this part of code are:

       9.1                                    XE 15
IM       GNF               GFB           IM       GNF                GFB
1    -78411.79    -174.7768    1    -78411.74   -174.7768
2    -102428.9 859.1968      2      -102541.6   859.1973
3    4962.622 -29.82203      3   4959.108   -29.82202
4    116233.5 137.9382        4   116233.4   137.9379
5    64995.39 1337.103         5       65178.16   1337.104
6    -17947.88    -201.2107     6   -17968.97   -201.2108
7    -1605.297      8110.017      7      -1556.078   8110.014
8    -315.4255      3924.545      8      -906.0505   3924.543
9    59.31363      1423.356       9        101.9405   1423.356
10    704.9843    -2943.285      10   755.0811   -2943.287

Also, value of AVG after 1010 should be 3600*GFB (since NTIM is 3600). But in XE 15 I am getting a bit different value.

Below are the values of AVG:

       9.1                                    XE 15
IM       GFB               AVG           IM       GFB                AVG
1   -174.7768    -629196.4     1   -174.7768   -629189.2
2   859.1968 3093108    2   859.1973   3093228
3   -29.82203    -107359.3     3   -29.82202   -107355.2
4   137.9382 496577.4         4   137.9379   496575
5   1337.103 4813571        5   1337.104   4813386
6   -201.2107    -724358.4    6   -201.2108   -724341.1
7   8110.017 2.92E+07      7   8110.014   2.92E+07
8   3924.545      1.41E+07      8   3924.543   1.41E+07
9   1423.356   5124080      9   1423.356   5124038
10   -2943.285   -1.06E+07     10   -2943.287   -1.06E+07

Can anybody help me figure out what might be causing this? Thanks.

andrew_4619 · ‎12-07-2017

https://software.intel.com/en-us/articles/consistency-of-floating-point-results-using-the-intel-compiler

mecej4 · ‎12-07-2017

James, if you expect more than just a "May be, so what?" response, you will need to provide a more complete report. Here are my observations:

Type declarations of variables are not shown. In particular, are the real variables single or double precision?
The values in array GFB, prior to executing the code shown, are not the same in the two runs (IVF9.1 and IVF15), This causes difficulties in comparing the results after the code block is executed, since there are at least two sources of discrepancies instead of one -- the compiler version.
The compiler options were not reported. You can expect different results if the code is generated for X87 and for SSE2.

Here is a complete test program:

program ftst
implicit none
integer i,j,nmod,ntim,im,it
real gnf0(10),gnf(10),gfb(10),avg,atim
data nmod/10/, ntim/3600/
data gnf0/-78411.73 ,-102429.1,4962.622,116233.5,64995.39,-17947.88, & 
   -1605.297,-315.4255,59.31363,704.9843/
data gfb/ -174.7768,859.1968,-29.82203,137.9382,1337.103, &
          -201.2107,8110.017,3924.545,1423.356,-2943.285/
DO IM=1,NMOD
   AVG=0.0
   DO IT=1,NTIM
      AVG=AVG+GFB(IM)
      ATIM=NTIM
	end do
    AVG=AVG/ATIM
    DO IT=1,NTIM
       GNF(IM)=GNF0(IM)+GFB(IM)-AVG
    end do
end do
write(*,'(I3,2x,2G15.7)')(i,gnf0(i),gnf(i),i=1,nmod)
end program

This program, when compiled with /fltconsistency, gave identical results when I used (all 32-bit)

CVF6.6C
IVF9.1.039
IVF15.0.7

Tehuna__James · ‎12-07-2017

Thanks andrew_4619 and mecej4.

mecej4, Thanks for your code. I do agree i just posted the part of the code where I found the issue.

I ran your test program and I got gnf0 and gnf to be identical. but just making a small change (see attached), I do not get the right answers.

Do you think this is causing the issue?

andrew_4619 · ‎12-08-2017

Well "so what"? You are doing a few thousand mathematical operations on some real numbers that in a theoretical world return you to the original number but are seeing differences on the 6 or 7 significant figures which is the limit of the precision of the real number kind you are using.

1] Are these differences significant to the physical reality of what you are calculating?

2] If these differences are significant then use real numbers with a higher kind eg use 64 bit real real(8) rather than 32 bit real , Real(4)

3] If they are not significant write the answer to a number of significant figures that is meaningful and they become identical!

mecej4 · ‎12-08-2017

James, please read the article that Andrew referred to in #2. Here is an even shorter program that you may find useful to understand the issues.

      program ftst
      implicit none
      integer i,it
      real gnf0,gnf,gfb,avg
      data gnf0/0.0/, gfb/3333.333/
	  gnf=gnf0
      AVG=0
      DO IT=1,3000
         AVG=AVG+GFB
      end do
      AVG=AVG/3000
      DO IT=1,3000
         GNF0=GNF0+GFB-AVG
      end do
      write(*,'(3G15.7)')gnf,gnf0,gfb-avg
      end program

Run 1: Use Ifort 2015, 32-bit, compile and run with options /O2 /arch:IA32 and see the result:

   0.000000       0.000000       0.000000

Run 2: Same compiler, options /O2 /arch:SSE2:

   0.000000      -8.056641     -0.2685547E-02

What is happening is that, with /arch:IA32, the floating point calculations are performed using X87 FPU instructions. Intermediate results are kept in the 80-bit registers of the FPU. On the other hand, with /arch:SSE2, the XMM registers are used, and the register operands are only 32-bits in size (23+1 bits for mantissa, 8 bits for biased exponent, 1 for sign). The number 3333.333 cannot be represented exactly in either 32-bit or 80-bit IEEE format, and this error when multiplied by the number of repetitions becomes striking. The old aphorism, "accumulate sums in double precision when possible", is quite apt.

You can also try using the options /arch:IA32 /fp:strict.

If your legacy code is afflicted with sensitivity to these issues, you have three options:

Make sure to use /arch:IA32, and note that this bars entry to the 64-bit world, OR
Make extensive changes to your code, changing to double precision at least those variables into which long sums are accumulated
Change all REAL variables and constants in your code to DOUBLE PRECISION.