Community
cancel
Showing results for
Did you mean:
Highlighted
Beginner
29 Views

## Bug in intrinsic function sum() compiled with -fp-model precise

I'm running the latest ifort 2017.1 and found this unwanted behavior. It's illustrated by a short program and the result after two different compilations.

> cat test_sum.f
PROGRAM TEST_SUM
INTEGER(8),PARAMETER :: N = 1024 * 1024 * 1024
INTEGER(8) :: SEED(1) = (/12345678/)
REAL(4) :: A(N)
REAL(8) :: B(N)

!     Fill the array A with random numbers in the range [0 - 1].
CALL RANDOM_SEED(PUT=SEED)
CALL RANDOM_NUMBER( A )

!     Copy the values to the double precision array B.
B(:) = A(:)

!     Expected sum = N * 0.5 = 536870912 = 5.3687091E+08
!     The sum may differ a bit due to the random numbers drawn.
PRINT *, 'Sum(A):', REAL(SUM(A))
PRINT *, 'Sum(B):', REAL(SUM(B))

END PROGRAM TEST_SUM
>
> ifort -i8 test_sum.f
> a.out
Sum(A):  5.3684355E+08
Sum(B):  5.3687296E+08
>
> ifort -i8 -fp-model precise test_sum.f
> a.out
Sum(A):  1.6777216E+07
Sum(B):  5.3687296E+08

So the real*8 array was summed up correctly, but not the real*4 array when compiled with the -fp-model precise option. The result was the same if the arrays A and B were allocated arrays.

Tags (1)
4 Replies
Highlighted
Black Belt
29 Views

>>Expected sum = N * 0.5 = 536870912 = 5.3687091E+08

Worst case difference could be N lsb's as you accumulate each partial sum. Average may be N*0.5 lsb's

Float/Single has 7.22 digits of precision. 10^7.22 = 16595869 ~= 1.66E+7

When the partial sum of the float approaches and exceeds 1.66E7, the additional additions effectively add 0.0.
The summation of the float array is off by (5.3687091E+08 - 1.66E+7) ~= 5.2E+8. The actual difference will depend on when the partial sum approaches and exceeds 8.3E+6. The aove is thumbnail sketch, a mathematician can provide a proper statistical analysis.

The results appear realistic.

Jim Dempsey

Highlighted
Black Belt
29 Views

Try merging this into your program and report back the results:

```REAL(4) :: T(N/2) ! N must be even (it is)
REAL(4) :: SUMA
INTEGER(8) :: I
...
T = A(1:N/2) + A(N/2+1:N)
I = N / 2
DO WHILE(I .gt. 1)
T(1:I/2) = T(1:I/2) + T(I/2+1:I)
I = I / 2
END DO
SUMA = T(1)```

Jim Dempsey

Highlighted
Black Belt
29 Views

In case someone wonders why -fp-precise might be responsible for such a difference, that option requires SUM to be implemented as a single chain of serial addition.  Without that option, ifort can choose to optimize performance [ by splitting the sum into several partial sums (typically 8 individual 4- or 8- wide parallel sums, depending on target ISA), and this would delay the effect Jim described.

Highlighted
Black Belt
29 Views

After thinking about my post #3, it would have worked as intended if N were a power of 2. The following removes any restrictions on N (other than it be positive non-zero and fit in memory).

```REAL(4) :: T(N/2+1) ! No restrictions on N
REAL(4) :: SUMA
INTEGER(8) :: I,J
...
J = N
I = J / 2
T = A(1:I) + A(I+1:I*2) ! In event J is odd, sum all but last
IF(I*2 .NE J) THEN ! (if N (now J) were odd)
I = I+1
T(I) = T(J) ! copy last to after partial sum
ENDIF
DO WHILE(I .gt. 1)
J = I
I = J / 2
T(1:I/2) = T(1:I/2) + T(I/2+1:I*2)
IF(I*2 .NE J) THEN
I = I+1
T(I) = T(J)
ENDIF
END DO
SUMA = T(1)```

Performing the same thing on the double arrays would yield a better result.

Jim Dempsey