Turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

- Intel Community
- Software Development Tools (Compilers, Debuggers, Profilers & Analyzers)
- Intel® Fortran Compiler
- After thinking about my post

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page

Highlighted
##

Lars-goran_Ofversted

Beginner

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

01-17-2017
09:54 PM

8 Views

Bug in intrinsic function sum() compiled with -fp-model precise

I'm running the latest ifort 2017.1 and found this unwanted behavior. It's illustrated by a short program and the result after two different compilations.

> cat test_sum.f

PROGRAM TEST_SUM

INTEGER(8),PARAMETER :: N = 1024 * 1024 * 1024

INTEGER(8) :: SEED(1) = (/12345678/)

REAL(4) :: A(N)

REAL(8) :: B(N)

! Fill the array A with random numbers in the range [0 - 1].

CALL RANDOM_SEED(PUT=SEED)

CALL RANDOM_NUMBER( A )

! Copy the values to the double precision array B.

B(:) = A(:)

! Expected sum = N * 0.5 = 536870912 = 5.3687091E+08

! The sum may differ a bit due to the random numbers drawn.

PRINT *, 'Sum(A):', REAL(SUM(A))

PRINT *, 'Sum(B):', REAL(SUM(B))

END PROGRAM TEST_SUM

>

> ifort -i8 test_sum.f

> a.out

Sum(A): 5.3684355E+08

Sum(B): 5.3687296E+08

>

> ifort -i8 -fp-model precise test_sum.f

> a.out

Sum(A): 1.6777216E+07

Sum(B): 5.3687296E+08

So the real*8 array was summed up correctly, but not the real*4 array when compiled with the -fp-model precise option. The result was the same if the arrays A and B were allocated arrays.

4 Replies

Highlighted
##

jimdempseyatthecove

Black Belt

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

01-18-2017
10:03 AM

8 Views

>>Expected sum = N * 0.5 =

>>Expected sum = N * 0.5 = 536870912 = 5.3687091E+08

Worst case difference could be N lsb's as you accumulate each partial sum. Average may be N*0.5 lsb's

Float/Single has 7.22 digits of precision. 10^7.22 = 16595869 ~= 1.66E+7

When the partial sum of the float approaches and exceeds 1.66E7, the additional additions effectively add 0.0.

The summation of the float array is off by (5.3687091E+08 - 1.66E+7) ~= 5.2E+8. The actual difference will depend on when the partial sum approaches and exceeds 8.3E+6. The aove is thumbnail sketch, a mathematician can provide a proper statistical analysis.

The results appear realistic.

Jim Dempsey

Highlighted
##

jimdempseyatthecove

Black Belt

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

01-18-2017
10:17 AM

8 Views

Try merging this into your

Try merging this into your program and report back the results:

REAL(4) :: T(N/2) ! N must be even (it is) REAL(4) :: SUMA INTEGER(8) :: I ... T = A(1:N/2) + A(N/2+1:N) I = N / 2 DO WHILE(I .gt. 1) T(1:I/2) = T(1:I/2) + T(I/2+1:I) I = I / 2 END DO SUMA = T(1)

Jim Dempsey

Highlighted
##

In case someone wonders why -fp-precise might be responsible for such a difference, that option requires SUM to be implemented as a single chain of serial addition. Without that option, ifort can choose to optimize performance [ by splitting the sum into several partial sums (typically 8 individual 4- or 8- wide parallel sums, depending on target ISA), and this would delay the effect Jim described.

TimP

Black Belt

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

01-18-2017
11:39 AM

8 Views

In case someone wonders why

jimdempseyatthecove

Black Belt

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

01-19-2017
05:39 AM

8 Views

After thinking about my post

After thinking about my post #3, it would have worked as intended if N were a power of 2. The following removes any restrictions on N (other than it be positive non-zero and fit in memory).

REAL(4) :: T(N/2+1) ! No restrictions on N REAL(4) :: SUMA INTEGER(8) :: I,J ... J = N I = J / 2 T = A(1:I) + A(I+1:I*2) ! In event J is odd, sum all but last IF(I*2 .NE J) THEN ! (if N (now J) were odd) I = I+1 T(I) = T(J) ! copy last to after partial sum ENDIF DO WHILE(I .gt. 1) J = I I = J / 2 T(1:I/2) = T(1:I/2) + T(I/2+1:I*2) IF(I*2 .NE J) THEN I = I+1 T(I) = T(J) ENDIF END DO SUMA = T(1)

Performing the same thing on the double arrays would yield a better result.

Jim Dempsey

For more complete information about compiler optimizations, see our Optimization Notice.