Intel® Fortran Compiler
Build applications that can scale for the future with optimized code designed for Intel® Xeon® and compatible processors.
28632 Discussions

Optimized and non-optimized executables give different results

Yaqi_Wang
Beginner
373 Views
Hello,

I met this problem in my project and able to identify it with a shorter version.

I compiled the following codewith IVF 11.1.065 twice.
One is with,
/nologo /debug:full /Od /gen-interfaces /warn:interfaces /module:"Debug\\\\" /object:"Debug\\\\" /traceback /check:bounds /libs:static /threads /dbglibs /c
(copied from VS property pages.)
another is with,
/nologo /O3 /Qipo /module:"Release\\\\" /object:"Release\\\\" /libs:static /threads /c

The code has four files.
File 1: the main
------------------------

program test

use Quadrilateral1

implicit none

call INIT_QUADRILATERAL1()

print *, quadrilateral1_rtcsr(-1)%ia, quadrilateral1_rtcsr(-1)%ja, quadrilateral1_rtcsr(-1)%a

stop

end program test
------------------------
File 2: some constant definitions
------------------------

MODULE DataKind

INTEGER, PARAMETER :: RKD = 8

INTEGER, PARAMETER :: NLEN = 4 ! length of default integers

INTEGER, PARAMETER :: RLEN = 8 ! length of RKD reals

INTEGER, PARAMETER :: PLEN = 32 ! pointer length

REAL(RKD), PARAMETER :: zero = 0.0_RKD

REAL(RKD), PARAMETER :: one = 1.0_RKD

END MODULE DataKind
---------------------------
File 3: a sparse-matrix utility module
---------------------------

MODULE SparseMatrix

!===========================================================================

USE DataKind

!===========================================================================

IMPLICIT NONE

!> Compressed Sparse Row format\\n

!! Note: non-zeros are sorted by their column indices in each row.

TYPE CSR

INTEGER :: n !< number of rows

INTEGER :: m !< number of columns

INTEGER :: nnz !< number of non-zeros

INTEGER :: nnz_capacity !< capacity of non-zeros

INTEGER :: mcount=5*NLEN+3*PLEN !< memory count in bytes

INTEGER, POINTER :: ia(:) =>null() !< (n+1) pointers to the beginning of each row of all non-zeros

INTEGER, POINTER :: ja(:) =>null() !< (nnz_capacity) column index of all non-zeros

REAL(RKD), POINTER :: a(:) =>null() !< (nnz_capacity) non-zeros

END TYPE CSR

CONTAINS

!> Create a zero CSR matrix\\n

!! Note: if the matrix exists, this operation reiniializes the matrix. Old matrix will be erased.

!! n, m being equal to zero is allowed.

SUBROUTINE csr_init(a, n, m, nnz_capacity)

IMPLICIT NONE

TYPE(CSR) :: a

INTEGER :: n, m

INTEGER, OPTIONAL :: nnz_capacity

INTEGER :: nnz, istat, i

a%mcount = 5*NLEN + 3*PLEN

IF (ASSOCIATED(a%ia)) THEN

DEALLOCATE(a%ia, STAT=istat)

IF (istat/=0) STOP

ENDIF

ALLOCATE(a%ia(n+1), STAT=istat)

IF (istat/=0) STOP

a%mcount = a%mcount + (n+1)*NLEN

IF (PRESENT(nnz_capacity)) THEN

nnz = nnz_capacity

ELSE

nnz = 0

ENDIF

IF (ASSOCIATED(a%ja)) THEN

DEALLOCATE(a%ja, STAT=istat)

IF (istat/=0) STOP

ENDIF

ALLOCATE(a%ja(nnz), STAT=istat)

IF (istat/=0) STOP

a%mcount = a%mcount + nnz*NLEN

IF (ASSOCIATED(a%a)) THEN

DEALLOCATE(a%a, STAT=istat)

IF (istat/=0) STOP

ENDIF

ALLOCATE(a%a(nnz), STAT=istat)

IF (istat/=0) STOP

a%mcount = a%mcount + nnz*RLEN

a%n = n

a%m = m

a%nnz = 0

a%nnz_capacity = nnz

DO i = 1, a%n+1

a%ia(i) = 1

ENDDO

DO i = 1, a%nnz_capacity

a%ja(i) = 0

a%a(i) = zero

ENDDO

RETURN

END SUBROUTINE csr_init

!> Create identity matrix in a

SUBROUTINE csr_identity(n, a)

IMPLICIT NONE

INTEGER :: n ! dimension

TYPE(CSR) :: a ! identity matrix

INTEGER :: i

CALL csr_init(a, n, n, n)

a%nnz = n

DO i = 1, n

a%ia(i) = i

a%ja(i) = i

a%a(i) = one

ENDDO

a%ia(n+1) = n+1

RETURN

END SUBROUTINE csr_identity

END MODULE SparseMatrix
-------------------------
File 4: another module using sparse-matrix module
-------------------------

MODULE Quadrilateral1

USE DataKind

USE SparseMatrix

INTEGER, PARAMETER :: quadrilateral1_pmax = 7 ! maximum polynomial order

INTEGER, PARAMETER :: quadrilateral1_maxndofs = (quadrilateral1_pmax+1)*(quadrilateral1_pmax+2)/2

INTEGER, PARAMETER, PRIVATE :: rotm1(quadrilateral1_maxndofs) = &

(/ 1, &

-1, 1, &

1,-1, 1, &

-1, 1,-1, 1, &

1,-1, 1,-1, 1, &

-1, 1,-1, 1,-1, 1, &

1,-1, 1,-1, 1,-1, 1, &

-1, 1,-1, 1,-1, 1,-1, 1 /)

TYPE(CSR), SAVE :: quadrilateral1_rtcsr(-1:1)

CONTAINS

!> Initialize quadrilateral module

SUBROUTINE INIT_QUADRILATERAL1()

IMPLICIT NONE

INTEGER :: i

! Set up the rotation matrix

CALL csr_identity(quadrilateral1_maxndofs, quadrilateral1_rtcsr(1))

CALL csr_identity(quadrilateral1_maxndofs, quadrilateral1_rtcsr(-1))

DO i = 1, quadrilateral1_maxndofs

quadrilateral1_rtcsr(-1)%a(i) = quadrilateral1_rtcsr(-1)%a(i) * rotm1(i)

ENDDO

RETURN

END SUBROUTINE INIT_QUADRILATERAL1

END MODULE Quadrilateral1
-----------------------------------


The results in non-optimized mode:

1 2 3 4 5 6
7 8 9 10 11 12
13 14 15 16 17 18
19 20 21 22 23 24
25 26 27 28 29 30
31 32 33 34 35 36
37 1 2 3 4 5
6 7 8 9 10 11
12 13 14 15 16 17
18 19 20 21 22 23
24 25 26 27 28 29
30 31 32 33 34 35
36 1.00000000000000 -1.00000000000000
1.00000000000000 1.00000000000000 -1.00000000000000
1.00000000000000 -1.00000000000000 1.00000000000000
-1.00000000000000 1.00000000000000 1.00000000000000
-1.00000000000000 1.00000000000000 -1.00000000000000
1.00000000000000 -1.00000000000000 1.00000000000000
-1.00000000000000 1.00000000000000 -1.00000000000000
1.00000000000000 1.00000000000000 -1.00000000000000
1.00000000000000 -1.00000000000000 1.00000000000000
-1.00000000000000 1.00000000000000 -1.00000000000000
1.00000000000000 -1.00000000000000 1.00000000000000
-1.00000000000000 1.00000000000000 -1.00000000000000
1.00000000000000
which are the one I expected.

Results with the optimized version,
1 2 3 4 5 6
7 8 9 10 11 12
13 14 15 16 17 18
19 20 21 22 23 24
25 26 27 28 29 30
31 32 33 34 35 36
37 1 2 3 4 5
6 7 8 9 10 11
12 13 14 15 16 17
18 19 20 21 22 23
24 25 26 27 28 29
30 31 32 33 34 35
36 1.00000000000000 1.00000000000000
1.00000000000000 1.00000000000000 1.00000000000000
1.00000000000000 1.00000000000000 1.00000000000000
1.00000000000000 1.00000000000000 1.00000000000000
1.00000000000000 1.00000000000000 1.00000000000000
1.00000000000000 1.00000000000000 1.00000000000000
1.00000000000000 1.00000000000000 1.00000000000000
1.00000000000000 1.00000000000000 1.00000000000000
1.00000000000000 1.00000000000000 1.00000000000000
1.00000000000000 1.00000000000000 1.00000000000000
1.00000000000000 1.00000000000000 1.00000000000000
1.00000000000000 1.00000000000000 1.00000000000000
1.00000000000000

They are different. So what is going wrong?

Thanks.

0 Kudos
3 Replies
Yaqi_Wang
Beginner
373 Views
From the optimization options, I do either of following, the error is gone.

1. add 'quadrilateral1_rtcsr(-1)%ia = quadrilateral1_rtcsr(-1)%ia' after csr_identity is called;
2. turn on /check:pointer;
3. turn on /check:bounds.

0 Kudos
mecej4
Honored Contributor III
373 Views
This appears to be an optimizer bug. It can also be observed with the 32-bit version of the recently released 11.1.067 version of the Windows compiler, and on the 32-bit 11.1.073 compiler on Linux-x64. However, if the optimization level is reduced to /O2, the bug goes away.

The bug is not seen with corresponding Intel64 versions (11.1.073 on Linux, 11.1.067 on Windows-64).
0 Kudos
Kevin_D_Intel
Employee
373 Views
mecej4 posted a smaller reproducer(here). I will keep this thread updated accordingly with new information about this issue as I learn it.
0 Kudos
Reply