Problem with overflow in vectorization, ifort 17.0.4

L_Excellent__Jean-Yv · 12-29-2017

Dear all,

I came across a segmentation fault when vectorizing a simple loop. The problem appears
to occur when the loop is vectorized and an index used to access the array inside the loop
(not the loop index itself) exceeds 2^31, even though 64-bit integers are used to address
the array. The vectorization report (Intel ifort version 17.0.4 20170411) gives, among
others, the message:

remark #15328: vectorization support: irregularly indexed load was emulated for the variable <a(j1)>, part of index is linear but may overflow

The problem also appears with ifort 17.0.1 20161005.
Everything works fine with ifort 16.0.1 20151021 even though the loop is vectorized.
The original code looks correct: 64-bit integers are used to address the array, and the
array appears to be large enough. I cannot print the faulty indices inside the loop because
doing so inhibits vectorization. Rewriting the loop slightly (accessing A(J1+LDAFS8*(J-1))
directly in the code below and removing the line J1=J1+LDAFS8 from the loop) makes the
problem disappear, together with the message "part of index is linear but may overflow" in
the vectorization report.
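
For reference, the rewrite described above could look like the sketch below. The subroutine
name VECTORIZE_WORKAROUND is only illustrative and does not appear in the original code; the
arguments and declarations are copied from the subroutine in file t.F further down. The only
intended change is that the index is computed from J on each iteration instead of being
accumulated in J1.

```fortran
      SUBROUTINE VECTORIZE_WORKAROUND(A, LA, NASS, IEND_BLOCK,
     &        J1_ARG, RMAX_NOSLAVE)
      IMPLICIT NONE
      INTEGER(8) :: J1_ARG, LA
      DOUBLE PRECISION :: RMAX_NOSLAVE, A(LA)
      INTEGER :: NASS, IEND_BLOCK
      INTEGER :: J
      INTEGER(8) :: LDAFS8, J1
      LDAFS8 = int(NASS,8)
      J1 = J1_ARG + 1000000000_8
      RMAX_NOSLAVE = 0.0D0
      DO J = 1, NASS - IEND_BLOCK
C       The index is recomputed from J on every iteration.
C       LDAFS8 is INTEGER(8), so the product LDAFS8*(J-1) and
C       the sum are evaluated in 64-bit arithmetic.
C       J1 is no longer updated inside the loop.
        RMAX_NOSLAVE = max(abs(A(J1 + LDAFS8*(J-1))),
     &                     RMAX_NOSLAVE)
      ENDDO
      END SUBROUTINE VECTORIZE_WORKAROUND
```

This variant reads the same elements of A as the original loop. Whether it avoids the
"may overflow" remark with other compiler versions has not been checked here.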

Is this a known issue? Should one rewrite all loops leading to this message with "overflow" in
the vectorization report?

I copy-paste below a piece of code showing the issue.

Thanks in advance for your help,
Best regards,
Jean-Yves

1/ File t.F containing a subroutine and the loop:

              SUBROUTINE VECTORIZE_BUG(A, LA, NASS,IEND_BLOCK,J1_ARG,
     &        RMAX_NOSLAVE)
              IMPLICIT NONE
              INTEGER(8) :: J1_ARG, LA
              DOUBLE PRECISION :: RMAX_NOSLAVE, A(LA)
              INTEGER :: NASS, IEND_BLOCK
              INTEGER :: LDAFS
              INTEGER :: J
              INTEGER(8) :: LDAFS8, J1
              LDAFS=NASS
              LDAFS8=int(LDAFS,8)
              J1 = J1_ARG + 1000000000_8
              RMAX_NOSLAVE=0.0D0
              DO J=1,NASS - IEND_BLOCK
                RMAX_NOSLAVE = max(abs(A(J1)),RMAX_NOSLAVE)
                J1 = J1 + LDAFS8
              ENDDO
              END SUBROUTINE VECTORIZE_BUG

2/ Main program:

         INTEGER :: NASS, IEND_BLOCK
         DOUBLE PRECISION :: RMAX_NOSLAVE
         INTEGER(8) :: J1
         INTEGER(8), PARAMETER :: LA = 2500000000_8
         DOUBLE PRECISION, DIMENSION(:), ALLOCATABLE :: A
         ALLOCATE(A(LA))
         J1 = 1022826928_8 ! => KO
c        J1 = 1222826928_8 ! => KO
c        J1 = 0822826928_8 ! => OK
         IEND_BLOCK = 32
         NASS = 16203
         A(J1+1000000000_8:LA)=0.0D0
         CALL VECTORIZE_BUG(A(1), LA, NASS, IEND_BLOCK, J1,
     &                      RMAX_NOSLAVE)
         END
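
As a sanity check on the three J1 values above, the first and last index read by the loop
can be computed separately, outside any vectorized code. The small program below is only a
sketch: the program name INDEX_CHECK and its variable names are illustrative, while the
constants (the three J1 values, the 1000000000 offset, NASS, IEND_BLOCK and LA) are taken
from the code above. For each J1 it prints the first index, the last index, whether the
last index exceeds 2^31-1, and whether it stays within LA.

```fortran
      PROGRAM INDEX_CHECK
      IMPLICIT NONE
C     Largest value representable in a signed 32-bit integer
      INTEGER(8), PARAMETER :: LIMIT32 = 2147483647_8
      INTEGER(8), PARAMETER :: LA = 2500000000_8
      INTEGER(8) :: STARTS(3), FIRST, LAST, STRIDE, NITER
      INTEGER :: K
C     The three J1 values tried in the main program (KO, KO, OK)
      STARTS = (/ 1022826928_8, 1222826928_8, 822826928_8 /)
C     Stride is LDAFS8 = NASS; trip count is NASS - IEND_BLOCK
      STRIDE = 16203_8
      NITER = 16203_8 - 32_8
      DO K = 1, 3
        FIRST = STARTS(K) + 1000000000_8
        LAST = FIRST + (NITER - 1_8)*STRIDE
        PRINT *, FIRST, LAST, LAST .GT. LIMIT32, LAST .LE. LA
      ENDDO
      END PROGRAM INDEX_CHECK
```

With these constants, the last index is 2284829438 and 2484829438 for the two KO values,
both above 2^31-1 = 2147483647, and 2084829438 for the OK value, which stays below it. All
three remain within LA = 2500000000, so none of the accesses is out of bounds.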

3/ Commands used and result (everything ok with Intel 2016 or without vectorization):
% ifort -qopt-report=5 -qopt-report-phase=vec -O -g -traceback tmain.F t.F
% ./a.out

forrtl: severe (174): SIGSEGV, segmentation fault occurred
Image              PC                Routine            Line        Source       
a.out              0000000000402E34  Unknown               Unknown  Unknown
libpthread-2.19.s  00007FE4B5AD58D0  Unknown               Unknown  Unknown
a.out              0000000000402B82  vectorize_bug_             18  t.F
a.out              00000000004029FF  MAIN__                     13  tmain.F
a.out              00000000004028AE  Unknown               Unknown  Unknown
libc-2.19.so       00007FE4B5538B45  __libc_start_main     Unknown  Unknown
a.out              00000000004027A9  Unknown               Unknown  Unknown