Intel® Fortran Compiler
Optimizer bug

mecej4
Honored Contributor III

The following test program gives incorrect results when optimization options such as /Ot, /O2, or /fast are used on Windows 8.1 with both the 32-bit and 64-bit IFort compilers, version 15.0.4.221. Some earlier versions of the compiler, such as 11.1.070, give correct results whether or not optimization is requested.

program chk
    implicit none
    integer :: nmtdata=240,nmtfreq=60,i,j,k
    integer :: imtfreqnum(240),indexmtfreq(60)
!
    data imtfreqnum/                                                   &
      1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5, 5,      &
      6, 6, 6, 6, 7, 7, 7, 7, 8, 8, 8, 8, 9, 9, 9, 9,10,10,10,10,      &
     11,11,11,11,12,12,12,12,13,13,13,13,14,14,14,14,15,15,15,15,      &
     16,16,16,16,17,17,17,17,18,18,18,18,19,19,19,19,20,20,20,20,      &
     21,21,21,21,22,22,22,22,23,23,23,23,24,24,24,24,25,25,25,25,      &
     26,26,26,26,27,27,27,27,28,28,28,28,29,29,29,29,30,30,30,30,      &
     31,31,31,31,32,32,32,32,33,33,33,33,34,34,34,34,35,35,35,35,      &
     36,36,36,36,37,37,37,37,38,38,38,38,39,39,39,39,40,40,40,40,      &
     41,41,41,41,42,42,42,42,43,43,43,43,44,44,44,44,45,45,45,45,      &
     46,46,46,46,47,47,47,47,48,48,48,48,49,49,49,49,50,50,50,50,      &
     51,51,51,51,52,52,52,52,53,53,53,53,54,54,54,54,55,55,55,55,      &
     56,56,56,56,57,57,57,57,58,58,58,58,59,59,59,59,60,60,60,60/
!
    data indexmtfreq/                                                  &
      1, 2, 3, 5, 6, 7, 9,10,11,13,14,15,17,18,19,21,22,24,25,26,      &
     27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,      &
     47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66/
!
    do i=1,nMTdata
        k=iMTFreqNum(i)
        do j=1,nMTFreq
            if(k==j)then
                iMTFreqNum(i)=indexMTfreq(j)
            endif
        enddo
    enddo
    write(*,'(A,/,(20i5))')'iMTFreqNum  ',iMTFreqNum(1:nMTData)
    end

The correct output, obtained with /Od, is a non-decreasing sequence of integers in the range 1 to 66, with each value appearing four times. With optimization, the output consists entirely of zeros.

This example code was synthesized on the basis of the problematic code given in this recent thread in this forum: https://software.intel.com/en-us/forums/topic/584169 .

I note that the inner DO loop could be replaced by iMTFreqNum(i)=indexMTfreq(k), but if we did so we would not see the optimizer bug at all.
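For anyone who wants to check a particular compiler build, here is a minimal sketch that runs the nested-loop form and the direct-lookup form side by side and compares them. The program name chk_direct and the arrays looped and direct are made up for illustration. The input array is built with an array constructor rather than the DATA statement above, and the sizes are PARAMETERs, so this variant is not guaranteed to trigger the same optimizer behavior as the original reproducer.

```fortran
program chk_direct
    implicit none
    ! Assumption: sizes are PARAMETERs here; the reproducer uses initialized variables.
    integer, parameter :: nmtdata = 240, nmtfreq = 60
    integer :: i, j, k
    integer :: looped(nmtdata), direct(nmtdata), indexmtfreq(nmtfreq)

    ! Same lookup table as in the reproducer above.
    data indexmtfreq/                                                  &
      1, 2, 3, 5, 6, 7, 9,10,11,13,14,15,17,18,19,21,22,24,25,26,      &
     27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,      &
     47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66/

    ! Each frequency number 1..60 appears four times: 1,1,1,1,2,2,2,2,...
    looped = [((i + 3) / 4, i = 1, nmtdata)]
    direct = looped

    ! Nested-loop form, as in the reproducer.
    do i = 1, nmtdata
        k = looped(i)
        do j = 1, nmtfreq
            if (k == j) then
                looped(i) = indexmtfreq(j)
            end if
        end do
    end do

    ! Direct-lookup form: the inner loop replaced by a single indexed read.
    do i = 1, nmtdata
        direct(i) = indexmtfreq(direct(i))
    end do

    if (any(looped /= direct)) then
        write(*,'(A)') 'MISMATCH: nested-loop result differs from direct lookup'
    else
        write(*,'(A)') 'OK: both forms agree'
    end if
end program chk_direct
```

Building this once with /Od and once with /O2 should show whether the two forms still agree under optimization for a given compiler version.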

3 Replies
Kevin_D_Intel
Employee

Thank you, mecej4. We greatly appreciate your effort in isolating the problem and providing the reproducer. I confirmed your findings. The defect appears to be within the vectorizer. FWIW, in addition to the simpler equivalent you note, placing a NOVECTOR directive ahead of the outer loop also sidesteps the defect.
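As a rough sketch of that workaround, the directive would sit immediately before the outer DO statement. The program name chk_novector is made up for illustration, and the input array is filled with an array constructor instead of the original DATA statement, so this is a stand-in for the reproducer rather than a verified copy of it. Compilers that do not recognize !DIR$ lines treat the directive as an ordinary comment.

```fortran
program chk_novector
    implicit none
    integer :: nmtdata=240, nmtfreq=60, i, j, k
    integer :: imtfreqnum(240), indexmtfreq(60)

    ! Same lookup table as in the reproducer.
    data indexmtfreq/                                                  &
      1, 2, 3, 5, 6, 7, 9,10,11,13,14,15,17,18,19,21,22,24,25,26,      &
     27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,      &
     47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66/

    ! Assumption: array constructor used in place of the original DATA
    ! statement; it yields the same values 1,1,1,1,2,2,2,2,...,60,60,60,60.
    imtfreqnum = [((i + 3) / 4, i = 1, 240)]

    ! Ask the Intel compiler not to vectorize the loop that follows.
!DIR$ NOVECTOR
    do i = 1, nmtdata
        k = imtfreqnum(i)
        do j = 1, nmtfreq
            if (k == j) then
                imtfreqnum(i) = indexmtfreq(j)
            end if
        end do
    end do

    write(*,'(A,/,(20i5))') 'iMTFreqNum  ', imtfreqnum(1:nmtdata)
end program chk_novector
```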

While testing this, I found that the defect appears to be fixed in the upcoming 16.0 compiler release; however, I also found that it resurfaces in our latest internal development branch. Given that finding, I submitted this to Development (internal tracking id below) to ensure the defect is analyzed further and a fix is confirmed.

I will keep your post updated on what I hear from Development.

(Internal tracking id: DPD200375251)

mecej4
Honored Contributor III

Thanks, Kevin. There is a related point about which I am curious -- it is a minor thing and not worth putting any effort into digging up the answer, but if you already know, it'd be nice to have the answer posted.

The originator of post 584169 (Haoping) said that he experienced the optimizer bug with the 11.1.048 compiler. I tried with 11.1.070, and could not get it to issue SSE instructions for integer arithmetic! The question is: which was the first version of IFort (Windows or Linux) with the ability to issue SIMD instructions for integer arithmetic?

Thanks.

Kevin_D_Intel
Employee

It turns out this goes back many versions. Someone with much more historical knowledge than I have helped with this answer. The Pentium 4 was the first processor to support integer SSE instructions, but before that there was MMX. In a very dusty presentation we found that our version 5.0 advertised “Automatic vectorization which would generate SIMD instructions for loops for Intel® MMX™ technology, Streaming SIMD Extensions (SSE and SSE2)”. The Windows 5.0 compiler was released in December 2000; the Linux version followed some time in 2001. It looks like some of this support was tied to the /QxW option back then (/arch:SSE2 today).

We expect that the 11.0 compiler supports vectorization of integer operations and that /arch:SSE2 was enabled by default. If you haven’t already, you could check whether the vectorization report offers any clues.
