Intel® Fortran Compiler
Build applications that can scale for the future with optimized code designed for Intel® Xeon® and compatible processors.

Optimization creates infinite loop

mecej4
Honored Contributor III
1,319 Views

Users of numerical analysis packages such as those at www.netlib.org, especially the codes in the ACM TOMS group, often find themselves contending with Fortran 77 subroutines and functions that are now obsolete because those were replaced with intrinsics such as RADIX, EPSILON, HUGE, TINY, etc., in Fortran 90 and later. Although the proper fix is to rework the old code, replacing the ad hoc pieces of code with RADIX, etc., that is not a chore that one wishes to undertake in every case -- often, we just want to build and run the old code with minimum effort.
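For reference, here is a minimal sketch of what the "proper fix" looks like when the old probing routines are replaced by the standard inquiry intrinsics. The program name FPINFO and the variable X are placeholders of mine, not anything from the TOMS codes; only the kind of X matters to these intrinsics, not its value.

```fortran
      PROGRAM FPINFO
!     Query floating-point properties with the Fortran 90 inquiry
!     intrinsics instead of probing them with arithmetic.
!     FPINFO and X are hypothetical names used for illustration.
      IMPLICIT NONE
      DOUBLE PRECISION X
      X = 0.0D0
      WRITE(*,*) 'RADIX   = ', RADIX(X)
      WRITE(*,*) 'EPSILON = ', EPSILON(X)
      WRITE(*,*) 'TINY    = ', TINY(X)
      WRITE(*,*) 'HUGE    = ', HUGE(X)
      END PROGRAM
```

Because these are inquiry functions evaluated from the type's model parameters, there is no run-time arithmetic for an optimizer to simplify away.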

Here is an example, where the user's purpose is defeated by the high quality of the IFort optimizer. I extracted a reproducer from www.netlib.org/toms, file 768.gz (TENSOLVE, a solver for simultaneous nonlinear equations). The test code is intended to output the floating point base (radix) of 2. With other compilers, and with /Od with IFort, it does. However, with /Ot or the default /fast, the program goes into an infinite loop, as did one of the test problem runs from TOMS-768 compiled with IFort.

      PROGRAM PBETA
      IMPLICIT NONE
      WRITE(*,*)'IBETA = ',IBETA()

      CONTAINS
      INTEGER FUNCTION IBETA()
C
C     returns radix(0d0)
C
      IMPLICIT NONE
      INTEGER ITEMP
      DOUBLE PRECISION A, B, TEMP, TEMP1
      DOUBLE PRECISION ZERO, ONE
      DATA ZERO, ONE/0.0D0, 1.0D0/

      A = ONE
      B = ONE
   10 CONTINUE
      A = A + A
      TEMP = A + ONE
      TEMP1 = TEMP - A
      IF (TEMP1-ONE .EQ. ZERO) GO TO 10
   20 CONTINUE
      B = B + B
      TEMP = A + B
      ITEMP = INT(TEMP-A)
      IF (ITEMP .EQ. 0) GO TO 20
      IBETA = ITEMP
      RETURN
      END FUNCTION
      END PROGRAM

The portion of the disassembly that corresponds to the lines from statement 10 to the IF statement four lines later is as follows. In effect, the optimizer replaces the IF statement before statement 20 with IF (0d0 .EQ. 0d0) GO TO 10. It similarly replaces the two statements TEMP = A + B and ITEMP = INT(TEMP-A) (lines 25 and 26 of the source listing) with, in effect, ITEMP = INT(B).

  00000076: 0F 28 CA           movaps      xmm1,xmm2     ; A=ONE
  00000079: 66 0F EF C0        pxor        xmm0,xmm0     ; should have been TEMP1 - ONE
  0000007D: 66 0F 2E C0        ucomisd     xmm0,xmm0     ; comparing 0D0 with 0D0
  00000081: F2 0F 58 C9        addsd       xmm1,xmm1     ; A=A+A
  00000085: 7A 02              jp          00000089      ; never happens
  00000087: 74 F4              je          0000007D      ; always true
  00000089: F2 0F 58 D2        addsd       xmm2,xmm2     ; B=B+B
  0000008D: F2 0F 2C C2        cvttsd2si   eax,xmm2

Lines 3 to 6 of the disassembly constitute the infinite loop. The loop is entered with xmm0 = 0 (from the PXOR), and xmm0 is never changed in the loop.

The compiler is allowed to make such transformations, and it cannot be faulted on grounds of standards conformance. In fact, it could have continued its aggressive work and replaced the jp/je pair with an unconditional jmp. Perhaps it could also have removed the ucomisd instruction, which is no longer needed. However, after optimizing the code the compiler can see that it has created an infinite loop, at which point it is probably worth issuing a warning to the unsuspecting user.

After all, when a program built from about 9000 lines of code "works" with competing compilers, hangs with IFort/default-options but works with IFort /Od, one may suspect a compiler bug. Nor is it trivial to locate the problem in the source code. What I did was to use the fsplit utility on the source files, and I then compiled all pieces with /Od for the base run. I then compiled each split file with /Ot, until I found the one that caused the infinite loop to occur.

9 Replies
FortranFan
Honored Contributor III

Deleted.

Steven_L_Intel1
Employee

Nope. I've seen this issue for many years, even back to the DEC and Compaq days. This is just an incorrect assumption in the source code. The optimizer does value propagation, and mathematically TEMP1-ONE will always be zero. That it might not be so computationally is another thing. These routines that try to determine FP properties by making such assumptions are simply obsolete and should not be used.

If you insist on using such routines, put them in a source file compiled without optimization and with /fp:strict; even then there are no guarantees.
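A source-level variant of the same idea, offered only as a sketch and not as something Steve suggested: mark the intermediates as VOLATILE (Fortran 2003), so that each reference has to go through memory and the compiler should not be able to fold TEMP1-ONE to zero. The names PBETAV and IBETAV are mine; the body is otherwise IBETA from the original post, with TEMP-A stored into TEMP1 in the second loop as well. Like the /Od and /fp:strict route, this carries no guarantees.

```fortran
      PROGRAM PBETAV
      IMPLICIT NONE
      WRITE(*,*) 'IBETA = ', IBETAV()

      CONTAINS
      INTEGER FUNCTION IBETAV()
!     Same probe as IBETA, but TEMP and TEMP1 are VOLATILE so that
!     the stored (rounded) values are reloaded at every reference.
!     Assumption: this keeps the optimizer from simplifying
!     (A + ONE) - A algebraically. It is not guaranteed.
      IMPLICIT NONE
      INTEGER ITEMP
      DOUBLE PRECISION A, B
      DOUBLE PRECISION, VOLATILE :: TEMP, TEMP1
      DOUBLE PRECISION ZERO, ONE
      DATA ZERO, ONE/0.0D0, 1.0D0/

      A = ONE
      B = ONE
!     Double A until adding ONE to it is lost to rounding.
   10 CONTINUE
      A = A + A
      TEMP = A + ONE
      TEMP1 = TEMP - A
      IF (TEMP1-ONE .EQ. ZERO) GO TO 10
!     Double B until A + B is distinguishable from A.
   20 CONTINUE
      B = B + B
      TEMP = A + B
      TEMP1 = TEMP - A
      ITEMP = INT(TEMP1)
      IF (ITEMP .EQ. 0) GO TO 20
      IBETAV = ITEMP
      RETURN
      END FUNCTION
      END PROGRAM
```

The first loop only terminates because the stored value of A + ONE eventually rounds back to A, which is exactly the computational effect that the mathematical simplification removes.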

mecej4
Honored Contributor III

I appreciate your position, Steve, so let me rephrase the request.

From the user's point of view the problem is that, even if "such routines" are suspected to exist, it is not easy to ascertain which of the many routines in a program is the culprit. It would help if a compiler option, say, "/LP:nnnn" would trap and print a message with an approximate line number for the location when the instruction pointer stays within +/- nnnn/2 bytes for, say, more than a second.

If that is infeasible, it would help to have at least an option to make the run-time print the line number and file name after the user suspects an infinite loop or some such problem and aborts the program with a Ctrl+C. The present Ctrl+C handler does not give a traceback with line numbers even if the sources were all compiled with /traceback.
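Until something like that exists, a do-it-yourself guard in the source is one fallback for probing routines that one already knows about. The sketch below is only an illustration of that idea, not a compiler feature: IBETAG, NITER and MAXIT are hypothetical names, and the cap of 100000 is an arbitrary value chosen to be far larger than any legitimate iteration count for such a probe.

```fortran
      PROGRAM PGUARD
      IMPLICIT NONE
      INTEGER IB
!     Call the function outside the WRITE so that the diagnostic
!     WRITE inside it never runs during another output statement.
      IB = IBETAG()
      WRITE(*,*) 'IBETA = ', IB

      CONTAINS
      INTEGER FUNCTION IBETAG()
!     IBETA with an iteration cap on each probe loop.
!     IBETAG, NITER and MAXIT are hypothetical names.
      IMPLICIT NONE
      INTEGER ITEMP, NITER
      INTEGER, PARAMETER :: MAXIT = 100000
      DOUBLE PRECISION A, B, TEMP, TEMP1
      DOUBLE PRECISION ZERO, ONE
      DATA ZERO, ONE/0.0D0, 1.0D0/

      A = ONE
      B = ONE
      NITER = 0
   10 CONTINUE
      NITER = NITER + 1
      IF (NITER .GT. MAXIT) THEN
         WRITE(*,*) 'IBETAG: loop 10 exceeded MAXIT, giving up'
         STOP 1
      END IF
      A = A + A
      TEMP = A + ONE
      TEMP1 = TEMP - A
      IF (TEMP1-ONE .EQ. ZERO) GO TO 10
      NITER = 0
   20 CONTINUE
      NITER = NITER + 1
      IF (NITER .GT. MAXIT) THEN
         WRITE(*,*) 'IBETAG: loop 20 exceeded MAXIT, giving up'
         STOP 1
      END IF
      B = B + B
      TEMP = A + B
      ITEMP = INT(TEMP-A)
      IF (ITEMP .EQ. 0) GO TO 20
      IBETAG = ITEMP
      RETURN
      END FUNCTION
      END PROGRAM
```

If the optimizer folds the loop test to "always true", the counter still advances, so the program stops with a message naming the routine instead of hanging. It does nothing for routines that nobody suspects, which is the harder half of the problem.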

jimdempseyatthecove
Honored Contributor III

You should be able to use the debugger to attach to a running program (your hung program), then induce a break.

I agree with your sentiments that a program run without debugger but with debugging information should print out the routine address interrupted by Ctrl-C and the call stack if available. The typical end user might not be able to use the debugger but usually can be talked into doing a Ctrl-C.

Jim Dempsey

mecej4
Honored Contributor III

Thanks, Jim. On the user's machine there may be no VS installed. The VS2015 editions are very time-consuming to install, even if there is a free/evaluation version that could be used. The Windows Debugger (WinDbg) is smaller and serves the purpose, but even that is a 300 MB download.

Doing Ctrl+C does provide a stack trace, but it only covers the call chain from NTDLL.DLL to LIBIFCOREMD.DLL, and so is not useful at all.

IanH
Honored Contributor III

Note you don't have to have Visual Studio installed on a user's machine in order to debug an application on that machine under Visual Studio - a remote (same network) debugging facility is available. 

WinDbg has something similar.

Steven_L_Intel1
Employee

The problem with an automatic "break if infinite loop" is that there's no good way to know when it should break. Some programs may have an infinite loop that spans thousands of instructions and some programs may seem to be in a loop but will eventually exit. The notion that a developer would distribute a program without testing it seems absurd to me. And debuggers are a fundamental part of testing.

jimdempseyatthecove
Honored Contributor III

IanH's suggestion of remote debugging may be your only option (other than finding the bug on your development machine). I should have mentioned that a Ctrl-C handler might be useless in a hung multi-threaded application. For this you really need something like the debugger to break into the process and look around.

Jim Dempsey

mecej4
Honored Contributor III

Thanks for all the suggestions. This is not an everyday problem and can be managed one way or the other.
