At what point during debugging are Functions evaluated?

Chris_H_4 · ‎02-02-2016

I have some code that calls a passed-in function called YDDZIC inside a loop. For 22 iterations this works fine, but on the 23rd it crashes out of the whole program when it hits the END statement of the YDDZIC function. The table below shows a side-by-side comparison of the local variables on the 22nd (left) and 23rd (right) iterations. The obvious difference is that BRTS and X_1 are undefined in the 23rd iteration. It is also notable that I5, IBODY0 and IX0, which are used as array indices, are negative.

However, when I step through the code (also below), the debugger steps all the way to the END statement before it falls over, and I was wondering why it doesn't fail sooner. Is it that the evaluation is actually performed when the END statement is reached? That seems unlikely, because the debugger needs to determine the intermediate variables' values. Any other ideas why it doesn't fall over before reaching the END statement?

      ENTRY YDDZID ( XI )

C     Find the envelope spline bay containing XI.
      I = ISEG2

C     Use the spline coefficients to find area slope at XI.
      I5 = IBODY0 + (I-1)*5
      DX = XI - BODY(I5+1)
      DS =            BODY(I5+3) +
     +     DX*( 2.0 * BODY(I5+4) +
     +     DX*  3.0 * BODY(I5+5))

C     Ward function of X-XI
      DX = X(IX0+1) - XI
      Z  = DX*BR
      W  = (A0 + Z*(A1 + Z*A2)) /
     +     (B0 + Z*(B1 + Z*(B2 + Z*(B3 + Z*B4))))

C     Integrand in W&F linear phi' equation
      YDDZID = W*DS

      RETURN
*     ======>

      END

Edit: I put the variables in as a table, but they came out as a list, so I have attached an image of the variables instead.

Arjen_Markus · ‎02-02-2016

I think you need to show more code - the function is actually an ENTRY, so it may very well be that things go wrong in the subroutine/function that holds that ENTRY.
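For readers who have not met ENTRY before, a minimal sketch of the pattern may help. The names setup, eval and factor are hypothetical and are not taken from the code under discussion. The point is only that the second entry point relies on state left behind by an earlier call to the main function, which is why trouble in the parent routine can surface inside the ENTRY.

```fortran
! Minimal sketch (hypothetical names): a function with a second entry
! point that depends on a value stored by an earlier call.
function setup(scale)
  implicit none
  real :: setup, eval, scale, x
  real, save :: factor = 1.0      ! state shared by both entry points

  factor = scale                  ! first call: remember the scale
  setup  = factor
  return

entry eval(x)
  eval = factor * x               ! later calls: use the remembered scale
end function setup

program entry_demo
  implicit none
  real, external :: setup, eval
  real :: dummy

  dummy = setup(3.0)              ! plays the role of the YDDZIC call
  print *, eval(2.0)              ! plays the role of the YDDZID call
end program entry_demo
```

If factor were clobbered between the two calls, eval would still run and would quietly compute with whatever value it found there.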

Chris_H_4 · ‎02-02-2016

You are correct that it does go wrong in the few lines above; however, that was not really my question, which was more about why it doesn't fall over sooner. Here is the rest of the function:

FUNCTION YDDZIC ( BODY_1, X_1, BRTS )

      INCLUDE 'YDDZINCM.FOR'

      REAL  BODY(1), BODY_1(*), X(1)
      PARAMETER (A0=1., A1=0.0490768,  A2=0.0074994,
     +           B0=2., B1=1.59751756, B2=0.46596355, B3=0.07322506,
     +           B4=0.0070035,  SPI=1.7724538509)

C     +----------------------------------------------------------------+

      CALL XXMADDR( BODY_1, BODY, 4, IBODY0 )
      CALL XXMADDR( X_1,    X,    4, IX0    )

      BR  = SPI/BRTS

C     Next statement is a dummy statement to satisfy the compiler.
      YDDZIC = BR

      RETURN

Arjen_Markus · ‎02-02-2016

All the statements are evaluated/executed as you go past them. The program is not delaying the actual evaluation until return.

I have been puzzling over your question, but I do not quite understand it. So here is a counter question: how do you know that the program behaves differently under the debugger? From your description I understand that things go wrong at the end of the 23rd iteration - in both cases.

mecej4 · ‎02-02-2016

Your description and the fragments of information provided are too sketchy. We do not know what is in the include file, and we do not see the declarations of the variables. We do not know the sequence of calls.

It appears that a number of variables are referenced with no prior definitions that can be seen. Do you have the SAVE attribute specified on any of the local variables? What compiler options did you use? Is it feasible to post the whole code, with all include and data files?

Chris_H_4 · ‎02-02-2016

Arjen - thanks for taking the time to reply. I'm not saying it behaves differently in the debugger; it's just that I can step all the way through the routine to the END statement before it crashes. Yet in the OP, the code at line 8, BODY(I5+1), is trying to find the -3726th element of the array BODY. Why does it not fall over at that point? What is special about the END statement that makes it crash out?

mecej4 - sorry, it is a huge, steaming mound of code, and even if I could post it all (which I can't, for various reasons) no one would be prepared to look through it. There are a large number of variables that appear out of nowhere, and even checking back up the call stack they do not seem to be declared. Yes, /Qsave is set. FWIW, here are the compiler arguments:

/nologo /debug:full /Od /I"../../../../Source/Fortran/Includes" /I"D:\_work\CAPS\CAPS\Projects\Fortran\Service\GPGS\Debug/" /I"D:\_work\CAPS\CAPS\Projects\Fortran\Service\YDSUBS\Debug/" /I"D:\_work\CAPS\CAPS\Projects\Fortran\Service\YDCOMMON\Debug/" /I"D:\_work\CAPS\CAPS\Projects\Fortran\Service\YDTATM\Debug/" /I"D:\_work\CAPS\CAPS\Projects\Fortran\Print\YDBCR1\Debug/" /I"D:\_work\CAPS\CAPS\Projects\Fortran\Print\YDBSF1\Debug/" /I"D:\_work\CAPS\CAPS\Projects\Fortran\Service\XXCOMMON\Debug/" /I"D:\_work\CAPS\CAPS\Projects\Fortran\Service\XXSUBS\Debug/" /DYDDCDM /d_lines /f77rtl /intconstant /fpscomp:general /debug-parameters:all /Qsave /names:uppercase /iface:cref /module:"Debug/" /object:"Debug/" /Fd"Debug\vc120.pdb" /traceback /check:pointer /check:uninit /check:arg_temp_created /check:stack /libs:dll /threads /dbglibs /c

andrew_4619 · ‎02-02-2016

Chris, You haven't made things easy for someone to understand and answer your question! Your original post makes reference to variables that cannot be seen in either code snippet and it is not clear (to me anyway) how the code snippets relate to each other. A single larger (complete) snippet that shows the context of the loop would help.

Are your arrays going out of bounds? Looking at the style of the code (fixed form, implicit typing, obsolete language features), I would imagine you have options that switch off a whole host of checking to enable it to compile....

Chris_H_4 · ‎02-02-2016

Thanks for replying Andrew but it's just not that easy to post something that includes all the stack. There is a loop in a Procedure that calls another procedure which has another loop.

The loop below makes a call to YDDZIQ and passes, amongst other things, YDDZIB, which is a reference to a function:

DO 50 ISEG1 = 2, N2

        XL = XU
        XU = BODY(1,ISEG1+1)

        CALL YDDZIQ( XL, XU, ETOL1, NDIM1, YDDZIB, DDQ, ERRDDQ, AUX1 )

        DQ           = DQ + DDQ
        ERROR(1)     = MAX( ERROR(1), ERRDDQ )
        DSA(ISEG1-1) = DQ

   50 CONTINUE

YDDZIQ looks like this

      SUBROUTINE YDDZIQ( XL, XU, EPS, NDIM, FCT, Y, ERROR, AUX )

C Purpose
C       To compute an approximation for integral(FCT(x), summed
C       over x from XL to XU).
C Input
C       XL     - The lower bound of the interval.
C       XU     - The upper bound of the interval.
C       EPS    - The upper bound of the absolute error.
C       NDIM   - The dimension of the auxiliary storage array AUX.
C                NDIM-1 is the maximal number of bisections of
C                the interval (XL,XU).
C       FCT    - The name of the external function subprogram used.
C                It must be coded by the user. Its argument X
C                should not be destroyed.
C       Y      - The resulting approximation for the integral value.
C       ERROR  - See above.
C       AUX    - An auxiliary storage array with dimension NDIM.

      REAL    AUX(*)

      REAL    X, HH, HD, SM

C       Preparations of Romberg-loop.

      AUX(1) = 0.5*(FCT(XL)+FCT(XU))
      H      = XU-XL
      IF (NDIM-1) 8, 8, 1

    1 IF (H) 2, 10, 2

C       NDIM is greater than 1 and H is not equal to 0.

    2 HH = H
      E  = EPS/ABS(H)
      P  = 1.0
      JJ = 1

      DO 7 I= 2, NDIM
        Y  = AUX(1)
        HD = HH
        HH = 0.5*HH
        P  = 0.5*P
        X  = XL+HH
        SM = 0.0

        DO 3 J = 1, JJ
          SM = SM+FCT(X)
    3   X = X+HD

        AUX(I) = 0.5*AUX(I-1)+P*SM

C         A new approximation of integral value is computed by means of
C         trapezoidal rule. Start of Rombergs extrapolation method.

        Q  = 1.0
        JI = I-1

        DO 4 J = 1, JI
          II = I-J
          Q  = Q+Q
          Q  = Q+Q
    4   AUX(II) = AUX(II+1)+(AUX(II+1)-AUX(II))/(Q-1.0)

C         End of Romberg-step.

        DELTA = ABS(Y-AUX(1))

        IF (I-3) 7, 5, 5

    5   IF (DELTA-E) 10, 10, 7

    7 JJ    = JJ+JJ
    8 ERROR = DELTA/E
    9 Y     = H*AUX(1)


      RETURN
*     ======>

   10 ERROR = 0.0

      GO TO 9
*     =======>

      END

In there is the line SM = SM+FCT(X) where FCT is the reference to YDDZIB. In YDDZIB (several hundred lines of code) there is

DO ISEG2 = 2, ISEG1

C       Integration limits - extent of spline segment
        XL = XU
        XU = MIN( BODY(IBODY0+(ISEG2*5)+1), X )

C       Evaluate integral in linear phi' equation
        CALL YDDZIR( XL, XU, YTOL, NDIM, YDDZID, W1, ERRC, AUX(IAUX0+1))
        W = W+W1
        ERROR(2+IERROR0) = MAX( ERROR(2+IERROR0), ERRC )

      END DO

Which calls YDDZIR with a reference to YDDZID, which in turn calls

AUX(1) = 0.5*(FCT(XL)+FCT(XU))

in which FCT is a reference to YDDZID. YDDZIC and YDDZID look like this:

      FUNCTION YDDZIC ( BODY_1, X_1, BRTS )

      INCLUDE 'YDDZINCM.FOR'

      REAL  BODY(1), BODY_1(*), X(1)
      PARAMETER (A0=1., A1=0.0490768,  A2=0.0074994,
     +           B0=2., B1=1.59751756, B2=0.46596355, B3=0.07322506,
     +           B4=0.0070035,  SPI=1.7724538509)

C     +----------------------------------------------------------------+

      CALL XXMADDR( BODY_1, BODY, 4, IBODY0 )
      CALL XXMADDR( X_1,    X,    4, IX0    )

      BR  = SPI/BRTS

C     Next statement is a dummy statement to satisfy the compiler.
      YDDZIC = BR

      RETURN
*     ======>


C       Entry YDDZID - Evaluate W*SDASH at station XI.
*       ==============================================

      ENTRY YDDZID ( XI )

C     Find the envelope spline bay containing XI.
      I = ISEG2

C     Use the spline coefficients to find area slope at XI.
      I5 = IBODY0 + (I-1)*5
      DX = XI - BODY(I5+1)
      DS =            BODY(I5+3) +
     +     DX*( 2.0 * BODY(I5+4) +
     +     DX*  3.0 * BODY(I5+5))

C     Ward function of X-XI
      DX = X(IX0+1) - XI
      Z  = DX*BR
      W  = (A0 + Z*(A1 + Z*A2)) /
     +     (B0 + Z*(B1 + Z*(B2 + Z*(B3 + Z*B4))))

C     Integrand in W&F linear phi' equation
      YDDZID = W*DS

      RETURN
*     ======>

      END

mecej4 · ‎02-02-2016

Chris, try to see things from our perspective: you show a table of variable values, but you have not pointed out the current statement. Since the values of some variables may change after a statement, knowing the current statement is important. In the list of variables, those variables which have not yet been defined will have values displayed (such as -3726) without any validity or significance. With saved local variables, and possibly global variables being involved through the included file, it is perfectly reasonable that the function may work for n calls and fail during the n+1-th call.

If all your local variables are saved, bugs in other subprograms may have caused them to be corrupted, so the outlook for finding and fixing bugs is bleak in your case. With a large Frankencode, made up of swaths of code plastered in, I think that peering at variable values in a debugger is of little value, and thinking that things are fine just because you did not see the program crash is overly optimistic.

In order to resolve the problem, you may have to create a reproducer of modest size that replicates the problem. Doing this can be time-consuming, but it can be done. There have been several times where I have started with a misbehaving program with about 10 K lines of code and whittled it down to a 300-500 line reproducer.

Chris_H_4 · ‎02-02-2016

Again thanks for the reply, I was trying to work back from the point at which it fails.

The thing that concerns me at the moment and the thing I was trying to ask originally is that the debugger steps all the way through from Entry YDDZID until the END statement and then the program crashes out. It does not crash before that point, it only crashes at the END statement. This seemed odd to me because there is clearly some corrupt data but the debugger carries on anyway right up to the END statement.

So I was trying to ask if anyone knew why it would fail on reaching the END statement and not before.

Chris_H_4 · ‎02-02-2016

BTW the variable values in the attachment are as things stand at the RETURN statement.

mecej4 · ‎02-02-2016

With a compiled language such as Fortran, the mapping between the CPU instruction pointer (or code offset, if you prefer) and source code line number is inexact. As the optimization level increases, the mapping becomes hazier. I have often seen the current statement indicator move backwards, and even into declaration statements, when executing code in a symbolic debugger such as the VS debugger. If, in addition, the stack has become corrupted or array bounds have been exceeded, you are looking at the program through a fungus-afflicted lens. My opinion is that a symbolic debugger may not even be an appropriate tool to use until you have localized the problem and constructed a smaller reproducer.

The END and RETURN statements may correspond to machine code which releases the local stack, restores the caller's frame pointer to a value which is popped off the stack, and executes a RET instruction, using a value popped off the stack as the return address. If the stack has become corrupted, the RETURN statement is the place where spectacular things can happen.
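A related point on why the bad subscript need not trap where it is used: unless bounds checking is compiled in (the option list posted above has several /check: settings but not /check:bounds), BODY(I5+1) is plain address arithmetic, and a read from any mapped address returns garbage rather than faulting. A minimal sketch of an explicit guard follows. The names checked and index_guard and the array size are made up for illustration; it shows the idea and is not a drop-in fix for the code under discussion.

```fortran
! Minimal sketch (hypothetical names): validate a computed subscript
! before using it, so a bad index is reported where it is first used.
program index_guard
  implicit none
  real    :: body(20), v
  integer :: i5, k

  body = [(real(k), k = 1, 20)]

  i5 = 10                          ! a plausible offset
  v  = checked(body, size(body), i5 + 1)
  print *, 'value = ', v

  i5 = -3727                       ! i5 + 1 = -3726, the subscript quoted in the thread
  v  = checked(body, size(body), i5 + 1)
  print *, 'value = ', v           ! not reached: the guard stops the program first

contains

  ! Return arr(idx) if idx lies in 1..n; otherwise report and stop.
  real function checked(arr, n, idx)
    integer, intent(in) :: n, idx
    real,    intent(in) :: arr(n)
    if (idx < 1 .or. idx > n) then
      print *, 'subscript out of range: ', idx, ' not in 1..', n
      error stop 1
    end if
    checked = arr(idx)
  end function checked

end program index_guard
```

With a guard like this the failure is reported at the first bad subscript instead of surfacing later at RETURN or END.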

andrew_4619 · ‎02-02-2016

OK, I followed the code. I presume in the OP the locals are the values at the END of YDDZIC, but it is clear that variables are screwed up well before this point. I guess the crash comes when it tries to clear up the screwed stack at the END statement. You need to find the point at which your vars get screwed up! As mecej4 points out, that may be quite hard work!

Array bound overruns, passing mismatched types, mismatched "interfaces" when passing functions, etc. will be the cause. Have you tried the check interfaces compile option? That might show up some interesting errors.

jimdempseyatthecove · ‎02-02-2016

This is what I see:

FUNCTION YDDZIC ( BODY_1, X_1, BRTS )
...
      REAL  BODY(1), BODY_1(*), X(1)
! *** BODY and X reserves 1 real, either on stack or SAVE depending on options
...
      CALL XXMADDR( BODY_1, BODY, 4, IBODY0 ) ! *** presumably writes to BODY
      CALL XXMADDR( X_1,    X,    4, IX0    ) ! *** presumably writes to X
...
      RETURN
...
      ENTRY YDDZID ( XI ) !*** XI not defined, possibly in your INCLUDE file?
...

      DX = XI - BODY(I5+1) ! *** BODY wasn't declared save, and only had dimension of 1
...
      DX = X(IX0+1) - XI   ! *** X wasn't declared save, and only had dimension of 1
...
      RETURN
...
END

Point 1) You did not specify compiler options, so we do not know whether BODY and X are SAVE or not.
Point 2) Until we saw your post #8, no one had a clue as to what BODY and X were (and a lot of other information).
Point 3) You clearly have indexing out of range.
Point 4) If the unseen subroutine XXMADDR is filling in subscripts other than 1 for BODY and X, it is clearly in error.

Jim Dempsey

Chris_H_4 · ‎02-02-2016

mecej4, Andrew. Thanks for the pointers (sorry about the pun) but that is what I was looking for, the upshot being the problem is not in the YDDZID routine even though it manifests itself there and is likely to be related to memory issues corrupting the stack so I need to go looking elsewhere. If I turn on any more checking it either doesn't compile or stops long before I get this far.

Jim - thanks for the critique, but I have tens of thousands of lines of code like this, dating back 25 to 30 years. I accept that some details of this routine could be patched up, missing variables declared, etc., but I suspect that would only get me past this routine and I would encounter something similar in the next. Given that, however incorrect the code, it does actually work when built in CVF, ideally I would like to fix this by finding the correct compiler options rather than changing the code.

jimdempseyatthecove · ‎02-02-2016

In your post #6 I see option /Qsave

This will place BODY(1) and X(1) into a private data space for subroutine YDDZIC. Whether BODY is first and X is second, and whether they are adjacent, is implementation dependent. If, for example, X(1) immediately follows BODY(1), then and only then you may have effectively EQUIVALENCE'd BODY(2) with X(1), meaning that writing to BODY out of bounds writes into X, and writing into X modifies what out-of-bounds references to BODY see.

Often, in old programs, when the size of the array was not known, you will see SUBROUTINE arguments specified with an array size of (1) as opposed to an array of unknown extent (*). This requires a compiler that does not generate bounds-checking code.

In your case, BODY and X are local variables (or should I say now are local variables after modifications?). What is unknown is whether BODY and X were intended to be EQUIVALENCE'd and/or indexed higher than 1 (which appears to be so). If they are not intended to be EQUIVALENCE'd, then you may need to specify an array size for BODY and X that is .GE. the maximum possible size.

Jim Dempsey
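As a hedged illustration of the (1) versus (*) point for dummy arguments, here is a minimal sketch with hypothetical names (total, work). It shows the assumed-size form, where the caller supplies the real extent and the routine is told separately how many elements it may touch.

```fortran
! Minimal sketch (hypothetical names): an assumed-size dummy array a(*)
! in place of the older a(1) idiom for "size not known here".
program assumed_size_demo
  implicit none
  real :: work(5)

  work = [1.0, 2.0, 3.0, 4.0, 5.0]
  print *, total(work, 5)          ! sums all five elements of work

contains

  ! a(*) declares that the extent comes from the actual argument, so
  ! subscripts above 1 are legitimate as long as they stay within it.
  real function total(a, n)
    integer, intent(in) :: n
    real,    intent(in) :: a(*)
    integer :: i
    total = 0.0
    do i = 1, n
      total = total + a(i)
    end do
  end function total

end program assumed_size_demo
```

Note that this applies to dummy arguments only. A local array declared as BODY(1) really does own just one element, whatever subscript is later applied to it.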

mecej4 · ‎02-02-2016

Chris, I still think that you should consider posting the entire code+data for a test case. I understand that this may be impossible if your employer forbids this or parts of the code are proprietary.

If the code has bugs that did not surface when it was compiled with CVF, there is no guarantee that the bugs will remain dormant when you switch to a different compiler, even if you used options such as /Qsave to maintain compatibility with the behavior of CVF-compiled code. In other words, there is no compiler option that will magically suppress bugs in the code. Tens of thousands of lines of code is moderate size, and who knows, some reader of this forum may be willing to take it on.

The story may have a bright side, as well: if the code ran fine with CVF, it did not use many of the newer features of Fortran that were added to the language in the last decade or so, and it is quite likely that the bug can be found by using the improved error-checking capabilities of modern compilers.

Steven_L_Intel1 · ‎02-02-2016

Is this entirely a Fortran program or is it mixed-language? A flag went up for me when you say that you previously used CVF - if you converted the project from CVF, the default calling convention would be set to STDCALL. This would be ok if the entire program agreed to use STDCALL, but your reference to callbacks and passed procedures makes me wonder if at least some of the program assumes the Intel Fortran default of the C convention.

Try this - in the project properties, change Fortran > External procedures > Calling convention to "Default" (set String Length Argument Passing to "Default" as well) and rebuild. The symptoms you describe are consistent with stack corruption due to calling convention mismatch.

[Caveat - if you are building a 64-bit (x64) program, this does not apply.]

You should also turn on Diagnostics > Check Routine Interfaces.
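For passed procedures such as the FCT argument, the kind of mismatch that interface checking looks for can also be ruled out in source by giving the dummy procedure an explicit interface. The following is a minimal sketch under stated assumptions: the module and procedure names (integrand_checks, integrand, trapezoid, square) are hypothetical, and the two-point trapezoid only mirrors the first line of the Romberg routine shown earlier.

```fortran
! Minimal sketch (hypothetical names): an explicit interface for a
! function passed as an argument, so the compiler can check each call.
module integrand_checks
  implicit none

  abstract interface
    real function integrand(x)
      real, intent(in) :: x
    end function integrand
  end interface

contains

  ! Two-point trapezoid estimate of the integral of fct over (xl, xu).
  real function trapezoid(fct, xl, xu)
    procedure(integrand) :: fct
    real, intent(in)     :: xl, xu
    trapezoid = 0.5 * (fct(xl) + fct(xu)) * (xu - xl)
  end function trapezoid

  real function square(x)
    real, intent(in) :: x
    square = x * x
  end function square

end module integrand_checks

program interface_demo
  use integrand_checks
  implicit none
  print *, trapezoid(square, 0.0, 2.0)
end program interface_demo
```

Passing a function with a different argument list or result type to trapezoid is then rejected at compile time rather than showing up as a damaged stack at run time.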

Chris_H_4 · ‎02-02-2016

Jim - the call to XXMADDR attempts to find the memory offset between the two arrays and returns it, so BODY is effectively equivalenced to BODY_1 via the offset returned in IBODY0. BODY is not written to, and it is unclear to me why anyone would want to code things this way. Below is the code for XXMADDR. Just for info, this was already discussed at https://software.intel.com/en-us/forums/intel-visual-fortran-compiler-for-windows/topic/606997

      SUBROUTINE XXMADDR( X_1, X, N, IX0 )

C Purpose
C     Find the element IX0 in the dummy array X corresponding to the
C     array X such that X(I+IX0) = X_1(I) for all I.

C Method
C     The function %LOC is used on the VAX, IG2LOC on the IBM.  These
C     find the addresses of X and X_1.
C     X_1, X must be non-character variables on the VAX in particular.

C Input
C  ?*N  X_1(*) - The array that is needed to be referenced via X.
C                 Usually this will have been passed into the previous
C                 subroutine through a dummy ENTRY point.
C  ?*N  X(*)   - Dummy array
C  I*4  N      - No. of bytes for each element of X_1 and X

C Output
C  I*4  IX0 - Location in X such that X(I+IX0) = X_1(I) for all I.

C Programmed by A.Penwill, June '91



      INTEGER  X_1(*), X(*)

      IDIFF = %LOC(X_1) - %LOC(X)

C       Test that the difference between the addresses of X_1 and X is
C       divisible by N.

      IF ( MOD( IDIFF, N ) .NE. 0 ) CALL XXERR( 'XXMADDR1', 26,
     1                                 'Alignment error in XXMADDR', 4 )

      IX0 = IDIFF / N

      RETURN
*     ======>

      END
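The offset arithmetic in XXMADDR can be tried out in isolation. The sketch below is an illustration only: it uses C_LOC from ISO_C_BINDING in place of the %LOC extension, the names pool and offset_demo are hypothetical, and it measures the offset between two elements of one array, where the result is well defined, rather than between two unrelated arrays as the original does.

```fortran
! Minimal sketch (hypothetical names): address difference divided by
! element size gives an element offset, as in XXMADDR.
program offset_demo
  use, intrinsic :: iso_c_binding, only: c_loc, c_intptr_t
  implicit none
  real, target        :: pool(10)
  integer(c_intptr_t) :: addr_base, addr_elem, idiff
  integer             :: nbytes, ix0

  pool   = 0.0
  nbytes = storage_size(pool) / 8          ! bytes per element

  ! Reinterpret the two addresses as integers.
  addr_base = transfer(c_loc(pool(1)), addr_base)
  addr_elem = transfer(c_loc(pool(4)), addr_elem)
  idiff     = addr_elem - addr_base

  ! Same divisibility test as XXMADDR.
  if (mod(idiff, int(nbytes, c_intptr_t)) /= 0) error stop 'alignment error'

  ix0 = int(idiff / nbytes)
  print *, 'element offset = ', ix0        ! pool(1 + ix0) is pool(4)
end program offset_demo
```

An offset obtained this way between two separate arrays is only meaningful while both stay at the same addresses, so an IBODY0 computed in one call and reused in a later call through the ENTRY depends on BODY and the actual argument not having moved in between.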

Chris_H_4 · ‎02-02-2016

Steve - I have tried both "Default" and "C, REFERENCE (/iface:cref)" for the calling convention and it fails in the same place, but you are correct that it is mixed code. There is no default for String argument length passing, but I have tried both ways, although in these particular calls there are no strings passed.

If I turn on Diagnostics > Check Routine Interfaces it won't compile and complains about "#6633: The type of the actual argument differs from the type of the dummy argument" in cases where the argument is, for example, INTEGER KINK(12) and the dummy argument is INTEGER KINK(*).

Steven_L_Intel1 · ‎02-02-2016

You should look closer at the error message - it is rarely wrong.

When you say it is mixed code - what is it mixed with? How are the Fortran routines declared in the other language and what compile options do you use for that language?