idb - program end without further information

oh_moose · ‎12-22-2007

If a program ends, then I would like to know where and be able to check the contents of all variables. Unfortunately all I get is a simple message like "Process has exited with status #". The debugger does not point to the location where the program ended, no traceback is available and no variables are accessible. That does not help. Did I miss something, or is there something missing in idb?

jimdempseyatthecove · ‎12-23-2007

oh_moose,

Often a program (your program) will contain defensive code (what are commonly called sanity checks).

IF(something wrong) STOP

Often a program written in haste will omit sanity checks:

ALLOCATE(ARRAY(nElements))

which aborts the program if the allocation fails.

With defensive code

ALLOCATE(ARRAY(nElements), STAT=iErr)
IF(iErr .NE. 0) STOP

This is good practice. However, as you observed, issuing a STOP is not productive in debugging software errors.

What I prefer to do is to replace the STOP with a call to a subroutine named DOSTOP, which takes an argument.

IF(iErr .NE. 0) CALL DOSTOP('FileName - Memory Allocation Error')

And where

SUBROUTINE DOSTOP(CVAR)
 USE YourIoMod
 CHARACTER*(*) CVAR

 WRITE(IOUERR,*) CVAR
 WRITE(*,*) CVAR

 ! place break point here
 STOP 'DOSTOP'
 RETURN
END SUBROUTINE DOSTOP

Compile even the release version with debug information but compile DOSTOP without optimizations. DOSTOP is used in all versions.

When the breakpoint is hit, you can examine variables and the call stack. If you have located where the error came from but optimizations prevent you from determining the cause, recompile the file that produced or detected the error with optimizations off, then re-run to reproduce the error. Work your way out and back to the source of the problem. Once it is found and corrected, you can re-optimize those files.

Note, if the error is potentially correctable, you can set "Next Statement" at the RETURN and then step out of DOSTOP. Once in the calling subroutine, you can move "Next Statement" again in an attempt to follow the programming sequence that caused the error.
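
A minimal, self-contained sketch of the pattern described above, for readers who want something to compile and try. It is not the original code: the module name stop_mod, the program name demo, and the message text are made up for illustration, and ERROR_UNIT from ISO_FORTRAN_ENV stands in for the IOUERR unit that would come from YourIoMod.

```fortran
module stop_mod
  use, intrinsic :: iso_fortran_env, only: error_unit
  implicit none
contains
  ! Single exit point for fatal errors. Set a debugger breakpoint on
  ! this routine so the caller's stack frames are still live when it hits.
  subroutine dostop(msg)
    character(len=*), intent(in) :: msg

    write(error_unit, '(a)') msg

    ! place break point here
    stop 'DOSTOP'
    return   ! not reached normally; a target for "set next statement"
  end subroutine dostop
end module stop_mod

program demo
  use stop_mod, only: dostop
  implicit none
  real, allocatable :: array(:)
  integer :: nElements, iErr

  nElements = 1000   ! hypothetical size

  ! With STAT= the program keeps control if the allocation fails.
  allocate(array(nElements), stat=iErr)
  if (iErr /= 0) call dostop('demo.f90 - Memory Allocation Error')

  array = 1.0
  print *, 'allocated', size(array), 'elements'
end program demo
```

To see the breakpoint in action, one could force the failure branch (for example by temporarily calling dostop unconditionally) and run under the debugger with a breakpoint on dostop.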

Jim Dempsey

oh_moose · ‎01-04-2008

Hi Jim,

Happy New Year to you and everybody else here.

My software was written for VMS and now I am trying to make it work on Linux. I do use services like LIB$SIGNAL and LIB$STOP (if you are familiar with those) and have plenty of consistency checks in my source code. A microsecond spent on those checks saves me hours of debugging. I have a full implementation of these two VMS services for Linux (including a Message utility for VMS-style .MSG files, if someone is interested). It would be nice to have access to the ifort traceback routine (the documentation is missing). It would be even better to have the debugger stop the program at the right point. But the primitive Linux "backtrace" works as a temporary hack for now and I can easily set a break point at LIB$STOP. That is not the problem.

The problem occurs when the Fortran RTL decides to end the program. On a VMS system at least I get the traceback and know where it happens. With the VMS debugger I may even have access to various variables (if they are not optimized away - hello, seen that gem before?).

jimdempseyatthecove · ‎01-04-2008

Oh_moose,

Welcome back,

Most internal IVF errors are echoed to the "console". If you run as a console application then the error messages would tend to hang around for you to read. If your application is windowed (non-console) then the message may end up in NULL. I believe you can set an environment variable to indicate a file name for an IOUNIT. You should be able to route the "console" output to a file for your review after the program takes a dive. If setting environment variables:

FOR_PRINT is stdout - write(*,f) iolist and print f,iolist
FOR_TYPE is stdout - type f,iolist
FORT0 is stderr - write(0,f) iolist
FORT6 is stdout - write(6,f) iolist

Depending on what the author used, one of those should have the error message. Hopefully the author writes some message instead of simply stopping with an error number.

Jim Dempsey

jimdempseyatthecove · ‎01-04-2008

Also

STOP produces a call to "_for_stop_core". So you could potentially place a break point there. Don't quite know what DLL is going to do. You may also have to look at the other exit/abort/terminate library function calls and place break points on those as well.

Jim Dempsey

oh_moose · ‎01-05-2008

(Redirecting the output is not the issue here.)

Setting a breakpoint at for_stop_core looks like a useful workaround (without a leading underscore here, good! and hey, DLL is Windows;-). I assume this is a routine which the Fortran RTL calls to execute a Fortran STOP statement. But this does not help if the Fortran RTL detects a problem and ends the program (no STOP involved here).

I tried to set a break point at "exit" (unix exit that is). Good thinking, Jim. However, the first time I got no useful traceback and all variables were optimized away. I have not managed to repeat this since the debugger keeps crashing. Maybe the debugger is telling me to take a break.

Steve, you know how the VMS debugger works. That is what I want.

Steven_L_Intel1 · ‎01-07-2008

oh_moose, I am familiar with the VMS debugger but it has been many years since I used it. I don't recall being able to see any information after a program exited.

My advice for this situation, which I think I have offered before, is to write your own "stop" routine which you call instead of using stop directly. Then set a breakpoint on the routine. That routine could even call TRACEBACKQQ to issue a traceback. I do not recommend trying to set breakpoints on the library routines.
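
A rough sketch of such a stop routine, assuming Intel Fortran (TRACEBACKQQ lives in the IFCORE module and is not available in other compilers). The names my_stop_mod, my_stop and traceback_demo, and the check on n, are hypothetical and only there to make the example complete. Passing USER_EXIT_CODE=-1 asks TRACEBACKQQ to return to the caller instead of terminating, so the routine can still be used as a breakpoint target.

```fortran
module my_stop_mod
  use ifcore, only: tracebackqq   ! Intel Fortran specific module
  implicit none
contains
  ! Call this instead of a bare STOP, and set the breakpoint here.
  subroutine my_stop(msg)
    character(len=*), intent(in) :: msg

    ! Print msg followed by a traceback of the current call stack.
    ! user_exit_code = -1 makes TRACEBACKQQ return to the caller.
    call tracebackqq(string=msg, user_exit_code=-1)

    stop 1
  end subroutine my_stop
end module my_stop_mod

program traceback_demo
  use my_stop_mod, only: my_stop
  implicit none
  integer :: n

  n = -1   ! made-up bad value to trigger the check

  if (n < 0) call my_stop('traceback_demo: n must not be negative')

  print *, 'n =', n
end program traceback_demo
```

Building with debug information and the -traceback option should let the traceback show source file and line numbers rather than only addresses.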

For a program that ends with an exception, I suggest you look at the SIGNALQQ library routine to see if this helps.

oh_moose · ‎01-07-2008

The VMS debugger does intercept exceptions and you can access all variables. I find this feature quite useful. For example, the Fortran RTL would produce something like
%FOR-E-OUTCONERR, output conversion error
(or something more serious which would eventually cause an end of the program execution)
and then you examine the corresponding expression.

Why would a self-written "stop" routine help in situations where the Fortran RTL ends the program with a message like "Process has exited with status 114"?

TRACEBACKQQ looks interesting. For the sake of completeness, in case someone else searches for this feature and ends up here, I found that the following web page contains some useful information.
http://www.intel.com/software/products/compilers/flin/docs/main_for/mergedProjects/bldaps_for/error_handling_ovw.htm

A break point at SIGNALQQ does not intercept the exception "Process has exited with status 114". The program simply ends and the debugger does not show me where it ended, not to mention the lack of access to any variables since the program is no longer running.

Steven_L_Intel1 · ‎01-07-2008

In the VMS debugger, I agree that if an exception is signalled and if the debugger gets control, all the stack frames are still active and you can get variable information. That is not the case if the program exits with STOP, which seemed to be part of what you want.

My mention of SIGNALQQ was NOT to suggest that you set a breakpoint on it - that will do no good at all. Rather, SIGNALQQ is how you can supply your own exception handler, similar to LIB$ESTABLISH on VMS, though it is a "global" handler (like VMS vectored handlers) and not stack-frame based like VMS.

If you simply get a "process has exited" message, then either the program returned from the main program or something else unusual happened which bypassed the normal control mechanisms.

oh_moose · ‎01-08-2008

A break point set at for_emit_diagnostic allows the debugger to intercept any exception that produces a final "process has exited" message. I wish the Fortran RTL would check whether it is running under the debugger and, if so, transfer control to the debugger.
Problem solved.

For my Linux implementation of LIB$SIGNAL I am going to add a bit of assembler code to check if the program runs under the control of the debugger and then perform an Int 3 to transfer control to the debugger for warnings, errors and fatal messages.

Thanks for reading my mind regarding LIB$ESTABLISH. I will need it soon.

Steven_L_Intel1 · ‎01-08-2008

I encourage you to submit feature requests to Intel Premier Support. This is how we track them and tie them to customer requests. If you do so, please reference T70637-CP for user exception handling, and T80479-CP for having the debugger stop when an error is signalled by the RTL.