Hi,
I am getting an access violation when I compile my code as a release version and run it. Interestingly, this run-time error does not show up in debug mode, so I am having a hard time figuring out the cause of the problem. I am hoping to get some help here.
Here is an explanation of my problem: I am calling a function (Func1) that accepts one integer and two character arguments (StrOne and StrTwo). Func1 resides in a MODULE. Func1 then calls a subroutine (Sub1) twice. The first call to Sub1 returns two integer values based on the value of StrOne; the second call returns two additional integer values based on StrTwo. Sub1 resides in the same MODULE as Func1. The first call to Sub1 appears to return correct values; the 'access violation' error occurs during the second call. Again, this error occurs only in release mode, not in debug mode. To debug the release version, I print out the sent and returned arguments before and after each call to Sub1. Below is the code snippet:
FUNCTION Func1(DELTAT,BT,ET) RESULT(NPeriods)
  INTEGER,INTENT(IN)::DELTAT
  CHARACTER(LEN=*),INTENT(IN)::BT,ET
  INTEGER::NPeriods
  !Local variables
  INTEGER,PARAMETER::IFLAG=0
  INTEGER::BJD,EJD,BMM,EMM
  INTEGER,EXTERNAL::NOPERS
  !debug
  write (*,*) 'debug0'
  write (*,*) BT,BJD,BMM
  write (*,*) ET,EJD,EMM
  !debug
  CALL Sub1(BT,BJD,BMM)
  !debug
  write (*,*) 'debug1'
  write (*,*) BT,BJD,BMM
  write (*,*) ET,EJD,EMM
  !debug
  CALL Sub1(ET,EJD,EMM)
  !debug
  write (*,*) 'debug2'
  write (*,*) BT,BJD,BMM
  write (*,*) ET,EJD,EMM
  !debug
  NPeriods=NOPERS(DELTAT,IFLAG,BJD,BMM,EJD,EMM)
END FUNCTION Func1
SUBROUTINE Sub1(T,JD,MM,STAT)
  CHARACTER(LEN=*),INTENT(IN)::T
  INTEGER,INTENT(OUT)::JD,MM
  INTEGER,OPTIONAL,INTENT(OUT)::STAT
  !Compute JD
  .
  .
  .
  !Handle error code
  IF (PRESENT(STAT)) THEN
  .
  .
  .
  END IF
  !Compute MM
  .
  .
  .
END SUBROUTINE Sub1
What I see with the above code is that after the first call to Sub1, values of BT and ET get modified
to some nonsense values even if they are INTENT(IN) arguments in both Func1 and Sub1, and this
seems to break the program. Again, this does not happen in debug mode and the program runs to
completion with no problems.
I understand that the above explanation is very general. But I am still hoping that somebody might be able to
give me some pointers as to how I can fix this problem. For instance, are there any compiler options that
I might need to experiment with? Are there ways to better debug a release version executable?
If it matters, I am using the latest IVF 9.1 Compiler (build 034, I believe) with VS2003. Thanks for any help.
Jon
The first suspects in such cases are uninitialized variables. Please also switch on /Fortran/Run-time/Check uninitialized variables (that should reveal the error in Debug as well) and see what you get there.
Jugoslav,
I followed both of your suggestions, but I wasn't able to replicate the error. Release mode ran fine when I generated the debug information.
As another desperate trial, I went into the properties of the individual MODULE file and set /Run-time/Runtime error checking to All as the first try and None as the second try (the default was Custom, but I can't find any information in the help file on Custom). They both worked. I guess I can live with this at this point, but having to change an individual file's compilation properties without understanding what is really happening bothers me.
By the way, this code runs fine with CVF version 6.6C. I have been trying to migrate it to IVF.
Jon
Jon,
What happens if you add the STAT variable to the calls?
The idea is not to obtain the status, but to fill out the argument list.
Jim Dempsey
Jim,
Adding the STAT variable to the calls did not help.
Jon
Try zipping your solution (without Debug and Release folders) and posting it here. Perhaps someone will have success... in both reproducing it and fixing it.
>>Uh, a textbook example of a heisenbug...
I bought that book a while back but now wherever I look I cannot find it.
Back on track...
A couple of months ago I was writing mixed-language code and got the calling arguments specified incorrectly. I experienced a problem similar to the one in the original post: the code would "work" in one configuration but not in another. I guess I didn't know what got trashed when it appeared to "work".
I think a "require interfaces" might shed some light on the issue.
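As a rough illustration of what "require interfaces" buys you, here is a minimal sketch that replaces `INTEGER,EXTERNAL::NOPERS` with an explicit interface, so the compiler can check the argument count and types at every call. The stand-in body of NOPERS, the module name `nopers_iface`, and the assumption that all six arguments are default integers passed by reference are hypothetical; the real routine lives in the third-party library and its true signature is not shown in this thread.

```fortran
! Hypothetical stand-in for the library routine, present only so that the
! sketch compiles and links on its own. The arithmetic is a placeholder.
integer function NOPERS(DELTAT, IFLAG, BJD, BMM, EJD, EMM)
  implicit none
  integer, intent(in) :: DELTAT, IFLAG, BJD, BMM, EJD, EMM
  NOPERS = (EJD - BJD) + (EMM - BMM) + DELTAT + IFLAG
end function NOPERS

! Explicit interface (assumed signature) collected in a module, so every
! caller that USEs the module gets compile-time argument checking.
module nopers_iface
  implicit none
  interface
    integer function NOPERS(DELTAT, IFLAG, BJD, BMM, EJD, EMM)
      integer, intent(in) :: DELTAT, IFLAG, BJD, BMM, EJD, EMM
    end function NOPERS
  end interface
end module nopers_iface

program demo_interface
  use nopers_iface
  implicit none
  ! Dropping or adding an argument here becomes a compile-time error
  ! instead of a silent stack mismatch at run time.
  write (*,*) NOPERS(60, 0, 1, 0, 2, 0)
end program demo_interface
```

An interface block only helps if it matches what the library really expects; if the declared signature is itself wrong, the compiler will happily enforce the wrong contract.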
Specifically declare an interface to the subroutine. See what happens.
Jim
JimDempseyAtTheCove:
Specifically declare an interface to the subroutine. See what happens.
Error LNK2001: Unresolved external symbol "_SUB1" or something along these lines. Jon already said that the routines are in the same module; it's apparently not an interface problem, but stack corruption or a silent array-out-of-bounds access. Are there any assumed-size array arguments involved (perhaps in callees of Sub1)? They might produce a silent out-of-bounds violation.
I had the same problem a while back, and it turned out to be a missing argument for one of the calls in a seemingly unrelated part of the code. I had made an extensive change of architecture throughout the solution, which required numerous simple changes to the calling sequence for some of the routines. I missed one, and it took a while to discover it. Debugging, automatic interface options, etc. were very little help.
If you don't find the problem it will hit your users later. I, too, thought of releasing it with a release compilation (i.e., with debug info generated, etc.) but fortunately found the problem with more testing. I had to put write statements throughout to narrow it down.
If you see the LNK2001 error after declaring the interface, then you may have multiple references to SUB1 that are not using the interface, and a call at one of these references may be stomping on the stack data (or corrupting the code). Or you have multiple SUB1s. If there are no other references to SUB1, then you may have made an error in declaring or using the interface.
Inserting and using an interface can be as effective a diagnostic tool as inserting WRITEs here and there in the code.
Another potential area to look at is whether the compiler is inlining SUB1. I had a similar problem way back in V8.nn, where the stack-relative references in the inlined code were skewed.
Lacking anything further, one diagnostic to insert in SUB1 is a WRITE on entry that displays LOC(STAT), i.e. prior to the test for PRESENT. Without the interface, the FASTCALL might leave junk in R9 if he is running on x64, or a junk pointer on the stack. With a junk pointer, PRESENT(STAT) will return TRUE and STAT will then be written to some arbitrary location.
Jim
My point was that, if he declares the INTERFACE to Sub1, he will get an LNK2001, as it will refer to an external (non-module) Sub1 which doesn't exist.
Your original point likely was that Jon should move Sub1 out of the module and write the INTERFACE block for it instead. However, it didn't get through, at least not for me. In any case, that approach is a good shot, as it will cope with the possibility that the compiler erred in inlining or some other kind of IPO. But I'd still prefer to see the full code from Jon.
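To make the module-versus-INTERFACE distinction concrete, here is a minimal sketch of a module procedure with an OPTIONAL argument. A procedure contained in a module already has an explicit interface wherever the module is used, so PRESENT(STAT) is reliable with or without the fourth argument, and a separate INTERFACE block for Sub1 would instead name a different, external symbol. The module name `time_utils`, the test string, and the placeholder body are hypothetical; the real computation in Sub1 is elided in the original post.

```fortran
module time_utils
  implicit none
contains
  subroutine Sub1(T, JD, MM, STAT)
    character(len=*), intent(in)   :: T
    integer, intent(out)           :: JD, MM
    integer, optional, intent(out) :: STAT
    ! Placeholder results only; the real JD/MM computation is not shown
    ! in the thread.
    JD = len_trim(T)
    MM = 0
    ! Safe only because callers see the explicit interface via USE.
    if (present(STAT)) STAT = 0
  end subroutine Sub1
end module time_utils

program demo_optional
  use time_utils
  implicit none
  integer :: jd, mm, stat
  call Sub1('example', jd, mm)         ! STAT omitted
  call Sub1('example', jd, mm, stat)   ! STAT supplied
  write (*,*) jd, mm, stat
end program demo_optional
```

If Sub1 were an external procedure called without an explicit interface, omitting STAT would leave whatever happens to be in that argument slot, which is the junk-pointer scenario described earlier in the thread.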
Jugoslav and Jim,
I was able to reduce the code to a minimum and isolate the problematic part. However, I am hesitant to post the code on this forum since I am using a third-party static library and I am not sure about their distribution policies. If you are interested in taking a look at it, I can send you the code privately.
Jim mentioned earlier that he had a similar problem in a mixed-language programming project. This made me suspicious because, I think, the third-party static library I am using was developed using mixed-language programming. But still, it seems that the problem occurs before the call to the function in the library.
Anyways, let me know if you are interested in seeing the code.
Thanks,
Jon
Okay, after reading the Help pages I am starting to think that it is the static library I am using that may be causing the access violation. With my "reduced code" I am getting another message after the access violation: "Stack trace terminated abnormally.". I should definitely contact the developers of the library.
But here is what surprises me: in my reduced code, when I print out the values of the arguments before calling the function from the library, I get correct values and the program runs. When I comment out the WRITE statements, the program crashes. Same compiler settings in both cases... Is there an explanation for this?
Jon
The static library is the most likely culprit -- either you got the calling wrong, or it's buggy itself. You can send me the code if you wish (jdujic#uns.ns.ac.yu), but without the library's source I doubt I could do anything smart.
Note that the default calling convention in IVF has changed relative to CVF: from stdcall with string lengths passed immediately after each string argument to cdecl with string lengths passed at the end of the argument list. In addition, "variables default to AUTOMATIC" is now the default.
I use the following trick for stack pointer checking (it is not conclusive, though, as out-of-bounds access can corrupt the stack while keeping the stack pointer intact):
! EXCheckStack.f90 - returns the address of a local variable
! in order to verify that user-defined callback function did
! not screw the stack by argument number mismatch or calling
! convention mismatch. Must not be optimized.
RECURSIVE INTEGER FUNCTION EXCheckStack()
INTEGER, AUTOMATIC:: i
EXCheckStack = LOC(i)
END FUNCTION EXCheckStack
In the calling code:
iESP = EXCheckStack()
call SuspiciousFunction(...)
IF (EXCheckStack() .NE. iESP) THEN
!Error
END IF
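For readers who want to try the stack-pointer check on a 64-bit target, here is a self-contained sketch of the same idea. It assumes the non-standard LOC extension (available in Intel Fortran and gfortran) and uses a pointer-sized integer, since a default INTEGER result would truncate a 64-bit address. The names `stack_check`, `ex_check_stack`, and `well_behaved` are hypothetical; `well_behaved` merely stands in for the suspicious library call.

```fortran
module stack_check
  use, intrinsic :: iso_c_binding, only: c_intptr_t
  implicit none
contains
  ! Returns the address of a local variable as a proxy for the stack
  ! pointer. LOC is a vendor extension, not standard Fortran.
  recursive function ex_check_stack() result(addr)
    integer(c_intptr_t) :: addr
    integer, volatile   :: i   ! volatile: keep the local in memory
    i = 0
    addr = loc(i)
  end function ex_check_stack

  ! Stand-in for the routine under suspicion.
  subroutine well_behaved(n)
    integer, intent(inout) :: n
    n = n + 1
  end subroutine well_behaved
end module stack_check

program demo_stack_check
  use stack_check
  implicit none
  integer(c_intptr_t) :: before, after
  integer :: n
  n = 0
  before = ex_check_stack()
  call well_behaved(n)
  after = ex_check_stack()
  if (after /= before) then
    write (*,*) 'stack pointer moved across the call'
  else
    write (*,*) 'stack pointer unchanged across the call'
  end if
end program demo_stack_check
```

As noted above, an unchanged address is not proof of a healthy stack, and aggressive inlining can defeat the check, so the module holding `ex_check_stack` is best compiled without optimization.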
Jon
Another thing to look at is whether someone has mixed up pass by VALUE and pass by REFERENCE.
If you are not explicitly specifying which, then the calling convention uses some rules to pick one, and sometimes what you get isn't what you want.
In the area of argument passing, also look at passed strings and arrays for overruns. Expect heisenbug experiences when you insert diagnostics to look at the behavior.
In addition to Jugoslav's stack pointer tester, you can also hash some depth of the stack that the library shouldn't touch. E.g., at the beginning of MAIN get the address of the stack pointer and save it to a global (module) variable; then, just before where you suspect the corruption, read the current stack pointer and produce a hash code from the data between the two pointers. If the contents change (the hash differs), assume something is walking out of bounds. Then, if corruption occurs, hopefully you can figure out whose code is responsible.
Good luck in your bug hunting.
Jim
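The hashing idea can be sketched as follows. To stay safe, this example hashes a small guard array rather than the real stack, and simulates the corruption by hand. The names `region_hash`, `hash_region`, and `guard`, and the particular checksum, are hypothetical choices; the address arithmetic relies on the common but not standard-guaranteed practice of using TRANSFER between a C pointer and a pointer-sized integer.

```fortran
module region_hash
  use, intrinsic :: iso_c_binding
  implicit none
contains
  ! Simple multiplicative checksum over the bytes in [lo, hi).
  function hash_region(lo, hi) result(h)
    integer(c_intptr_t), intent(in) :: lo, hi
    integer(c_int64_t) :: h
    integer(c_int8_t), pointer :: bytes(:)
    integer(c_intptr_t) :: n, k
    n = hi - lo
    h = 17
    if (n <= 0) return
    ! View the raw address range as an array of bytes.
    call c_f_pointer(transfer(lo, c_null_ptr), bytes, [n])
    do k = 1, n
      h = mod(h*31 + iand(int(bytes(k), c_int64_t), 255_c_int64_t), &
              1000000007_c_int64_t)
    end do
  end function hash_region
end module region_hash

program demo_region_hash
  use region_hash
  implicit none
  integer(c_int32_t), target :: guard(8)
  integer(c_intptr_t) :: lo, hi
  integer(c_int64_t)  :: h0, h1
  guard = 0
  lo = transfer(c_loc(guard), lo)
  hi = lo + size(guard) * (storage_size(guard) / 8)
  h0 = hash_region(lo, hi)
  guard(3) = 42               ! simulate an out-of-bounds write
  h1 = hash_region(lo, hi)
  if (h1 /= h0) write (*,*) 'guarded region was modified'
end program demo_region_hash
```

In real use, `lo` and `hi` would come from two stack addresses captured as described above, and the hash would be taken before and after the suspect call.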
Jim and Jugoslav,
Thanks for all your brainstorming on this issue. I chose the simplest approach and contacted the developers of the library. It turned out that they were comfortable with giving me the source code. Now I have to pull my hair out for the next few days trying to compile the library myself.
Jugoslav's simple code showed that the culprit was indeed the library. It turns out that they used IVF 9.1.3291 with VS2005 in compiling the library. I am using IVF 9.1.34 with VS2003. I sent them a reduced code demonstrating the problem along with my VS solution files. The guy over there said that when he opened up the solution, VS converted it to 2005, he re-compiled it, and the code ran fine. Now this brings another question to mind: I assume that the VS version should not affect the final binary code generated (or does it?), since this is a task for the Fortran compiler. Then does this mean that maybe there is a backward compatibility issue with the compiler?
Jon