Dear community,
I was hoping you might be able to give me some ideas or suggestions.
I've been trying to debug an application involving two DLLs, one written in C and one in Fortran, both compiled using Visual Studio.
I found recently that under rare circumstances, I get a read access error and crash. The crash seems to happen just as we go into the Fortran part of the code.
The unusual thing about this error is that it seems to happen under rather whimsical conditions. It seems to be related to whether certain arrays have odd or even dimensions. It does not happen in Debug mode. It does not happen in release mode if I specify /assume:dummy_aliases, or by specifying BoundsCheck="true". However, /assume:dummy_aliases sometimes leads to a stack overflow error (maybe using too much memory), and of course BoundsCheck="true" makes the DLL too slow.
I assume that there may be a problem with allocation or pointers or array bounds somewhere but I'm not sure how to find its location and nature since it doesn't occur during Debug mode. It also didn't seem to occur when I had been using an older version of the compiler (I use Visual Studio 2005 and recently upgraded to Intel Fortran 12). (However, I tried adding /iface:cvf and that didn't help).
In case it matters, I am using the /Qopenmp-link:static option, which is now reported as deprecated.
Thank you *very* much if you can provide any ideas.
Thanks again,
John
6 Replies
John,
This is one of those "heisenbugs" (when you look for them they go away; observation changes the state).
At the time of the read access you will have two pieces of information at hand: a) the location being accessed, and b) the location of the instruction attempting to perform the access.
If the read access error is to a location below address 0x1000, then you are likely passing in a NULL pointer. In that case you may be able to identify the error by adding conditionally compiled argument validation code.
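A minimal sketch of the kind of argument validation Jim describes, written for the Fortran side of the C/Fortran boundary. The module name, the function name `args_look_valid`, and the argument list (one buffer address plus two extents) are all hypothetical, since the thread does not show the real interface. A compile-time constant stands in for a preprocessor macro so that the checks can be switched off for production builds.

```fortran
module arg_validation
  use, intrinsic :: iso_c_binding, only: c_ptr, c_int, c_associated
  implicit none
  private
  public :: args_look_valid

  ! Compile-time switch (assumption: a preprocessor macro would work too).
  ! Set to .false. for production builds so the checks can be dropped.
  logical, parameter :: validate_args = .true.

contains

  ! Returns .true. when the raw arguments handed over by the C caller
  ! look usable: a non-NULL buffer address and positive extents.
  logical function args_look_valid(buffer, n1, n2) result(ok)
    type(c_ptr), value :: buffer
    integer(c_int), value :: n1, n2

    ok = .true.
    if (.not. validate_args) return
    if (.not. c_associated(buffer)) ok = .false.   ! NULL pointer from C
    if (n1 <= 0 .or. n2 <= 0) ok = .false.         ! nonsensical extents
  end function args_look_valid

end module arg_validation
```

The exported routine would call `args_look_valid(buffer, n1, n2)` as its first statement and report or return early when the result is `.false.`, before any array is touched.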
If your read access error is due to an alignment fault (an SSE load of aligned data fails because the data is not aligned), then a compiler optimization or a code hint is in error (stating that the data is always aligned when in fact it is not).
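One way to test the alignment theory is to check the address of the data directly. The sketch below is an assumption about how such a check could look, not something from the thread; the module and function names are hypothetical. If the elements were 8-byte reals, an odd leading extent would put every other column off a 16-byte boundary, which might fit the odd/even dependence John mentions, but that is speculation.

```fortran
module alignment_check
  use, intrinsic :: iso_c_binding, only: c_ptr, c_intptr_t
  implicit none
  private
  public :: is_aligned

contains

  ! .true. when the address held in addr is a multiple of boundary bytes
  ! (16 is the boundary required by aligned SSE loads).
  logical function is_aligned(addr, boundary) result(ok)
    type(c_ptr), intent(in) :: addr
    integer, intent(in) :: boundary
    integer(c_intptr_t) :: raw

    raw = transfer(addr, raw)   ! reinterpret the address as an integer
    ok = modulo(raw, int(boundary, c_intptr_t)) == 0
  end function is_aligned

end module alignment_check
```

A call such as `is_aligned(c_loc(x(1, j)), 16)` inside a diagnostic build would show whether column `j` starts on a 16-byte boundary; `x` needs the `target` attribute for `c_loc` to be legal.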
If you do not know the source code line where the error occurred, you will have to do some sleuthing to associate the address of the instruction attempting the access with the source line in your code. If you do know the source code line where the error occurs, skip the next paragraph.
Set a dummy breakpoint in the DLL at a known location that will trip, and run to the break. In the Disassembly window, set the next statement at or near the location where the error is generated, then press the VS fat yellow arrow; this should synchronize the source view windows. If it doesn't, you may need to issue a Step from the Disassembly window (which may crash, but you do not care). After the Step, the source code window should reflect the position in the DLL where the error occurs. Now you have the source line number where the error occurs.
Knowing the source line number, you should be able to insert defensive code that detects that the error is about to happen and then calls a subroutine (not optimized) that simply writes "bug found"; put a breakpoint in that subroutine. Re-run the program and cross your fingers that the bug is detected (as opposed to moving). When at the break, you can step out (or refocus the call stack) to the caller. Examine what you can (which can be difficult due to optimizations), and also note the call stack. If you suspect that the incoming arguments were already in error at the point where you detected the imminent bug, find where the call with the erroneous arguments was made from, devise a second test that checks whether the argument about to be passed is in error, and have it call your "bug found" routine. Re-run to trip the error one call level up. Repeat this process, moving the bug detection code to outer layers of the call stack, until you get your "ah ha" moment where everything is clear as to what is going on.
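The "bug found" trap could be sketched roughly as follows. The names `bug_trap`, `bug_found`, `index_in_range`, and `bug_count` are hypothetical, and the index-range test is only a stand-in for whatever condition turns out to precede the crash. The assumption is that this module sits in its own source file compiled without optimization (for example with /Od), so the call is not inlined and a breakpoint inside `bug_found` stays usable.

```fortran
module bug_trap
  use, intrinsic :: iso_fortran_env, only: error_unit
  implicit none
  private
  public :: bug_found, index_in_range, bug_count

  integer :: bug_count = 0   ! number of times the trap has fired

contains

  ! Landing pad for a breakpoint: set the breakpoint on the write below.
  subroutine bug_found(where)
    character(len=*), intent(in) :: where
    bug_count = bug_count + 1
    write (error_unit, '(2a)') 'bug found: ', where
    flush (error_unit)
  end subroutine bug_found

  ! Defensive test: returns .false. and fires the trap when idx lies
  ! outside lo..hi, so the caller can skip the access that would fault.
  logical function index_in_range(idx, lo, hi, where) result(ok)
    integer, intent(in) :: idx, lo, hi
    character(len=*), intent(in) :: where
    ok = (idx >= lo .and. idx <= hi)
    if (.not. ok) call bug_found(where)
  end function index_in_range

end module bug_trap
```

Moving the test "one call level up", as described above, then amounts to placing another `index_in_range` call (with a different `where` label) in the caller, just before the suspect arguments are passed down.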
Jim Dempsey
Thank you so much! I will start working on this right away.
Thanks again,
John
An alternate way of locating the source line in the DLL:
Place a breakpoint in your main program after the first call to the DLL (the DLL is now loaded).
Now open the Disassembly window and place a breakpoint at the address of the errant instruction (you may have to press Ctrl-G and then enter the hex address to go to it).
Run to break.
This run to break will not necessarily present the circumstances under which the error occurs (that may be hundreds or thousands of calls later). However, the break should present you with a source-synchronized view (assuming debug info is included in the DLL). Now you have the source line, and you can insert the diagnostic code.
Jim Dempsey
I think I "sort of" found the location of the crash. I couldn't do anything with breakpoints because including enough debug information to use breakpoints seemed to be enough to put the Heisenbug back into hiding. But I was able to find the line where the crash took place, basically by having the program write a message to disk indicating its location after every few lines of code and then finding where it died.
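The write-a-message-to-disk approach can be sketched as below. John's actual logging code is not shown in the thread, so the module name, the routine name `trace`, and the file name `trace.log` are hypothetical. The file is opened in append mode and closed again on every call, on the assumption that this hands each line to the operating system before the next statement has a chance to crash.

```fortran
module trace_log
  implicit none
  private
  public :: trace

  character(len=*), parameter :: trace_file = 'trace.log'  ! hypothetical name

contains

  ! Append one marker line and close the file immediately, so the line
  ! is not left sitting in a program-side buffer if the process dies.
  subroutine trace(marker)
    character(len=*), intent(in) :: marker
    integer :: unit, ios

    open (newunit=unit, file=trace_file, status='unknown', &
          position='append', action='write', iostat=ios)
    if (ios /= 0) return        ! tracing must never be the cause of a failure
    write (unit, '(a)') marker
    close (unit)
  end subroutine trace

end module trace_log
```

Calls such as `call trace('before block copy')` are sprinkled every few lines; after a crash, the last line in the file brackets the failing statement. Opening and closing on every call is slow, so this belongs in a diagnostic build only.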
It was a line involving some information being copied in a large dynamically allocated array.
anObject%aMember(:,:,:,a,b,c) = anObject%aMember(:,:,:,1,1,1)
I was able to turn the bug off by replacing the line above with elementwise copying inside a loop over the first three indices. So whatever memory or boundary issue was triggered by the awkwardness of copying an array to an array didn't happen.
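For reference, the elementwise replacement might look something like this. It is a sketch rather than John's actual code: the routine name is hypothetical, and the element type (default `real`) and the bare rank-6 dummy argument are assumptions, since the thread shows only the one-line assignment on a derived-type member.

```fortran
module block_copy
  implicit none
  private
  public :: copy_first_block

contains

  ! Copies arr(:,:,:,1,1,1) into arr(:,:,:,a,b,c) one element at a time,
  ! so the assignment itself never needs an array temporary.
  subroutine copy_first_block(arr, a, b, c)
    real, intent(inout) :: arr(:,:,:,:,:,:)
    integer, intent(in) :: a, b, c
    integer :: i, j, k

    do k = 1, size(arr, 3)
      do j = 1, size(arr, 2)
        do i = 1, size(arr, 1)   ! first index innermost (column-major order)
          arr(i, j, k, a, b, c) = arr(i, j, k, 1, 1, 1)
        end do
      end do
    end do
  end subroutine copy_first_block

end module block_copy
```

With the derived type from the post, the call would be `call copy_first_block(anObject%aMember, a, b, c)`.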
So I'm sort of happy, because our product seems to have stopped crashing. But I still feel uneasy because I know this is a Band-Aid and I still don't know what I did wrong to make the bug happen. Does this remind you of anything or give you any ideas of what I should look for?
Thank you very much!
Very sincerely,
John Dziak
My guess is that you are running out of thread stack. You can try increasing the "Stack reserve size" in the Linker > System property page and then try setting larger values for the environment variable KMP_STACKSIZE in a command window before running your program.
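If the stack theory is right, one plausible mechanism (an assumption, not confirmed anywhere in the thread) is that the compiler builds a temporary copy of the array section on the thread's stack when it performs the array assignment John describes. A way to probe this is to do the copy through an explicitly allocated temporary, which in practice lives on the heap. The module and routine names and the `real` element type are hypothetical.

```fortran
module heap_copy
  implicit none
  private
  public :: copy_block_via_heap

contains

  ! Same copy as arr(:,:,:,a,b,c) = arr(:,:,:,1,1,1), but staged through
  ! an allocatable temporary instead of a compiler-generated one.
  subroutine copy_block_via_heap(arr, a, b, c)
    real, intent(inout) :: arr(:,:,:,:,:,:)
    integer, intent(in) :: a, b, c
    real, allocatable :: tmp(:,:,:)

    allocate (tmp(size(arr, 1), size(arr, 2), size(arr, 3)))
    tmp = arr(:,:,:,1,1,1)      ! section -> explicit temporary
    arr(:,:,:,a,b,c) = tmp      ! explicit temporary -> destination
    deallocate (tmp)
  end subroutine copy_block_via_heap

end module heap_copy
```

If this version also runs cleanly in the Release build, that would be some evidence for the stack explanation. Intel Fortran's /heap-arrays option is another way to move such temporaries off the stack without changing the source.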
Thank you very much for your reply! I will look into this. Thanks again,
John