I have a subroutine that allocates a number of arrays to temporarily store some data for output. In the RELEASE builds the code crashes with no traceback when executing an ALLOCATE. A check of the executable shows that the last error was a STATUS_HEAP_CORRUPTION. Lots of other arrays have already been allocated and used by that point. In the current code the size of the arrays being allocated is only 1. I've added write statements to determine where the crash occurs, but of course the write statements seem to change exactly which array causes it.
This is a very large code, using 6-8 DLLs that also allocate memory. I'm using the 2020 compiler now. The previous version of the code, built with 2019, didn't have any issues in this particular routine, and the routine hasn't changed in a while. Default integers and reals are 64-bit.
Any suggestions on how to find the source of the crash?
The problem isn't the ALLOCATE statement - that's just where the error becomes visible. Something earlier in the program wrote in a place it shouldn't. The only way to resolve this is to debug the application. What I usually do is start disabling portions of code before the point of error to see what, if anything, makes the error go away.
The first thing I would do is enable all the compiler checks (warnings and run-time checks). This may, unfortunately, cause the problem to move or hide. Data corruption is a tricky thing to track down.
Thanks Steve - You are right, debugging changes the behavior, and I have all the checks that work active. I will say it is frustrating that it crashes seemingly because something wrote where it shouldn't have, with zero indication of what, where, or when that write happened. Any clue at all would probably help to figure out where it occurred. I've had a case open in support for almost a year now about getting the debugger to show data values of many of my arrays. Resolving that case would make my life much easier because it is most likely directly related to the cause of this error.
Try the selective disabling of code. If the error happens after some number of calls, write a log to dump intermediate state. I have slogged through many cases like this over the years - always found it, but sometimes it took days. You have to be methodical.
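One way to make the "log intermediate state" idea concrete is to pad a suspect array with guard elements and check them at known points in the run. The sketch below is only an illustration, not code from the application discussed here: the names work, sentinel and check_guards are hypothetical, and it only catches an overrun that lands on the guard element itself, not one that jumps further out.

```fortran
program guard_demo
  implicit none
  integer, parameter :: n = 4
  real, parameter :: sentinel = -9999.0   ! hypothetical marker value
  real, allocatable :: work(:)
  integer :: i

  ! Allocate one extra element at each end to act as a guard.
  allocate(work(0:n+1))
  work = 0.0
  work(0) = sentinel
  work(n+1) = sentinel

  ! Step 1: a correct loop that stays inside 1..n.
  do i = 1, n
    work(i) = real(i)
  end do
  call check_guards(work, 'step 1')

  ! Step 2: a loop with an off-by-one upper bound (n+1 instead of n).
  ! The stray write lands on the guard, so it is still inside the allocation.
  do i = 1, n + 1
    work(i) = real(10 * i)
  end do
  call check_guards(work, 'step 2')

contains

  ! Report whether either guard element still holds the sentinel value.
  subroutine check_guards(a, label)
    real, intent(in) :: a(0:)
    character(len=*), intent(in) :: label
    if (a(lbound(a, 1)) /= sentinel .or. a(ubound(a, 1)) /= sentinel) then
      write(*, '(2a)') 'guard overwritten after ', label
    else
      write(*, '(2a)') 'guards intact after ', label
    end if
  end subroutine check_guards

end program guard_demo
```

Calling the check after each suspect block of code narrows the window in which the bad write happened, which is the same bisection as disabling code, just without changing what runs.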
I might suggest Intel Inspector XE, which has a feature to check for memory corruption, but it has a hard time with Fortran code and gives many false positives. Still, once in a while, it turns up something. You can get a 30-day free trial if you don't already have the Professional or Cluster edition.
Appreciate the suggestions. I have Professional, so I have access to the important tools. I did try the Inspector on our code last year, but it had way too many problems dealing with our code and I couldn't get anything useful from it.
I always try to be methodical as it does help to narrow down the choices. In this case I decided to do a code walk just looking at indices for new features. As luck would have it, after looking into only a few percent of the new code I found a typo that looks like the issue here. It caused an array to be written off the back, so I will have to test some more to know for sure.
Thanks again. Hope you are enjoying retirement.
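For anyone chasing a similar index typo, a temporary checked store around the suspect writes can report the offending index instead of letting it corrupt the heap. This is a minimal sketch under assumptions: buf, store, site and nbad are hypothetical names, and the loop bound of 4 stands in for the kind of typo described above.

```fortran
program checked_store_demo
  implicit none
  real, allocatable :: buf(:)
  integer :: i, nbad

  allocate(buf(3))
  buf = 0.0
  nbad = 0

  ! Typo-style bound: the loop should stop at 3.
  do i = 1, 4
    call store(buf, i, real(i), 'fill loop', nbad)
  end do
  write(*, '(a,i0)') 'rejected writes: ', nbad

contains

  ! Write val to a(idx) only if idx is in range; otherwise log and count it.
  subroutine store(a, idx, val, site, nbad)
    real, intent(inout) :: a(:)
    integer, intent(in) :: idx
    real, intent(in) :: val
    character(len=*), intent(in) :: site
    integer, intent(inout) :: nbad
    if (idx < lbound(a, 1) .or. idx > ubound(a, 1)) then
      nbad = nbad + 1
      write(*, '(3a,i0,a,i0,a,i0)') 'out-of-range index in ', site, ': ', &
        idx, ' not in ', lbound(a, 1), ':', ubound(a, 1)
      return
    end if
    a(idx) = val
  end subroutine store

end program checked_store_demo
```

Unlike compiler bounds checking, this stays active in a RELEASE build and can be limited to the new code under suspicion, at the cost of editing the call sites by hand.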