heap exhaustion

ifortman · ‎04-07-2010

Running the 64-bit ifort version 11.1 on Linux. Upon entering a subroutine, the following error is generated:

chemsor (time=240002.56659331446, dol=Internal Error: out of memory
Heap exhausted with 28332032 bytes.
Intel Debugger for applications running on Intel 64 caught signal "Aborted" (6).
This is an unexpected condition and may indicate the presence of a defect.
If you wish to report this, please include the stack trace that follows.
/lib64/libc.so.6 [0x34b80302d0]
/lib64/libc.so.6(gsignal+0x35) [0x34b8030265]
/lib64/libc.so.6(abort+0x110) [0x34b8031d10]
/usr/lib64/libstdc++.so.5 [0x2ac679d2f046]
/usr/lib64/libstdc++.so.5 [0x2ac679d2f083]
/usr/lib64/libstdc++.so.5 [0x2ac679d2f096]
/usr/lib64/libstdc++.so.5(__cxa_call_unexpected+0x4e) [0x2ac679d2ef4e]
/usr/lib64/libstdc++.so.5 [0x2ac679d2f3be]
/usr/lib64/libstdc++.so.5(_Znam+0x9) [0x2ac679d2f449]
/opt/intel/Compiler/11.1/069/bin/intel64/iidb(_ZN4Bits10initializeEmPKc+0x36) [0x6b4392]

etc

How can this be fixed ?

jimdempseyatthecove · ‎04-08-2010

1) Examine your code for memory leaks.
2) Examine your code for allocation and deallocation sequencing that promotes fragmentation of the heap.
3) Examine your code for runaway recursion that performs allocations.

Your stack trace is relatively small, so 3) is unlikely to be the problem.

Jim Dempsey

TimP · ‎04-08-2010

Does it respond to increases in stack limit in your shell, e.g. ulimit -s unlimited (or, if applicable, a system level stack limit)?

ifortman · ‎04-08-2010

I have been running with ulimit -s = unlimited

This problem happens at an allocate statement and does not happen with pgf90-compiled code. It also does not happen with sunF95-compiled code. In that case there is still a problem, however: calculations are not done properly in a different routine. I cannot track it down in the sunF95-compiled code because, for some reason, dbx seems unable to print automatic arrays.

Thanks for your suggestion

Ron_Green · ‎04-08-2010

What compiler options are you using?

I suspect you may just be running out of memory. How much physical RAM do you have, and do you have any estimate of your dynamic memory requirement? Do you run

vmstat 1

in a separate window while the program runs? It only has 1-second resolution, so if you allocate all your data rapidly it may miss the point where you start to allocate too much memory.

ron

jimdempseyatthecove · ‎04-08-2010

Try this for hunting down the problem

Insert statements into your code that emit a trace of the allocations to the console window (or stderr). Since the problem seems to vary with the compiler, I suspect uninitialized variables (causing overly large allocations). The allocation trace log may show something funny with the allocations.

Jim

ifortman · ‎04-09-2010

I have 12 GB of RAM, which I believe is more than on other machines on which I have run this program without a problem.

vmstat 1 yields the following:

r b swpd free buff cache si so bi bo in cs us sy id wa st
1 0 0 10337228 128484 1109788 0 0 0 0 1088 511 6 0 94 0 0
1 0 0 10337608 128484 1109788 0 0 0 0 1113 623 6 0 94 0 0
1 0 0 10337608 128492 1109780 0 0 0 80 1101 576 6 0 94 0 0
1 0 0 10337608 128492 1109788 0 0 0 0 1121 606 6 0 94 0 0
1 0 0 10337608 128492 1109788 0 0 0 0 1083 518 6 0 94 0 0
1 0 0 10338100 128492 1109788 0 0 0 0 1121 597 6 0 94 0 0
2 0 0 10336612 128504 1109412 0 0 0 17092 1244 564 6 0 93 0 0
2 0 0 10325200 128504 1109620 0 0 0 0 1115 2101 8 1 91 0 0
2 0 0 10317760 128504 1109620 0 0 0 0 1089 589 9 4 87 0 0
3 0 0 10221248 128572 1114336 0 0 616 0 1131 1062 12 3 84 0 0
3 0 0 10163836 128648 1161580 0 0 0 44 1099 632 16 3 81 0 0
2 0 0 10166316 128736 1174140 0 0 0 40 1159 752 13 0 87 0 0
2 1 0 10157512 128736 1174140 0 0 0 96992 1364 639 12 1 83 3 0

This includes the entry into the allocation block that triggers the error.

Top gives the following for resource use on a computer where it runs

virt res shr
18620 max 25 0 1908m 85m 1344 R 99.2 0.7 0:35.73 zeusmp.x

ifortman · ‎04-09-2010

Jim

Could you give me an example of how to emit a trace ?

Thanks

jimdempseyatthecove · ‎04-09-2010

real(8), allocatable:: array(:,:)
...
write(*,*) 'allocating array', nX, nY
allocate(array(nX, nY))

or variations on the theme

If you use FPP, you can additionally include __FILE__ and __LINE__ in your trace.
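
A slightly fuller sketch of the same idea, in case it helps. The program name, the values of nX and nY, and the ierr variable are made up for illustration; in real code the extents would come from the caller or from input. It writes the requested extents to stderr before allocating and checks stat= so that a failed allocation is reported by the program itself. The __FILE__/__LINE__ line assumes the file is run through the preprocessor.

```fortran
! Compile with preprocessing enabled so that __FILE__ and __LINE__ expand
! (for ifort: the -fpp option, or a .F90 file suffix).
program alloc_trace
  use, intrinsic :: iso_fortran_env, only: error_unit
  implicit none
  real(8), allocatable :: array(:,:)
  integer :: nX, nY, ierr

  ! Hypothetical sizes, set here only so the sketch is self-contained.
  nX = 100
  nY = 200

  ! Log the requested extents first: a garbage value from an
  ! uninitialized variable will be visible in the trace.
  write(error_unit,*) 'allocating array', nX, nY
  write(error_unit,*) '  at ', __FILE__, __LINE__

  ! stat= returns a nonzero code on failure instead of aborting.
  allocate(array(nX, nY), stat=ierr)
  if (ierr /= 0) then
     write(error_unit,*) 'allocate failed, stat =', ierr
     stop 1
  end if

  write(error_unit,*) 'allocated elements:', size(array)
  deallocate(array)
end program alloc_trace
```

If an extent in the log looks absurdly large or negative, the variable feeding it is the first thing to check.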

Jim

ifortman · ‎04-09-2010

Thanks for the information. It seems that your comment about uninitialized variables is significant. I used the -save option and that problem has vanished. I still have to check whether the calculation is proceeding properly.

Thanks for your suggestions.

jimdempseyatthecove · ‎04-09-2010

caution

-save may make the symptom go away, but it may not necessarily make the problem go away. It may hide an insidious problem.

Find out what the problem is, and be certain that -save is the correct action. It is better to explicitly mark with SAVE only those variables that require it, as opposed to making all variables SAVE.
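
A minimal sketch of what marking a single variable looks like, as opposed to compiling everything with -save. The routine count_calls and its variables are hypothetical and not taken from the program discussed here. Only the counter that really must persist between calls is declared SAVE; the other local is assigned before it is used, so it never relies on a leftover value.

```fortran
subroutine count_calls(n)
  implicit none
  integer, intent(out) :: n
  ! Explicit SAVE: this counter must keep its value between calls.
  integer, save :: ncalls = 0
  ! Ordinary local: undefined on entry, so assign it before use.
  integer :: scratch

  scratch = 1
  ncalls = ncalls + scratch
  n = ncalls
end subroutine count_calls

program save_demo
  implicit none
  integer :: i, n

  do i = 1, 3
     call count_calls(n)
  end do
  ! n holds the saved counter after three calls, i.e. 3.
  write(*,*) 'calls so far:', n
end program save_demo
```

Doing it this way documents which variables depend on persistence, and any other local that is read before being set remains a bug you can still find.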

Jim