Intel® Fortran Compiler

Huge heap needed

Bruce_Weaver
Beginner
495 Views

Hi,

I need to keep very large arrays in memory. The computer has 132 GB, and I need to keep a 10-20 GB array in memory because it is in constant use. I think the array should go on the heap, since I know the stack is limited. What is the heap limit, and do I use heap-arrays for this?

thanks

0 Kudos
23 Replies
andrew_4619
Honored Contributor I
444 Views

Make the array allocatable and it will go on the heap.

jimdempseyatthecove
Black Belt
444 Views

Keep in mind that you will need to use INTEGER(8) variables and literals when allocating and indexing the array. An integer(8) literal requires a kind suffix:

12345678901_8
12345678901_C_INT64_T ! from USE ISO_C_BINDING
...

Additionally, if you have problems, you may need to check your page file size limit.

Jim Dempsey
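
A minimal sketch of the point about 64-bit variables and literals. It assumes a compiler that provides the INT64 kind from ISO_FORTRAN_ENV (a portable spelling of INTEGER(8) on Intel Fortran). The module and function names are illustrative only and do not come from the thread, and the array is kept small so the example runs on any machine.

```fortran
module int64_index_demo
  use, intrinsic :: iso_fortran_env, only: int32, int64
  implicit none
contains
  ! Allocate n elements, fill them with a loop whose counter is 64-bit,
  ! and return the element count as a 64-bit integer.
  function allocate_and_count(n) result(nelem)
    integer(int64), intent(in) :: n
    integer(int64) :: nelem
    integer(int64) :: i
    real, allocatable :: a(:)

    allocate(a(n))
    do i = 1_int64, n
      a(i) = 1.0
    end do
    ! SIZE returns a default integer unless KIND= is given.
    nelem = size(a, kind=int64)
  end function allocate_and_count
end module int64_index_demo

program demo
  use int64_index_demo
  implicit none
  ! This value is out of range for a 32-bit integer, hence the kind suffix.
  integer(int64), parameter :: big = 12345678901_int64
  integer(int64) :: n

  n = allocate_and_count(1000000_int64)   ! kept small so it runs anywhere
  print *, 'largest 32-bit integer:', huge(0_int32)
  print *, '64-bit literal:        ', big
  print *, 'elements allocated:    ', n
end program demo
```

The same pattern applies to a real multi-gigabyte array: the bound passed to ALLOCATE, the loop counter, and any SIZE query all need to be 64-bit.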
 

Steve_Lionel
Black Belt Retired Employee
444 Views

You may also need to use /heap-arrays if temps are created. It should go without saying that you also need to build for x64.

Bruce_Weaver
Beginner
444 Views

I asked the question because I get a terse 'can't do that because I'm out of memory' message after it executes a bit using the array as the program blows off.  It is because of the large array as everything works well if I don't use it or limit its size to a much smaller array.  It is allocatable, in a module, and temps are not created (I allocate it, fill it, & then only access it in the main program.).  With all that memory, I should not page (although I have had problems in the past with a program that wouldn't work w/o using paging even though there was plenty of memory).  I gather that there are no limits to heap size?

jimdempseyatthecove
Black Belt
444 Views

>>and temps are not created (I allocate it, fill it, & then only access it in the main program.). 

Depending on how you fill it, a statement of the form

YourHugeArray = expression

may generate a temporary (of size(YourHugeArray)). This is especially true if you use

YourHugeArray = ArrayExpression

as opposed to:

do I=1,YourHugeSize
   YourHugeArray(I) = ScalarExpression
end do

Jim Dempsey

Steve_Lionel
Black Belt Retired Employee
444 Views

Bruce Weaver wrote:

  I gather that there are no limits to heap size?

Yes, there is a limit, but you're unlikely to hit it.

What are the virtual memory addressing limits on 32-bit and 64-bit Windows?

JohnNichols
Valued Contributor II
444 Views

You say there is no limit, but with the Water Supply program I found the limit on the maximum size of a program, lol; even Windows has an end. Mind you, it is a big one, but it does have its limitations, and humans need to know their limitations.

Bruce_Weaver
Beginner
444 Views

Hi John,

I have not been able to solve the large file problem, now just 0.5 GB.  I will end up, in a few months, with data files in the 20 GB file size domain.  Even though it is in an allocatable array, it interferes with other parts of the program that normally work.  Quantitatively, what limits have you encountered?

Jim: I just read the large array in from a file. In the program, it is only read by the threads throughout the simulation. For the purposes of this program the data are (physically) static. All the computations needed to create the values of the array, which are quite substantial, are done beforehand.

--Bruce

jimdempseyatthecove
Black Belt
444 Views
module staticData
  real, allocatable :: hugeArray(:)
  ...
end module staticData
======== new file =========
program yourProgram
  use staticData
  ...
  allocate(hugeArray(yourHugeSize))
  call fillWithData(hugeArray)
  ...
end program yourProgram
==== new file ====
subroutine foo(...)
   use staticData
   ...
   value = hugeArray(I)
  ...
end subroutine foo

When you need to have static data, you are effectively saying you need to have globally visible data (to those procedures that need to see it). To accomplish this, place the huge allocatable arrays in a module, allocate and initialize them if necessary at program start, then USE them.

Note, in older programs this (array placement and size) was typically done through a named COMMON. Due to the 2 GB size restriction on linker sections (e.g. .data), you cannot use named COMMONs for huge arrays. Placing these arrays in a module (and allocating/initializing them at program start) resolves this problem.

Jim Dempsey

LRaim
New Contributor I
444 Views

You can also have:
……

real (kind=xx), dimension(:), pointer :: yourHugeArray 
common/ChugeArray/yourHugeArray

and allocate the array as soon as you can calculate the size required.

 

 

JohnNichols
Valued Contributor II
444 Views

Dear Bruce:

If you look at the picture attached, which is also shown better in a Magni post on this board, you can see the problem with large arrays and solving them.  mecej4 made Magni work in its current form, the issue became large dense arrays with a rank of 30,000. Magni is a water supply analysis program, which works quite nicely. The issue is the solution time, the only solver I have found that gives reasonable solution times for this size of problem - that is easy to use - is Pardiso.  The difference is 1000 fold on a Dell computer between the two main solvers I use. See graph. 

Pardiso requires a particular packing strategy, and Magni has the routines to do this. If you like, send me a private message and I will send you the code. mecej4 and Jim Dempsey solved that little challenge.

Usually you find a solution -- it just takes a while.  

And sometimes you feel like an idiot, but that is ok. 

John

 

Bruce_Weaver
Beginner
444 Views

Hi John,

 

I think this is a different problem. My large table just stores physics values calculated by another program, which depend on a lot of complex physics and measured physical constants. Since these values are static (as a function of a couple of indices into the table), no computation is done on the table values; they are just inputs to calculations being done by the main program. My problem is strictly an issue of the program not choking on the large (many GB) arrays: 4 columns, 560560001 rows of single precision.

--Bruce
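
For reference, the size of the table described above can be worked out directly: 4 x 560,560,001 = 2,242,240,004 elements, which is past the 32-bit integer limit of 2,147,483,647, and at 4 bytes per element comes to 8,968,960,016 bytes (about 8.97 GB). The sketch below repeats that arithmetic in 64-bit integers. The module and function names are made up for illustration, and the 4-byte element size is an assumption about default REAL.

```fortran
module table_footprint
  use, intrinsic :: iso_fortran_env, only: int32, int64
  implicit none
contains
  ! Total element count, computed in 64-bit arithmetic.
  pure function element_count(ncols, nrows) result(n)
    integer(int64), intent(in) :: ncols, nrows
    integer(int64) :: n
    n = ncols * nrows
  end function element_count

  ! True when the count is beyond what a 32-bit index can address.
  pure function needs_int64_index(n) result(flag)
    integer(int64), intent(in) :: n
    logical :: flag
    flag = n > int(huge(0_int32), int64)
  end function needs_int64_index
end module table_footprint

program demo
  use table_footprint
  implicit none
  integer(int64), parameter :: bytes_per_real = 4_int64  ! assumes 4-byte REAL
  integer(int64) :: n

  n = element_count(4_int64, 560560001_int64)
  print *, 'elements:          ', n
  print *, 'needs 64-bit index:', needs_int64_index(n)
  print *, 'bytes:             ', n * bytes_per_real
end program demo
```

Note that each dimension on its own fits a 32-bit integer; it is only the product, and therefore any linear index or SIZE of the whole array, that needs 64 bits.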

JohnNichols
Valued Contributor II
444 Views

Bruce 

The largest array I could create was a dense array of about 25000 by 25000; then I ran into the Windows physical limits, and there are no workarounds.

The square root of 4 times 560560001 is about 47,000, so you are above the addressing limit. I think you may as well climb Everest on the coldest day of the year.

Do what we did in the 70s: store only limited data in memory and leave the rest on the SSD. It takes 6 seconds to read 6 million double precision numbers on an Intel i3 with an SSD (I just did it), so unless you are desperate for speed, you should try a different approach.

Or store the data in a SQL server and just read it as needed, although the SQL server will choke on this array.

Bruce_Weaver
Beginner
444 Views

Hi John,

 

I'm quite confused by this. Steve's article claims that an allocated array can be up to 8TB, which does not seem to be limited by Windows 7 Professional or above. It looks like you ran into an indexing limit (see Jim's recommendation to go to integer(8)), which would be needed beyond the integer(4) limit of 2,147,483,647 elements in an array; that pretty much matches the numbers you came up with. Are you saying that going to integer(8) didn't help?

Actually, that is not my problem yet as I am only addressing 23,855,448 (4* 5963862) total elements.

Steve & Jim:  two questions: shouldn't the heap automatically expand as needed?

The allocation is allocate(array(xmin:xmax,ymin:ymax,2)), which I take to have delta x * delta y * 2 elements. I assume the memory map of this becomes a vector requiring an index of maximum magnitude delta x * delta y * 2?

I have several other multidimensional arrays that are allocated, the largest of which is about 6e6 elements. Could all these allocatable arrays be interfering with each other in some way? The total size is much less than my 126 GB of memory.

I'm including a Ram Map that might suggest something to someone.

thanks

Bruce_Weaver
Beginner
444 Views

addendum:  the map is of a version with a much smaller data set.  I could try to capture a map before it blows off in execution with a full sized array.

jimdempseyatthecove
Black Belt
444 Views

>>Steve's article claims that an allocated array can be up to 8TB; which seems not to be limited by win 7 professional or above

This is a virtual address capability and is subject to available system resources. The physical limitation is the available space in your page file, not necessarily the available space in RAM. Depending on the O/S, the heap may return a successful allocation of addresses while the actual page file allocation is deferred until first touch. IOW an allocation may succeed (STAT=0), only to hit an out-of-memory error when you fill the array.

>> shouldn't the heap automatically expand as needed?

Yes it does, but it can still exceed the capacity of the page file. (see 1st answer)

In looking at your SASi2.txt file (process map of "smaller dataset"), it shows 24 of the thread stacks, each of ~600MB. While this isn't necessarily wrong, it may lead to memory fragmentation.

One suggestion: at program start, before you enter your first parallel region and before you do much work or initialization, perform your largest allocations and first touch (e.g. array=0.0). This will reduce memory fragmentation (and gaps in committed page file pages).

Jim Dempsey
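
A minimal sketch of the "allocate early, then first touch" suggestion. It assumes the array lives in a module, as in the staticData outline earlier in the thread; the module name early_allocation and the function allocate_and_touch are hypothetical. STAT= and ERRMSG= catch a failed allocation, but if the O/S defers committing pages, a failure during the first touch would still end the program, just at a predictable point near startup.

```fortran
module early_allocation
  use, intrinsic :: iso_fortran_env, only: int64, error_unit
  implicit none
  real, allocatable :: hugeArray(:)
contains
  ! Allocate hugeArray with n elements and touch every element.
  ! Returns 0 on success, otherwise the STAT value from ALLOCATE.
  function allocate_and_touch(n) result(ierr)
    integer(int64), intent(in) :: n
    integer :: ierr
    character(len=256) :: msg

    msg = ''
    allocate(hugeArray(n), stat=ierr, errmsg=msg)
    if (ierr /= 0) then
      write(error_unit, '(a)') 'allocation failed: ' // trim(msg)
      return
    end if
    ! First touch: writing every element forces the pages to be committed.
    hugeArray = 0.0
  end function allocate_and_touch
end module early_allocation

program demo
  use early_allocation
  implicit none
  integer :: ierr

  ! Do this before any parallel region or other significant allocation.
  ierr = allocate_and_touch(1000000_int64)   ! small size for illustration
  if (ierr /= 0) error stop 'could not reserve the large array'
  print *, 'allocated and touched', size(hugeArray, kind=int64), 'elements'

  ! ... parallel regions and smaller allocations would start here ...

  deallocate(hugeArray)
end program demo
```

Doing this once at startup means that an out-of-memory condition shows up immediately rather than partway through a long simulation.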

JohnNichols
Valued Contributor II
444 Views

 Steve's article claims that an allocated array can be up to 8TB

I can assure you that you will hit a maximum program size that Windows 10 will allow. It exists, as I found it, and Steve then explained what the problem was. There is a limit, and the one I found is not 8TB. I cannot remember the exact limit, but it is discussed in a post in this forum. Anything more than a 25000 by 25000 dense array (i.e. every cell holding a double) stops the program, and you get a physical space error or something similar. This is why Pardiso and sparse array packing allow much bigger theoretical arrays.

 

I would agree with Jim, he has a lot of experience -- I was shown to allocate the arrays as a group together and then set to all zero at the start of the program by the people from here that helped with my earlier program - it works a treat with my Monte Carlo program and if it is going to crash will crash at the allocation or zeroing

Good luck 

John

Steve_Lionel
Black Belt Retired Employee
444 Views

The article is listing Windows' hard limits on the various types of data. Allocatable arrays are one type of dynamic data, but the article does not say that an allocatable array can be 8TB. The point is that Windows has different limits for static, stack and dynamic data.

The actual limit you will reach is indeed constrained by available pagefile space. After all, the data has to live somewhere...

Bruce_Weaver
Beginner
444 Views

Computer memory is 128 GB, paging (although I don't know why I'd need it) set to 100GB. It is set on the drive I'm compiling on, not the C: drive. The current 'large' data file I'm using is only 252 MB. There are some other large allocatable files as well, but none larger than this one. The program runs for a while, then dies; it seems to be when it is trying to reference another allocatable data set. It only fails if I have loaded the 'large' set. If it were C code, I'd assume a memory leak.

I have had some trouble which I am now pretty sure stems from the Windows memory cache: failure depends on what version of the code (by the same name) I ran earlier. It seems to work OK if I clear the memory cache, but fails if I run the same problematic code twice. I wonder if having several large arrays in the heap is confusing something, perhaps in the memory cache, which I assume is completely under OS control. Is there a way in Fortran to clear what I am now clearing in RamMap? I've assumed that allocating arrays would defeat the OS memory cache.

thanks

jimdempseyatthecove
Black Belt
302 Views

Bruce,

There are a few factors that affect memory capacity issues on a Windows system.

1) Memory Leak. Usually caused by the program not freeing an allocation. In Fortran this can result from using POINTERs for allocation and forgetting to DEALLOCATE the pointer. It can also (remotely) be caused by a bug in the Fortran/C Runtime Library or the O/S. Memory leaks can be observed by watching your process Commit Size in Task Manager (*** see 2 for additional information).

2) Memory Fragmentation, which can be caused by an unfortunate choice of sequence for allocation and deallocation. While choosing the Low Fragmentation Heap can mitigate this to some extent, it cannot eliminate all cases of fragmentation. Fragmentation can be observed by watching your process Commit Size in Task Manager. To correct for this you can a) identify the culprits and reorder the allocate/deallocate calls (try to deallocate in reverse order of allocation), and/or b) restrict deallocation/re-allocation to only when an array needs to grow. *** Note, this may require you to disable Reallocate Left Hand Side and to code using explicit array slicing (Array(1:n) = ...). This may also require you to eliminate heap-generated array temporaries.

3) If your application uses a combination of C/C# and Fortran; and the C/C# code is threaded, and the Fortran uses OpenMP, be aware that the Fortran OpenMP system will instantiate a new thread pool when a different (formerly not used) thread ID is making the call. IOW do not fork/spawn a new thread and then call Fortran (many times).

4) If this problem occurs as a result of multiple program runs (as opposed to multiple iterations), then this could be a result of a) the O/S preventing the immediate re-use of previously used process memory for security reasons, or b) The O/S is maintaining a file cache when not needed or indexing directories when not desired, c) a different process used by your application (e.g. MySQL) builds a database without reclamation when items no longer are needed.

5) A 3rd party utility (i.e. RAMDISK) is being used and consuming all the RAM.

6) Your application uses Memory Mapped Files, and for some reason the file handles are not released and you run out of memory resources in the Fixed Page Pool.
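
To illustrate the note in item 2 about Reallocate Left Hand Side: the sketch below contrasts an assignment to an explicit section, which leaves the allocation alone, with a whole-array assignment, which under Fortran 2003 semantics deallocates and reallocates the destination when the shapes differ. The module and function names are illustrative. To my knowledge Intel Fortran exposes this behavior through /assume:realloc_lhs and /assume:norealloc_lhs; the thread itself does not name the option, so treat that as an assumption and check your compiler documentation.

```fortran
module realloc_demo
  implicit none
contains
  ! Section on the left-hand side: only elements 1..size(src) are written,
  ! so the destination keeps its original allocation.
  function size_after_section_assign(src) result(n)
    real, intent(in) :: src(:)
    integer :: n
    real, allocatable :: work(:)

    allocate(work(2 * size(src)))
    work = 0.0
    work(1:size(src)) = src
    n = size(work)
  end function size_after_section_assign

  ! Whole array on the left-hand side: with standard (Fortran 2003)
  ! semantics a shape mismatch causes work to be deallocated and
  ! reallocated to the shape of src.
  function size_after_whole_assign(src) result(n)
    real, intent(in) :: src(:)
    integer :: n
    real, allocatable :: work(:)

    allocate(work(2 * size(src)))
    work = 0.0
    work = src
    n = size(work)
  end function size_after_whole_assign
end module realloc_demo

program demo
  use realloc_demo
  implicit none
  real :: src(5)

  src = [1.0, 2.0, 3.0, 4.0, 5.0]
  print *, 'size after section assignment:    ', size_after_section_assign(src)
  print *, 'size after whole-array assignment:', size_after_whole_assign(src)
end program demo
```

With standard semantics enabled, the first function should report the original size and the second the size of src. In a long-running program, the hidden deallocate/allocate in the second form is the kind of heap traffic that can contribute to fragmentation.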

>> Is there a way in Fortran to clear what I am now clearing in RamMap?

Exit the process (application) and restart (loop in a batch script).

Jim Dempsey
