I need to keep very large arrays in memory. The computer has 132 GB & I need to keep a 10-20 GB array in memory as it is in constant use. I think the array should be put on the heap, as I know the stack is limited. What is the heap limit & do I use heap arrays for this?
Keep in mind that you will need to use INTEGER(8) variables and literals when allocating and indexing the array. An integer(8) literal requires a kind suffix:
12345678901_C_INT64_T ! kind C_INT64_T from USE, INTRINSIC :: ISO_C_BINDING
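For example, a minimal sketch of such an allocation (bigArray and the size are placeholders), sized with a C_INT64_T count and checked with stat=/errmsg= so a failure isn't silent:

program allocDemo
    use, intrinsic :: iso_c_binding, only : c_int64_t
    implicit none
    real, allocatable :: bigArray(:)
    integer(c_int64_t) :: n
    integer :: istat
    character(len=128) :: emsg

    n = 3000000000_c_int64_t          ! above huge(1_4), so the kind suffix matters
    allocate(bigArray(n), stat=istat, errmsg=emsg)
    if (istat /= 0) then
        print *, 'allocation failed: ', trim(emsg)
        stop
    end if
end program allocDemo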
Additionally, if you have problems, you may need to check your page file size limit.
I asked the question because I get a terse 'can't do that because I'm out of memory' message after the program runs for a bit while using the array, and then it blows up. It is because of the large array: everything works well if I don't use it or limit it to a much smaller size. It is allocatable, in a module, and temps are not created (I allocate it, fill it, & then only access it in the main program). With all that memory, I should not page (although I have had problems in the past with a program that wouldn't work without paging even though there was plenty of memory). I gather that there are no limits to heap size?
>>and temps are not created (I allocate it, fill it, & then only access it in the main program.).
Depending on how you write
YourHugeArray = expression
expression may generate a temporary (of size(YourHugeArray)). This is especially true if you use
YourHugeArray = ArrayExpression
as opposed to:
YourHugeArray(I) = ScalarExpression
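For instance, a sketch of the difference (hugeArray, scale, and offset are placeholder names; whether the whole-array form actually creates a temporary depends on the compiler and the expression):

real, allocatable :: hugeArray(:)
real :: scale, offset
integer(8) :: i

! whole-array assignment: the right-hand side may be evaluated into a
! full-size(hugeArray) temporary before being copied into the array
hugeArray = hugeArray * scale + offset

! element-wise loop: assigns in place, no temporary required
do i = 1_8, size(hugeArray, kind=8)
    hugeArray(i) = hugeArray(i) * scale + offset
end do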
You say there is no limit, but with the Water Supply program I found the limit on the max size of a program; lol, even Windows has an end. Mind you, a big one, but it does have its limitations, and a human needs to know their limitations.
I have not been able to solve the large file problem, now at just 0.5 GB. In a few months I will end up with data files in the 20 GB range. Even though the data is in an allocatable array, it interferes with other parts of the program that normally work. Quantitatively, what limits have you encountered?
Jim: I just read the large array in from a file...in the program, it is just read by the threads throughout the simulation. For the purposes of this program the data are (physically) static. All the computations to create the values of the array, which are quite substantial, are done beforehand.
module staticData
    real, allocatable :: hugeArray(:)
    ...
end module staticData

======== new file =========

program yourProgram
    use staticData
    ...
    allocate(hugeArray(yourHugeSize))
    call fillWithData(hugeArray)
    ...
end program yourProgram

==== new file ====

subroutine foo(...)
    use staticData
    ...
    value = hugeArray(I)
    ...
end subroutine foo
When you need to have static data, you are effectively saying you need to have globally visible data (to those procedures that need to see it). To accomplish this, place the huge allocatable arrays in a module, allocate and initialize them if necessary at program start, then USE them.
Note, in older programs this (array placement and size) was typically done through a named COMMON. Due to the 2GB size restriction on linker sections (e.g. .data), you cannot use named COMMONs for huge arrays. Placing these arrays in a module (and allocating/initializing at program start) resolves this problem.
You can also have:
real (kind=xx), dimension(:), pointer :: yourHugeArray
and allocate the array as soon as you can calculate the size required.
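A sketch of that variant (the kind, the size, and the names are placeholders):

real(kind=8), dimension(:), pointer :: yourHugeArray => null()
integer(8) :: nNeeded

nNeeded = 2500000000_8               ! or computed at run time
allocate(yourHugeArray(nNeeded))     ! same heap allocation as an allocatable
...
deallocate(yourHugeArray)            ! unlike allocatables, pointer targets are not auto-deallocated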
If you look at the picture attached, which is also shown better in a Magni post on this board, you can see the problem with large arrays and solving them. mecej4 made Magni work in its current form; the issue became large dense matrices of order 30,000. Magni is a water supply analysis program, which works quite nicely. The issue is the solution time; the only solver I have found that gives reasonable solution times for this size of problem, and that is easy to use, is Pardiso. The difference is 1000-fold on a Dell computer between the two main solvers I use. See graph.
Pardiso requires a particular packing strategy, and Magni has the routines to do this; if you like, send me a private post and I will send you the code. mecej4 and Jim Dempsey solved that little challenge.
Usually you find a solution -- it just takes a while.
And sometimes you feel like an idiot, but that is ok.
I think this is a different problem. My large table just stores physics values calculated by another program, which depend on a lot of complex physics and measured physical constants. Since these values are static (as a function of a couple of indices into the table), no computation is done on the table values...they are just inputs to calculations being done by the main program. My problem is strictly one of keeping the program from choking on the large (many GB) arrays: 4 columns, 560,560,001 rows of single precision.
The largest array I could create is a dense array of about 25,000 by 25,000, and then I ran into the Windows physical limits, and there are no workarounds.
4 times 560,560,001 is about 2.24 billion elements (the square root of that is roughly 47,000, i.e. a 47,000 by 47,000 dense matrix); you are above the integer(4) addressing limit -- I think you may as well climb Everest on the coldest day of the year.
Do what we did in the 70s - only store limited data and leave the rest on the SSD. It takes 6 seconds to read 6 million double precision numbers on an Intel i3 with an SSD - just did it - so unless you are desperate for speed you need to try a different approach.
Or store the data in a SQL server and just read as needed -- although the SQL server will choke on this array.
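If you take the leave-it-on-disk route, a minimal sketch using an unformatted direct-access file (file name, record size, and nRecords are placeholders; note the units of recl are compiler dependent, bytes vs. 4-byte words):

real(8) :: block(1000000)            ! one record's worth of values
integer :: u, irec, nRecords

open(newunit=u, file='bigdata.bin', access='direct', &
     form='unformatted', recl=8*size(block), status='old')
do irec = 1, nRecords                ! fetch only the record you need, on demand
    read(u, rec=irec) block
    ! ... use block ...
end do
close(u)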
I'm quite confused by this. Steve's article claims that an allocated array can be up to 8TB, which seems not to be limited by Win 7 Professional or above. It looks like you ran into an indexing limit (see Jim's recommendation to go to integer(8)), which would be needed beyond the integer(4) limit of 2,147,483,647 elements in an array...which pretty much matches the numbers you came up with. Are you saying that going to integer(8) didn't help?
Actually, that is not my problem yet, as I am only addressing 23,855,448 (4 * 5,963,862) total elements.
Steve & Jim: two questions: shouldn't the heap automatically expand as needed?
the allocation is allocate array(xmin:xmax,ymin:ymax,2), which I take to have delta X * delta Y * 2 elements. I assume the memory map of this becomes a vector requiring an index of maximum magnitude delta X * delta Y * 2?
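For example, the linear extent can be checked with 64-bit arithmetic (a sketch using the bounds from the allocation above):

integer(8) :: nElems
nElems = int(xmax - xmin + 1, 8) * int(ymax - ymin + 1, 8) * 2_8
print *, 'linear extent =', nElems   ! integer(4) indexing tops out at 2,147,483,647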
I have several other multi-dimensional arrays that are allocated, the largest of which is about 6e6 elements. Could all these allocatable arrays be interfering with each other in some way? The total size is much less than my 126 GB of memory.
I'm including a RamMap capture that might suggest something to someone.
>>Steve's article claims that an allocated array can be up to 8TB; which seems not to be limited by win 7 professional or above.
This is a virtual address capability and is subject to available system resources. The physical limitation is the available space in your page file, not necessarily the available space in RAM. Depending on the O/S, the heap may return a successful allocation of addresses, but the actual page file allocation is deferred until first touch. IOW, an allocation may succeed (STAT=0) only to hit an out-of-memory error when you fill the array.
>> shouldn't the heap automatically expand as needed?
Yes it does, but it can still exceed the capacity of the page file. (see 1st answer)
In looking at your SASi2.txt file (process map of the "smaller dataset"), it shows 24 thread stacks, each of ~600MB. While this isn't necessarily wrong, it may lead to memory fragmentation.
A suggestion I have is to start your program and, before you enter your first parallel region and before you do much work or initialization, perform your largest allocations and first touch them (e.g. array = 0.0). This will reduce memory fragmentation (and gaps in committed page file pages).
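A sketch of that ordering (the module and size names follow the earlier example and are placeholders):

program yourProgram
    use staticData                    ! hugeArray lives here, as in the module above
    implicit none
    integer(8), parameter :: nHuge = 3000000000_8
    allocate(hugeArray(nHuge))        ! largest allocations first, single-threaded
    hugeArray = 0.0                   ! first touch commits the pages up front
    !$omp parallel                    ! parallel regions (and thread stacks) come after
    ! ... threaded work ...
    !$omp end parallel
end program yourProgram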
>>Steve's article claims that an allocated array can be up to 8TB
I can assure you that you will hit a maximum program size that Win 10 will allow -- it exists, as I found it, and Steve then explained what the problem was. There is a limit, and it is not the 8TB that I found. I cannot remember the exact limit, but it is discussed in a post on this forum; more than a 25,000 by 25,000 dense array - i.e. every cell holding a double - stops the program with a physical space (or similar) error. This is why Pardiso and sparse array packing allow much bigger theoretical arrays.
I would agree with Jim; he has a lot of experience. I was shown how to allocate the arrays as a group together and then set them all to zero at the start of the program by the people from here who helped with my earlier program. It works a treat with my Monte Carlo program, and if it is going to crash, it will crash at the allocation or the zeroing.
The article is listing Windows' hard limits on the various types of data. Allocatable arrays are one type of dynamic data, but the article does not say that an allocatable array can be 8TB. The point is that Windows has different limits for static, stack and dynamic data.
The actual limit you will reach is indeed constrained by available pagefile space. After all, the data has to live somewhere...
Computer memory is 128 GB; paging (although I don't know why I'd need it) is set to 100 GB. It is set on the drive I'm compiling on, not the C: drive. The current 'large' data file I'm using is only 252 MB. There are some other large allocatable arrays as well, but none larger than this one. The program runs for a while, then dies...it seems to be when it is trying to reference another allocatable data set. It only fails if I have loaded the 'large' set. If it were C code, I'd assume a memory leak.
I have had some trouble which I am now pretty sure stems from the Windows memory cache -- failure depends on what version of the code (with the same name) I ran earlier. It seems to work OK if I clear the memory cache but fails if I run the same problematic code twice. I wonder if having several large arrays on the heap is confusing something, perhaps in the memory cache, which I assume is completely under OS control. Is there a way in Fortran to clear what I am now clearing in RamMap? I've assumed that allocating arrays would defeat the OS memory cache.