I'm fairly new to using allocatable arrays in Fortran, and to dynamic memory management in general.
I'm currently writing a code that contains arrays of a known shape that do not have a predetermined size (no dimensions are input), so I have made them allocatable for greatest flexibility.
My hang-up is this: the final size of these arrays is not known until the program is complete - the arrays are filled in as the code "marches" to achieve a desired grid resolution, so the total number of grid points changes based on the particular problem.
I would like to write code that dynamically increases the size of these arrays as the program calculations are made - that is, the array size is increased as needed.
To summarize my questions:
What is the best way to perform dynamic increases in array size, while retaining all existing information? Do I have to deallocate-reallocate, using a temporary array to store the existing information?
Is it a good idea to perform increases in size at every step (in increments of 1), or is it better to add larger chunks (in increments of, say, 100) to the array each time, for the sake of keeping memory contiguous, etc.? The number of steps should be on the order of 10,000, depending on the problem.
Finally, how will performing many deallocation-reallocations affect program performance? Am I better off allocating these arrays to some maximum expected size at the beginning of the code?
Thank you very much for your help and time.
5 Replies
Does the array build during an input phase, and then remain static for a long run time?
If so, consider building to a file, obtaining the eventual final array size, then reading the file back into an array that is allocated once.
Reallocating in small incremental steps tends to fragment memory. Search MSDN for LFH (low-fragmentation heap). Enabling that might be beneficial for reallocating in small incremental steps; however, if you are allocating multiple items as you creep up in size, your heap will fragment regardless of LFH.
I suggest deriving a metric to determine a guess at the eventual array size (guess larger than required), something like fn(InputFileSize) - you define what is in fn(). When the guessed size is determined to be too small, report that information to a log file for future use, then reallocate the array using a larger step.
Are you running on x32 or x64?
Jim Dempsey
Definitely allocate large chunks - at least 10MB at a time, I would say. Each time you enlarge the array, you have to allocate a new one of the bigger size and copy the existing elements to it. You can use the MOVE_ALLOC intrinsic to do a reallocate with only one data copy.
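As a minimal sketch of the MOVE_ALLOC approach (the subroutine name and the doubling growth policy here are illustrative, not from the original posts):

```fortran
! Grow an allocatable array with MOVE_ALLOC: one copy per reallocation.
program grow_demo
  implicit none
  real, allocatable :: a(:)
  integer :: i
  allocate(a(4))
  a = 0.0
  do i = 1, 10
     if (i > size(a)) call grow(a, 2*size(a))   ! grow in chunks, not by 1
     a(i) = real(i)
  end do
  print *, size(a), a(10)
contains
  subroutine grow(arr, new_size)
     real, allocatable, intent(inout) :: arr(:)
     integer, intent(in) :: new_size
     real, allocatable :: tmp(:)
     allocate(tmp(new_size))
     tmp(1:size(arr)) = arr        ! the single data copy
     call move_alloc(tmp, arr)     ! tmp's storage becomes arr; tmp is gone
  end subroutine grow
end program grow_demo
```

Without MOVE_ALLOC you would need a second copy back from the temporary; with it, the temporary's storage is simply handed over to the original array.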
Your application might benefit from keeping a linked list of "chunks", so you don't have to reallocate or copy at all - just allocate a new chunk when necessary and add it to the list. It does somewhat complicate traversing the list - you'll have to decide what makes sense for your application.
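A rough sketch of the chunk idea, assuming a singly linked list of fixed-size chunks (the type and routine names are made up for illustration):

```fortran
! Linked list of fixed-size chunks: no element is ever copied when growing.
module chunk_list_m
  implicit none
  integer, parameter :: chunk_size = 1000
  type :: chunk_t
     real :: data(chunk_size)
     integer :: n_used = 0                  ! elements filled in this chunk
     type(chunk_t), pointer :: next => null()
  end type chunk_t
contains
  subroutine append(tail, val)
     type(chunk_t), pointer :: tail        ! points at the last chunk
     real, intent(in) :: val
     if (tail%n_used == chunk_size) then   ! current chunk full: link a new one
        allocate(tail%next)
        tail => tail%next
     end if
     tail%n_used = tail%n_used + 1
     tail%data(tail%n_used) = val
  end subroutine append
end module chunk_list_m
```

The trade-off is exactly the traversal cost mentioned above: indexing element i now means walking the list to chunk (i-1)/chunk_size + 1 first.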
Jim - thanks for your feedback.
Currently, the program can be boiled down to a DO loop that stores its results in this array. So, each time the loop completes, values get added to the array. These results will then be used later, but this portion of the code has yet to be written.
Right now, your write-to-file approach makes a lot of sense because the array is being filled in and then simply written to file. For the future code that will use the array contents, I'm curious how writing to file and then reading everything in again would affect performance, relative to everything remaining in memory. (I'm assuming that I should use unformatted files for this purpose?)
Given the potential for memory fragmentation, I will probably avoid the approach involving deallocation and reallocation. I have several metrics in mind that I can use to predict eventual array size, so I can allocate the arrays to a larger-than-expected value at the top of the program if I need to keep the array in memory.
I'm currently developing on x32 but will be running the eventual code on x64.
Thanks again for your help.
Steve - thanks for your reply!
I'll definitely allocate in large chunks if I go the MOVE_ALLOC route.
Sorry for my naivete - do you mind elaborating a little bit on your second approach? In particular, where does each new chunk get allocated/stored? I'm interested and I think I can manage the bookkeeping.
>>I'm currently developing on x32 but will be running the eventual code on x64.
I will guess then that your development system is x32 and the production system is x64. In this case you will need the finalized application, together with complete test data, to run as x32.
Write-to-file followed immediately by read-from-file should be relatively fast.
You also can experiment with:
Do everything you do with write-to-file except for the write.
In the process of doing this, tally up the memory requirements
Now, allocate
Then read into memory
The above requires you read/obtain the input data twice.
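The write-to-file variant might look something like the following sketch (the file name, unit handling, and the stand-in calculation are all illustrative, assuming an unformatted stream file):

```fortran
! Spill each result to an unformatted stream file during the "march",
! then allocate once at the final size and read everything back.
program spill_demo
  implicit none
  real, allocatable :: a(:)
  real :: val
  integer :: u, i, n
  n = 0
  open(newunit=u, file='results.bin', form='unformatted', &
       access='stream', status='replace')
  do i = 1, 10000                 ! the marching loop
     val = real(i)                ! stand-in for the real calculation
     write(u) val
     n = n + 1                    ! tally the eventual array size
  end do
  close(u)
  allocate(a(n))                  ! one allocation, at the known final size
  open(newunit=u, file='results.bin', form='unformatted', &
       access='stream', status='old')
  read(u) a                       ! read the whole array in one go
  close(u, status='delete')
  print *, n, a(n)
end program spill_demo
```

Because the file is unformatted, there is no ASCII conversion cost on either the write or the read.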
As to which method works best, this depends on the "obtain" costs (direct read or pull from database) plus conversion costs (ASCII to internal).
The alternative, suggested in an earlier post, is to use information about the input data (e.g. file size or record count) to produce an estimation function fn() that guesses at a workable array size. Then allocate to that size. Write your code such that it will work with an underpopulated array. IOW, size(array) does not tell you the number of in-use items in the array. You will need to carry a count of elements in use in the array (or an index to the last/next element).
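In other words, something along these lines (the guess value stands in for whatever fn() returns; all names are illustrative):

```fortran
! Over-allocate from an estimate; size(a) is capacity, n_used is the
! number of live elements.
program count_demo
  implicit none
  real, allocatable :: a(:)
  integer :: n_used, i, guess
  guess = 20000                   ! fn(input size): guess larger than required
  allocate(a(guess))
  n_used = 0
  do i = 1, 10000
     if (n_used == size(a)) then
        print *, 'guess too small - log this for tuning fn()'
        exit                      ! or reallocate here using a larger step
     end if
     n_used = n_used + 1
     a(n_used) = real(i)
  end do
  print *, 'capacity:', size(a), ' in use:', n_used
end program count_demo
```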
You may be pleasantly surprised that the write-to-file is not all that expensive.
The write-to-file (your internal binary format) will be of use later should you wish to add a checkpointing capability to the application, e.g. the app is very long running and every hour or day you checkpoint the application state. Should the program ab-end, or should a "significant event" occur, you will have the checkpoint data such that you can resume the program or investigate the "significant event".
Jim Dempsey
