Intel® Fortran Compiler

Question about RAM requirement for large matrices

johncolona
Beginner
Hi

I am a social science student with no prior knowledge of computer science. I need to handle very large matrices in my work, but I have found that I don't have sufficient RAM in my computer. How can I figure out how much RAM I need if I have to operate on several matrices of a size like x(39,19,100,35,39*19*19)?

Thank you very much for your kind attention.
jimdempseyatthecove
Honored Contributor III
x(39,19,100,35,39*19*19) has
39*19*100*35*39*19*19 = 36,513,886,500 cells
If your data are REAL(4), multiply this by 4 (146,055,546,000 bytes, about 146 GB).
If your data are REAL(8), multiply this by 8 (about 292 GB).

That is just for this one array.
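
For reference, here is a minimal Fortran sketch of that arithmetic; the 64-bit integer kind is needed because the element count overflows a default INTEGER:

    program mem_estimate
       implicit none
       integer(8) :: ncells
       ! total elements in x(39,19,100,35,39*19*19); the 39_8 forces
       ! 64-bit arithmetic so the product does not overflow
       ncells = 39_8 * 19 * 100 * 35 * (39_8 * 19 * 19)
       print '(a,i0)',     'elements:        ', ncells
       print '(a,f7.1,a)', 'REAL(4) storage: ', real(4_8*ncells)/1.0e9, ' GB'
       print '(a,f7.1,a)', 'REAL(8) storage: ', real(8_8*ncells)/1.0e9, ' GB'
    end program mem_estimate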

Depending on how you manipulate this array, and the amount of work in the manipulation, this array may not necessarily be required to reside in physical RAM. Instead, it may be possible to manipulate it efficiently (enough) as virtual memory - assuming your operating system permits 150 GB to 300 GB of virtual memory within an application.

A few high-memory-capacity server motherboards can handle up to 128 GB of RAM.
Some of these can be interconnected - 2 for 256 GB of RAM. So a single SMP system may have problems holding your data in physical RAM. Virtual memory is a different story.

You may be able to write your program using MPI (e.g. Open MPI) to run on a cluster of systems whose combined memory exceeds your requirements, but your programming difficulty will increase.

What data do you have that is that large? And what type of processing will you do on this data?
Is the data in this array convertible to a smaller storage format? E.g. could each element be held in 1 or 2 bytes?
Is the array sparse? (not fully filled)

Jim Dempsey

Jugoslav_Dujic
Valued Contributor II
To give you a comparison, your everyday WinXP or 32-bit Vista/W7 permits only about 2 GB of address space per process; even if you had 100 GB of RAM in your computer, you couldn't handle more than 2 GB at a time. A 64-bit system (e.g. 64-bit W7) can theoretically handle such an amount, but paging would likely grind it to a halt.

I suppose that your problem needs reframing: I seriously doubt that you're really going to use every element of your matrix. What kind of data are those? Judging by your background, I'll venture a guess that it's some kind of population statistics.

In particular, the figures "39", "19", and "39*19*19" look suspicious to me. It looks as if you have 39 groups A, 19 groups B, and some kind of relationship between them. I bet that, in programming terms, the relationship can be expressed in a different way than as a matrix, which would be far more efficient and memory-friendly. A matrix is not a suitable tool for everything. However, we would need to know far more about your problem to be able to tell you something intelligent (or dumb).
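
Purely as an illustration (the actual meaning of your dimensions is a guess on my part), a relationship between 39 A-groups and 19 B-groups could be stored as a list of only the pairs that actually occur, rather than as a dense array with one cell per possible combination:

    program pair_list_sketch
       implicit none
       ! hypothetical coordinate-style storage: one entry per observed pair
       type :: pair_entry
          integer :: group_a   ! 1..39 (assumed meaning of the "39")
          integer :: group_b   ! 1..19 (assumed meaning of the "19")
          real    :: weight    ! whatever quantity links the two groups
       end type pair_entry
       type(pair_entry), allocatable :: rel(:)

       allocate(rel(3))                  ! only the pairs you actually have
       rel(1) = pair_entry( 5, 12, 0.37)
       rel(2) = pair_entry( 5, 13, 0.11)
       rel(3) = pair_entry(22,  2, 0.94)
       print *, 'stored pairs:', size(rel)
    end program pair_list_sketch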
bendel_boy1
Beginner
In the social sciences it is more common to know how to use a program like SPSS efficiently than to program your own tests.

If this is a class exercise you are probably approaching it the wrong way; if you describe what tests you are trying to implement you may get help here.

If this is data that you need to process, then 'roll your own' may not be appropriate; again, describe the tests, and use Google or similar and you may find free/open source implementations that would help you.
johncolona
Beginner
Thanks for all the replies. Actually, the problem is a discrete-state, discrete-time dynamic programming problem; the matrices store the relevant value and policy functions. The numbers represent the states of the problem.

The matrices are therefore not sparse. OpenMP helps a lot in speeding up the code, but insufficient memory is a major issue. Allocatable arrays do not help either.
bendel_boy1
Beginner
This problem used to be handled by using disk files. This will slow down the processing by orders of magnitude.
mecej4
Honored Contributor III
Others have given you excellent advice, but here are two other aspects that they have not covered.

1. A matrix is a two-dimensional array (or, as special cases, a one- or zero-dimensional array) with _required_ additional mathematical properties and defined operators (add, multiply, transpose, etc.)

In other words, every matrix is an n-dimensional (n <= 2) array composed of complex, real or integer elements. However, not every two-dimensional array composed of such elements is a matrix.

It is possible that there are sections of your multi-dimensional arrays that may be regarded as matrices. In that case, only those sections that participate in matrix operations need to be in RAM.
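
For example (purely illustrative), a 2-D section of a higher-dimensional array can be handed to matrix operations directly, so only that section behaves as a matrix at that point:

    program section_as_matrix
       implicit none
       real :: x(39,19,5)          ! a small stand-in for a much larger array
       real :: y(19,7), z(39,7)
       call random_number(x)
       call random_number(y)
       z = matmul(x(:,:,3), y)     ! only this 39x19 section is treated as a matrix
       print *, shape(z)
    end program section_as_matrix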

2. There is another aspect of computer arithmetic that one should learn before dealing with large-scale computations: computer arithmetic does not always agree with classical (as in Archimedes) arithmetic. For example, (4.0/3.0-1.0)*3.0-1.0 does not necessarily equal 0.0. Likewise, if one takes a large non-singular matrix A and forms the product A x inv(A), the result is not necessarily well approximated by I, the identity matrix.
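You can check the first claim yourself with a two-line program (the exact residue is machine- and compiler-dependent):

    program fp_residue
       implicit none
       real :: r
       ! classically this is exactly 0; in REAL(4) arithmetic it usually is not,
       ! because 4.0/3.0 cannot be represented exactly in binary
       r = (4.0/3.0 - 1.0)*3.0 - 1.0
       print *, r        ! typically a value on the order of 1E-07, not 0.0
    end program fp_residue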


anthonyrichards
New Contributor III
Reduce each of the 7 numbers in x(39,19,100,35,39*19*19) by a factor of 2 and you reduce the storage by a factor of 2**7 = 128 (146 GB / 128 is roughly 1.1 GB for REAL(4)), which is more likely to be accommodated, at a push.
jimdempseyatthecove
Honored Contributor III
>>Openmp helps a lot in speeding up the code, but insufficient memory is a major issue

Please keep in mind that you can, in many cases, organize the access to the data such that it minimizes page faults (swapping) in a virtual-memory situation where the virtual memory exceeds the physical RAM.

Old-time programmers like myself are familiar with manipulating data sets that are large compared with the available RAM, in environments with low random-access I/O performance but good sequential I/O characteristics, e.g. a small-RAM system with reel-to-reel mag tape. Reorganizing the data access patterns yielded several orders of magnitude better performance. This programming art finds little use today because systems "generally" have more RAM than the application requires. In an exceptional situation like yours, paying attention to data access patterns will yield similar results; a minimal sketch of the idea follows.
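
In this out-of-core sketch (the file name 'value_fn.dat' and the record layout are hypothetical), the huge array is kept on disk as 39*19*19 unformatted direct-access records, and only one 39x19x100x35 slice (about 10 MB in REAL(4)) is resident in RAM at a time:

    program out_of_core_sketch
       implicit none
       integer, parameter :: n1=39, n2=19, n3=100, n4=35
       integer, parameter :: n5 = 39*19*19
       real, allocatable :: slice(:,:,:,:)
       integer :: u, lrec, i5

       allocate(slice(n1,n2,n3,n4))
       inquire(iolength=lrec) slice     ! record length in processor-dependent units
       open(newunit=u, file='value_fn.dat', access='direct', &
            form='unformatted', recl=lrec, status='old')   ! assumes the file exists

       do i5 = 1, n5                    ! sequential record order => friendly I/O pattern
          read(u, rec=i5) slice         ! fetch one slice
          ! ... update the value/policy function for "state" i5 here ...
          write(u, rec=i5) slice        ! write the updated slice back
       end do
       close(u)
    end program out_of_core_sketch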

Additionally, when you can program your application to take advantage of a parallel-pipeline style of programming, where the I/O pipes run at a different (higher) priority than the computational pipes, the program's performance approaches that of the same problem run with sufficient RAM to hold everything in memory (only a very small amount of time to issue the reads/writes, and fairly small memory contention during DMA with the disk controller).

The advantage of a well-implemented parallel pipeline is that the application seldom waits for I/O (e.g. only for the initial read, or the first few reads).
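
Below is a minimal sketch of the double-buffering idea using standard Fortran asynchronous I/O; whether the read actually overlaps the computation depends on the compiler and OS, and the file layout matches the hypothetical sketch above:

    program pipeline_sketch
       implicit none
       integer, parameter :: nslice = 39*19*100*35   ! elements in one slice
       integer, parameter :: n5 = 39*19*19           ! number of slices on disk
       real, asynchronous :: buf(nslice,2)           ! compute in one buffer, read into the other
       integer :: u, lrec, i5, cur, nxt, req

       inquire(iolength=lrec) buf(:,1)
       open(newunit=u, file='value_fn.dat', access='direct', form='unformatted', &
            recl=lrec, asynchronous='yes', status='old')   ! hypothetical file, as above

       cur = 1
       read(u, rec=1) buf(:,cur)                     ! prime the pipeline (blocking read)
       do i5 = 1, n5
          nxt = 3 - cur
          if (i5 < n5) read(u, rec=i5+1, asynchronous='yes', id=req) buf(:,nxt)
          buf(:,cur) = 0.5*buf(:,cur)                ! placeholder compute, overlapped with the read
          if (i5 < n5) wait(u, id=req)               ! ensure the prefetched slice has landed
          cur = nxt
       end do
       close(u)
    end program pipeline_sketch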

Jim Dempsey