Best Solution Strategy for Inversion of Large Matrix in Intel Fortran?

Nan_Deng · 06-15-2010

I have a fairly large matrix inversion problem that I need help with. The matrix is dense, complex, and symmetric, and a test size is about 30,000 x 30,000 (ultimately it may reach 150,000 x 150,000), so storing half of the matrix takes about 7 GB, and the solution time is long. Using MKL routines would take too much RAM, but a brute-force use of Pardiso would need to solve the equation 30,000 times (I wonder whether the parallel feature of Pardiso would help in this case; I have 12 cores and 96 GB of RAM). Do I need to use the out-of-core feature of Pardiso? Do I need to maximize the virtual memory? Any shared experience and suggestions would be greatly appreciated.

mecej4 · 06-15-2010

The standard comment on this topic: are you really sure that you have given enough thought to the issues before making the decision to form the inverse matrix explicitly?

What will you do with the inverse, should you succeed in computing it?

Should it turn out that your objectives are better served by something other than computing the inverse, your questions on how to find the inverse using Pardiso become moot.

Nan_Deng · 06-15-2010

Yes. Explicit inversion of the matrix is absolutely necessary in forming the overall global equations. I understand your comment, but we have worked on this aspect for the last 20 years and we haven't found a better alternative. So the question is: would you mind sharing your experience with matrix inversion?

Artem_V_Intel · 06-16-2010

Hello,

Please take a look at the matrix inversion routines from MKL LAPACK. csytri() or zsytri() should be suitable for you. You can find a description of these routines in the MKL Manual.

Best regards,
Artem

Nan_Deng · 06-16-2010

Thanks for your message. Using LAPACK routines would be simple and straightforward, but there are a lot of uncertainties once the matrix gets large. Let me clarify my questions:

- Do the LAPACK routines, say zsytri(), have parallel capability implemented? Based on my numerical experience, a single core, even the most powerful one, would take about 25-30 hrs to do the number crunching for the 30,000 x 30,000 matrix, assuming everything is in-core. So a parallel capability would certainly be welcome.

- How much RAM is required? The upper triangle of the matrix alone is about 7 GB, and the LAPACK documentation doesn't give details about the internal memory requirements of the routines (work array, swap array, index array, etc.).

- Is there any way to save the intermediate results? Say, after factorization of the matrix, I may want to save the results for later use with different right-hand sides. Is that possible?

- For PARDISO, I was attracted to the routine because its documentation says that parallelism is implemented and can run 7 times faster on an 8-core machine (though I don't know for what type of matrix and under what conditions), and that with the OOC option all factors are saved on disk to reduce the RAM requirement. These are extremely attractive features, but I understand that PARDISO is not designed for full matrix inversion, so I am not sure whether these features still apply if I use PARDISO for matrix inversion.

My questions would be better phrased as: (1) Has anyone compared matrix inversion using different routines in the MKL library, including the general-purpose LAPACK routines you mentioned and PARDISO? If yes, what are the conclusions about execution speed (in terms of wall-clock time) and RAM usage? Are there any advantages or disadvantages to either set of routines? (2) Do all the advantages of PARDISO still apply when it is used for matrix inversion? (3) Are there ways to save intermediate results from both the LAPACK routines and the PARDISO calling sequence?

I'd appreciate any leads and shared experience. Has anyone done large matrix inversion using LAPACK directly (say, a matrix larger than 10,000 x 10,000)? Has anyone used PARDISO for matrix inversion? What was your experience? Any suggestions? Any pitfalls to avoid? Do you think there is a better way to do it than what I listed here? Please help.
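
On the question of saving a factorization and reusing it with different right-hand sides: in LAPACK the factorization and the solve are separate routines (zsytrf factors a complex symmetric matrix, zsytrs solves with those factors, and zsytri forms the inverse from them), so the factors can be kept and reused. The sketch below only illustrates that pattern in plain Python on a 3 x 3 complex symmetric matrix. It is not MKL code: it uses ordinary LU with partial pivoting instead of the symmetric factorization, and `lu_factor`, `lu_solve`, and `inverse_from_factors` are names made up for this example.

```python
# Factor once, then reuse the factors for many right-hand sides.
# Illustration only: plain LU with partial pivoting on a tiny matrix,
# not the symmetric factorization that zsytrf computes.


def lu_factor(a):
    """Return (lu, piv): L and U packed in one matrix, plus the row permutation."""
    n = len(a)
    lu = [row[:] for row in a]          # work on a copy, leave the input intact
    piv = list(range(n))
    for k in range(n):
        # choose the largest remaining entry in column k as the pivot
        p = max(range(k, n), key=lambda i: abs(lu[i][k]))
        if lu[p][k] == 0:
            raise ValueError("matrix is singular")
        lu[k], lu[p] = lu[p], lu[k]
        piv[k], piv[p] = piv[p], piv[k]
        for i in range(k + 1, n):
            lu[i][k] /= lu[k][k]        # multiplier, stored where L lives
            for j in range(k + 1, n):
                lu[i][j] -= lu[i][k] * lu[k][j]
    return lu, piv


def lu_solve(lu, piv, b):
    """Solve A x = b from the stored factors, without refactoring A."""
    n = len(lu)
    x = [b[piv[i]] for i in range(n)]   # apply the row permutation to b
    for i in range(n):                  # forward substitution (unit-diagonal L)
        for j in range(i):
            x[i] -= lu[i][j] * x[j]
    for i in reversed(range(n)):        # back substitution with U
        for j in range(i + 1, n):
            x[i] -= lu[i][j] * x[j]
        x[i] /= lu[i][i]
    return x


def inverse_from_factors(lu, piv):
    """Build the inverse column by column by solving A x = e_j for each j."""
    n = len(lu)
    cols = [lu_solve(lu, piv, [1.0 if i == j else 0.0 for i in range(n)])
            for j in range(n)]
    # cols[j] is column j of the inverse; transpose into row-major form
    return [[cols[j][i] for j in range(n)] for i in range(n)]


if __name__ == "__main__":
    # Small complex symmetric (not Hermitian) test matrix, made up for this demo.
    a = [[4 + 1j, 1 - 1j, 0.5j],
         [1 - 1j, 3 + 0j, 1 + 0j],
         [0.5j, 1 + 0j, 5 - 2j]]
    lu, piv = lu_factor(a)                      # expensive step, done once
    x1 = lu_solve(lu, piv, [1 + 0j, 2j, 3 - 1j])  # cheap step, repeated per r.h.s.
    x2 = lu_solve(lu, piv, [0j, 1 + 1j, -2 + 0j])
    a_inv = inverse_from_factors(lu, piv)
    print(x1, x2, a_inv, sep="\n")
```

The "solve 30,000 times" approach corresponds to `inverse_from_factors`: the matrix is factored once, and each column of the inverse costs only a pair of triangular solves. Saving intermediate results then amounts to keeping `lu` and `piv` (in LAPACK terms, the factored array and the pivot array).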

anthonyrichards · 06-16-2010

You mention that your matrix is 'Complex'. Does that mean complex numbers? Also, what precision are you hoping for - single or double?

Nan_Deng · 06-16-2010

Double precision. All in complex numbers. Every term in the matrix is 16 bytes.
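
As a quick check of the storage figures in this thread, here is a minimal Python sketch of the arithmetic. It assumes packed triangular storage (n(n+1)/2 entries of 16 bytes each) and counts no workspace or pivot arrays; the function names are made up for this example.

```python
# Storage estimate for a dense complex symmetric matrix in double precision.
# Assumption: packed triangular storage, n*(n+1)/2 entries of 16 bytes each,
# with no workspace or pivot arrays counted.

BYTES_PER_ENTRY = 16  # complex double: 8-byte real part + 8-byte imaginary part


def packed_triangle_bytes(n):
    """Bytes needed for one triangle (including the diagonal) of an n x n matrix."""
    return n * (n + 1) // 2 * BYTES_PER_ENTRY


def full_matrix_bytes(n):
    """Bytes needed for the full n x n matrix."""
    return n * n * BYTES_PER_ENTRY


if __name__ == "__main__":
    for n in (30_000, 150_000):
        half_gb = packed_triangle_bytes(n) / 1e9
        full_gb = full_matrix_bytes(n) / 1e9
        print(f"n = {n:>7,}: triangle {half_gb:6.1f} GB, full {full_gb:6.1f} GB")
```

For n = 30,000 this gives about 7.2 GB for the triangle and 14.4 GB for the full matrix. For n = 150,000 the triangle alone is about 180 GB, which is more than the 96 GB of RAM mentioned in the first post.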

rrobinson2 · 06-16-2010

Have a look at the ScaLAPACK routines, which as I recall are parallelized. It's been a while since I used them, but a 10,000 x 10,000 matrix inversion (complex, double precision) took only a minute or so (I think) on a machine with 32 processors and 128 GB of RAM. But I have heard some people say PARDISO is the way to go.