Intel® oneAPI Math Kernel Library
Ask questions and share information with other developers who use Intel® Math Kernel Library.

## Pardiso Out of Core

Beginner
15,562 Views
Is PARDISO an out-of-core solver? If so, can it access a matrix that is stored out of core?
Where can I find information on how the matrix has to be stored for an out-of-core solution?
Is there another function in MKL that can solve a symmetric linear system out of core?
Thanks
25 Replies
Employee
15,286 Views

Hi!

PARDISO has an Out-Of-Core (OOC) mode, but it assumes that the input matrix fits in RAM. In this mode PARDISO stores only the LU factors and some working arrays on disk.

BTW, what kind of matrix are you solving? Dense or sparse? Please let us know the parameters of your problem.

Best regards,
Sergey
Beginner
15,286 Views

How can Intel claim PARDISO is an out-of-core solver when the matrix must be held in RAM? I have a matrix in column-wise skyline storage, and I can store it on disk in any format supported by PARDISO. Does MKL have a true out-of-core solver? Thank you very much for your time.
New Contributor I
15,286 Views
Hi,
Let me try to clarify this. The regular (in-core) version of PARDISO uses only RAM to solve a system of linear algebraic equations (SLAE) and does not use the hard disk. Very often the input matrix is very sparse, but its LU factors are much less sparse. As a result, the factors do not fit in RAM and the problem cannot be solved by the regular version of PARDISO. To handle such systems, we developed a version of PARDISO that stores the LU factors on the hard disk. This version is called Out-Of-Core PARDISO. If your matrix is dense, please use the LAPACK routines. If you have a sparse matrix that itself does not fit in RAM, you can submit a feature request against OOC PARDISO.

Beginner
15,286 Views
I wanted to do that too once. But I found that converting from skyline to the MKL sparse format made the matrix so small that it fit in RAM. Skyline storage is pretty huge for sparse matrices in my application (finite element analysis). But PARDISO OOC solved the RAM problem. But yes, for genuinely denser or bigger matrices MKL can't help.
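The size difference described here can be sketched in a few lines. The skyline layout below is an assumption, since the thread does not show the actual arrays: each column of the upper triangle is stored from its first nonzero row down to the diagonal, including any zeros inside that profile. The function name `skyline_to_csr_upper` and its arguments are hypothetical. The output follows the convention PARDISO uses for symmetric matrices (upper triangle in CSR, one-based `ia` and `ja`).

```python
def skyline_to_csr_upper(first_row, columns):
    """Convert column-wise skyline storage of the upper triangle to CSR.

    first_row[j] is the (zero-based) row of the first stored entry of
    column j; columns[j] holds the entries of column j from that row
    down to the diagonal. Zeros inside the profile are dropped, except
    on the diagonal, which is always kept.
    Returns (a, ia, ja) with one-based ia and ja.
    """
    n = len(columns)

    # Pass 1: count the entries that survive in each row.
    counts = [0] * n
    for j in range(n):
        for k, v in enumerate(columns[j]):
            i = first_row[j] + k
            if v != 0.0 or i == j:
                counts[i] += 1

    # Row pointers (one-based).
    ia = [1] * (n + 1)
    for i in range(n):
        ia[i + 1] = ia[i] + counts[i]

    # Pass 2: scatter values. Columns are visited in ascending order,
    # so the column indices inside each row come out sorted.
    nnz = ia[n] - 1
    a = [0.0] * nnz
    ja = [0] * nnz
    next_free = [p - 1 for p in ia[:n]]
    for j in range(n):
        for k, v in enumerate(columns[j]):
            i = first_row[j] + k
            if v != 0.0 or i == j:
                a[next_free[i]] = v
                ja[next_free[i]] = j + 1
                next_free[i] += 1
    return a, ia, ja


# 3x3 symmetric example: [[4, 0, 1], [0, 5, 0], [1, 0, 6]]
first_row = [0, 1, 0]
columns = [[4.0], [5.0], [1.0, 0.0, 6.0]]
a, ia, ja = skyline_to_csr_upper(first_row, columns)
print(a, ia, ja)
```

In this toy case the skyline profile holds 5 values and the CSR arrays hold 4. For finite element matrices with a wide profile and few nonzeros per row the gap would be far larger, which is the effect the post reports.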

Moderator
15,286 Views
kallog,
if you are really interested in solving very big matrices, as you wrote ("But yea for genuinely denser or bigger matrices MKL can't help."), could you please submit a Feature Request at https://premier.intel.com/

If you do not have an account to access this channel, please complete your account registration at https://registrationcenter.intel.com/

Beginner
15,286 Views

I found that MKL_PARDISO_OOC_MAX_CORE_SIZE must exceed the value reported by iparm(15) after phase 11 (only reordering and symbolic factorisation). iparm(15) reports "peak memory symbolic factorization". In my case, the matrix in csr format requires 232 MB, whereas iparm(15) reports 878 MB.

As explained, the LU factors appear to be stored on disk, since the size of the ooc_temp files matches 8 bytes/entry * number of nonzeros as mentioned by the statistics obtained via msglvl=1.

For a larger matrix, 248 MB in size, the program crashes at phase 11 with error -2. iparm(15) reports 939 MB:

Peak memory symbolic factorization (MB) = 939
Permanent memory symbolic factorization (MB) = 0
Memory numerical factorization and solution (MB) = 1668
total peak memory solver consumption (MB) = 1668

The program closes with:

ooc_max_core_size got by Env = 2000
The file .\pardiso_ooc.cfg was not opened
*** error PARDISO ( insufficient_memory) error_num= -800
*** error pardiso (memory allocation) STRUC_FI, size to allocate: 362146752 bytes
total memory wanted here: 962126 kbyte
symbolic (max): 962126 symbolic (permanent): 2 real(incl. 1 factor):

================ PARDISO: solving a symm. posit. def. system ================

Summary PARDISO: ( reorder to reorder )
================

Times:
======

Time reorder: 2.700492 s
Time symbfct: 2.213089 s
Time malloc : 0.484451 s
Time total : 5.707339 s total - sum: 0.000227 s

Statistics:
===========
< Parallel Direct Factorization with #processors: > 1
< Numerical Factorization with Level-3 BLAS performance >

< Linear system Ax = b>
#equations: 737658
#non-zeros in A: 20402451
non-zeros in A (%): 0.003749
#right-hand sides: 1

< Factors L and U >
#columns for each panel: 10
#independent subgraphs: 0
< Preprocessing with state of the art partitioning metis>
#supernodes: 101573
size of largest supernode: 2514
number of nonzeros in L 217919528
number of nonzeros in U 1
number of nonzeros in L+U 217919529

I am not sure why the error happens. I use 32-bit Windows Vista. Sysinternals Process Explorer shows that during assembly the virtual memory size is 2142 MB, and the working set is similar. After assembly and deallocation of the memory, just before phase 11 starts, the virtual memory size is still(?) 2139 MB, but the working set is 869 MB. PARDISO is then invoked and stops with the above error.

Do you have any clues for me on how to proceed?

New Contributor I
15,286 Views

Hi!

What version of MKL do you use? Could you print iparm(64) after phase 11 and provide us with the result?

Beginner
15,286 Views
The version I use is 10.0.1.015.
iparm(64) returns 0.

To avoid constructing the matrix multiple times, the program writes the csr matrix to file. Bootstrapping the program by reading the matrix from file, Procexp reports a Virtual Size of 938 MB and a working set of 827 MB. The program succeeds in doing the calculation, and is using a .lnz file of 1.743E+9 bytes and an .idx file of 98E+6 bytes. The size neatly matches 8 * 217.9E6 for the nonzeros.

Re-running the program from scratch reproduces the error, so it seems that the problem size is on the edge of feasibility.
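The size check mentioned above is simple arithmetic and can be reproduced directly. This is a minimal sketch that assumes one 8-byte double per stored factor entry, as the post does; `factor_bytes` is a hypothetical helper name.

```python
def factor_bytes(nnz_factor, bytes_per_entry=8):
    """Bytes needed to store nnz_factor entries of the factor."""
    return nnz_factor * bytes_per_entry


nnz_l = 217919528                 # "number of nonzeros in L" from the log
print(factor_bytes(nnz_l))        # 1743356224, i.e. about 1.743E+9 bytes
```

The result agrees with the reported `.lnz` file size of 1.743E+9 bytes. The `.idx` file is not covered by this estimate.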

Beginner
15,287 Views
I am thinking of strategies to solve large problems:

1. in-program, as is done now. This means that part of the memory is occupied by the problem data and can not be used by pardiso.
2. solving Ku=f by writing K and f to disk, and invoking a stand-alone solver; this maximizes the memory available to pardiso.
3. doing more out-of-core. This will require more programming effort. Genny Fedorov suggests to submit a feature request (#5).

Switching to a more recent MKL version will give more room: http://software.intel.com/en-us/articles/pardiso-use-half-the-memory-now/

Currently, I only employ strategy 1. Is it possible to predict how much memory is required?
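Strategy 2 needs some on-disk representation of the CSR arrays that a separate solver process can read back. The following is a minimal sketch of one possible layout, not a format that MKL itself reads: a header with the number of equations and nonzeros, followed by `ia`, `ja` and `a` as raw 64-bit integers and doubles in native byte order. The function names and the layout are hypothetical.

```python
import struct
from array import array


def write_csr(path, a, ia, ja):
    """Write CSR arrays as: n, nnz, ia[n+1], ja[nnz], a[nnz]."""
    with open(path, "wb") as f:
        f.write(struct.pack("=qq", len(ia) - 1, len(a)))
        array("q", ia).tofile(f)   # row pointers, 64-bit integers
        array("q", ja).tofile(f)   # column indices, 64-bit integers
        array("d", a).tofile(f)    # values, 8-byte doubles


def read_csr(path):
    """Read back the arrays written by write_csr."""
    with open(path, "rb") as f:
        n, nnz = struct.unpack("=qq", f.read(16))
        ia = array("q")
        ia.fromfile(f, n + 1)
        ja = array("q")
        ja.fromfile(f, nnz)
        a = array("d")
        a.fromfile(f, nnz)
    return list(a), list(ia), list(ja)
```

A stand-alone solver would call `read_csr`, hand the arrays to PARDISO, and write the solution vector to a second file, so that the memory held by the assembling program is not resident while the factorization runs.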

Moderator
15,287 Views
Is it possible to predict how much memory is required?
It is impossible to predict exactly how much memory the calculation requires, because that depends on the combination of the sparsity pattern and the type of the input matrix.
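Although the requirement cannot be known before reordering, the figures PARDISO prints after phase 11 can be combined into an estimate. The sketch below assumes the rule given in the MKL reference manual, that the total peak is max(iparm(15), iparm(16) + iparm(17)), and it assumes all three values are in the same unit. The helper names are hypothetical.

```python
def peak_memory(iparm15, iparm16, iparm17):
    """Total peak estimate from the values reported after phase 11.

    iparm15: peak memory, symbolic factorization
    iparm16: permanent memory, symbolic factorization
    iparm17: memory for numerical factorization and solution
    """
    return max(iparm15, iparm16 + iparm17)


def fits(peak, available):
    """True if the estimated peak does not exceed the available memory."""
    return peak <= available


# Values (in MB) from the statistics posted earlier in this thread.
peak = peak_memory(939, 0, 1668)
print(peak)                 # 1668
print(fits(peak, 2000))     # True
```

With the numbers posted earlier (939, 0 and 1668 MB) the rule reproduces the reported "total peak memory solver consumption" of 1668 MB. Whether that much memory can actually be allocated still depends on what else occupies the process address space, which this sketch does not model.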
Moderator
15,287 Views
For those who are interested: iparm(64) is not a documented parameter at the moment; we use it for our internal needs. Currently it returns the internal version of PARDISO. It has been available since 10.1, which is why iparm(64) == 0 with version 10.0.
For example, with 10.2 Update 4 it will report: iparm[63] = 102000110

Beginner
15,287 Views
I expected that that would be the case, but was not sure. Thanks for the confirmation.

In a previous post (#6), the program finished with an error:

*** error pardiso (memory allocation) STRUC_FI, size to allocate: 362146752 bytes
total memory wanted here: 962126 kbyte
symbolic (max): 962126 symbolic (permanent): 2 real(incl. 1 factor)
Do I understand correctly that the program desires 962 MB, and 362 MB is available, in other words, that the program would work given the missing 600 MB of storage space? Or is the memory allocated multiple times, as required, during the solution, in other words, would the program succeed in this memory allocation only to fail at the next attempt at memory allocation?

In the first case, I could test for the difference between desired and available memory, see whether the problem fits the stand-alone solver strategy, and tell the user to invoke that. If not, I can tell the user that the model is too large and its size must be reduced.

Beginner
15,287 Views
Is the regular pardiso which you can download from the pardiso-project homepage also an out of core solver, or is it in core and only intel provides a specialized version which works ooc?

thanks
Moderator
15,287 Views
No special version: the regular version can work in 3 modes (in-core, hybrid and OOC).
The mode depends on the value of iparm(60). Please see the reference manual for details.
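As a rough illustration of the three modes, the sketch below models how the choice would be made. It assumes the meanings 0 = in-core, 1 = hybrid and 2 = out-of-core for iparm(60), and it assumes that hybrid mode stays in core while the factors fit within the budget set by MKL_PARDISO_OOC_MAX_CORE_SIZE plus MKL_PARDISO_OOC_MAX_SWAP_SIZE. Both assumptions should be checked against the reference manual for the MKL version in use. The function is a hypothetical model, not an MKL call.

```python
IN_CORE, HYBRID, OUT_OF_CORE = 0, 1, 2  # assumed meanings of iparm(60)


def effective_mode(iparm60, factor_mb, max_core_mb, max_swap_mb=0):
    """Return the storage mode PARDISO would be expected to use."""
    if iparm60 == IN_CORE:
        return "in-core"
    if iparm60 == OUT_OF_CORE:
        return "out-of-core"
    if iparm60 == HYBRID:
        # Hybrid: stay in core while the factors fit the configured budget.
        fits = factor_mb < max_core_mb + max_swap_mb
        return "in-core" if fits else "out-of-core"
    raise ValueError("iparm(60) must be 0, 1 or 2")


print(effective_mode(HYBRID, 1668, 2000))  # in-core
print(effective_mode(HYBRID, 1668, 1000))  # out-of-core
```

The 1668 MB and 2000 MB figures are taken from the log posted earlier in this thread and are used only to make the example concrete.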
Beginner
15,287 Views

I read your response with great interest. You wrote "if you are really interesting to solve very big matrices like you wrote "But yea for genuinely denser or bigger matrices MKL can't help.",then could you please submit the Feature Request at <https://premier.intel.com/>"

Does this mean that you do have some separate out-of-core solver that can be accessed (with extra funding, I guess)? Is this on Intel's product list, or does Intel want to do it on a project-specific basis?

I do have some very big matrix problems that I need to solve: currently a 30,000x30,000 dense matrix to invert, and a sparse matrix of order 450,000 to solve. If submitting the request to premier.intel.com can help me solve the problem, I certainly will do it (I do have an account at Premier). Please let me know.

Best regards,
Nan Deng

Beginner
15,287 Views
Hello,

What might this error be? Is it really to do with memory? I am able to run the same matrix (and in fact even larger matrices with more than 1.4 x 10^9 elements) using the memory we have (64G) in OOC mode, and this particular matrix even in in-core mode. The difference here is that my inputs (the matrix values, ia, ja, the rhs values, etc.) are memory-mapped. I am able to run smaller matrices with my inputs memory-mapped in the same way (i.e. with the same executable), but when I get to this size, I get this memory error. It really can't be about memory, since with memory-mapped inputs there should be even more memory available, and without the memory mapping I am able to run this matrix and larger ones, as mentioned before. The negative numbers in the message from PARDISO seem to indicate some sort of overflow issue, though I haven't reached the limits of an int yet (next on my agenda is to move to pardiso_64 and work with a matrix that has more than 2x10^9 non-zeros).

There is close to 62G of free memory available on the machine when this happens, so this is not a real memory issue. Is it an incorrect error message?

Unsuccessful run, message:

gcc -o pardiso pardiso_sym_c.c -I/home/sudha.rangan.ctr/intel-beta/mkl/include -L/home/sudha.rangan.ctr/intel-beta/mkl/lib/include -L/home/sudha.rangan.ctr/intel-beta/mkl/lib/intel64 -L/home/sudha.rangan.ctr/intel-beta/lib/intel64 -liomp5 -lmkl_solver_lp64 -Wl,--start-group -lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core -Wl,--end-group -lpthread -lm
pardiso_sym_c.c: In function :
pardiso_sym_c.c:91: warning: incompatible implicit declaration of built-in function
pardiso_sym_c.c:92: warning: incompatible implicit declaration of built-in function
[sudha.rangan.ctr@slot04 pardiso]$ !./pardiso
./pardiso matrix512b
You entered matrix512b
Nonzero elements: 361304064 Size (number of equations): 884736
first value = -4422846.000000
first ia index = 0
first ja index = 0
first rhs = -0.121058
a0: -4.422846e+06 a_end: -9.250865e+03
ia 0: 0 ai end: 361303848
ja 0: 0 ja end: 884735
b 0: -0.121058 b end: -0.004208
ooc_max_core_size got by Env = 54000
The file ./pardiso_ooc.cfg was not opened
*** Error in PARDISO ( insufficient_memory) error_num= -206
*** Error in PARDISO memory allocation: FACT_ADR, size to allocate: -1404534784 bytes
total memory wanted here: -1364687 kbyte
symbolic (max): -1364687 symbolic (permanent): 0
real(including 1 factor): 0

Peak Mem needed... 0
ERROR during symbolic factorization: -2
[sudha.rangan.ctr@slot04 pardiso]$ vi pardiso_sym_c.c

I can send or post more information (output of successful runs, code, etc.). I am using the 10.3.0 beta.

Thanks and any help appreciated. We do have an older purchased version (2.x) of the mkl, but am currently using an evaluation copy of 10.3.0-beta.

Sudha Rangan
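The negative allocation size in the message above can be reproduced with a little arithmetic. This is only an observation on the numbers, under the assumption that the solver wanted one 8-byte value per nonzero and held the byte count in a signed 32-bit integer. The thread does not confirm that this is what happens inside PARDISO, and `wrap_int32` is a hypothetical helper.

```python
def wrap_int32(value):
    """Reinterpret an integer as a signed 32-bit value (two's complement)."""
    return (value + 2**31) % 2**32 - 2**31


nnz = 361304064             # "Nonzero elements" from the log
wanted = 8 * nnz            # assumed: one 8-byte double per nonzero
print(wanted)               # 2890432512
print(wrap_int32(wanted))   # -1404534784
```

The wrapped value equals the "size to allocate: -1404534784 bytes" in the error message, which is consistent with a 32-bit overflow of a byte count even though the nonzero count itself still fits in an int.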
Beginner
15,287 Views
Hello - I think I see the answer to my question. It is something to do with C-style indexing. Since I was using the mmapped files I decided to use C-style indexing and leave my input the way I had generated it (and not use the FORTRAN style). For some reason that's causing a problem.

Thanks,
Sudha
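For reference, PARDISO's default convention is one-based (Fortran-style) `ia` and `ja`, so zero-based arrays such as the ones printed in the failing run ("first ia index = 0") have to be shifted before the call. A minimal sketch with a hypothetical helper name:

```python
def to_one_based(ia, ja):
    """Shift zero-based CSR row pointers and column indices by one."""
    return [p + 1 for p in ia], [c + 1 for c in ja]


# Zero-based CSR structure of a small upper-triangular pattern.
ia0 = [0, 2, 3, 4]
ja0 = [0, 2, 1, 2]
ia1, ja1 = to_one_based(ia0, ja0)
print(ia1, ja1)   # [1, 3, 4, 5] [1, 3, 2, 3]
```

For memory-mapped inputs this means either generating the index files one-based in the first place or shifting them in place, since a shifted copy would defeat the purpose of the mapping. Recent MKL documentation also describes an iparm(35) switch for zero-based indexing; whether the beta used here supported it is not stated in the thread.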
Beginner
15,286 Views
What MKL version are you using?
Beginner
15,286 Views
Hello, I'm using 10.3.0.050 (beta). I got past the last problem as indicated. I then used the 64-bit interface (compiled with the MKL_ILP64 flag and passed MKL_INTs and MKL_INT* to the pardiso call) for an even larger matrix. It died when iparm[1] (C-style index, i.e. the fill-in reducing ordering for the input matrix) was 2 (METIS), so I switched to 0 (minimum degree) and it ran, producing seemingly correct results. PARDISO is looking very promising and we intend to use it for problems of much, much greater size (10^10 unknowns). Since PARDISO holds its own copy of the matrix in memory (couldn't the input matrix we supply be re-used? The input matrix I supply is memory-mapped), I need to figure out how to become an Intel Premier customer and ask for the matrix to be held OOC too, not just the factors. While we may still be able to use PARDISO as is for our scope of problem on supercomputers, it would be nice to be able to use it on machines with 64GB-256GB of memory for our problem size.

I also tried an even larger problem size and I keep getting the -180 error during reordering. I have looked at other threads and saw that this may be to do with linking to incorrect libraries, but I have tried every possible combination and still get this error. I tried both 10.2.6.038 (where I get the -800 insufficient-memory error) and the 10.3.0 beta. Would anyone have an idea what this might be? Could it be to do with the large sizes? Can PARDISO really not handle these sizes? It did handle half these sizes (in numRows and NNZs). Again, my input matrix is memory-mapped, I am using OOC (iparm[59], C-style, set to 2) and have 63+GB of memory available. My matrix occupies only about 46G of space, and since I have memory-mapped it, I would expect PARDISO to have enough memory to hold its own copy of the matrix, since the factors are OOC.

You entered matrix4096bf
Nonzero elements: 2890432512 Size (number of equations): 7077888
first value = -4422846.000000
first ia index = 1
, 2nd ia index = 217first ja index = 1
first rhs = -0.121058
a0: -4.422846e+06 a_end: -9.250865e+03
ia 0: 1 ai end: 2890432297
ja 0: 1 ja end: 7077888
b 0: -0.121058 b end: -0.004208
ooc_max_core_size got by Env = 256000
The file ./pardiso_ooc.cfg was not opened
*** Error in PARDISO ( reordering_phase) error_num= -180
*** error PARDISO: reordering, symb. factorization

================ PARDISO: solving a real struct. sym. system ================

Summary PARDISO: ( reorder to reorder )
================

Times:
======
Time reorder: 5.503694 s
Time symbfct: 23.955411 s
Time malloc : 269.677817 s
Time total : 602.334729 s total - sum: 19.222212 s

Statistics:
===========
< Parallel Direct Factorization with #processors: > 8
< Numerical Factorization with BLAS3 and O(n) synchronization >

< Linear system Ax = b>
#equations: 7077888
#non-zeros in A: 2890432511
non-zeros in A (%): 0.005770

#right-hand sides: 1

< Factors L and U >
< Preprocessing with multiple minimum degree, tree height >
< Reduction for efficient parallel factorization >
#columns for each panel: 72
#independent subgraphs: 0
#supernodes: 125067

size of largest supernode: 810
number of nonzeros in L 5062328928
number of nonzeros in U 4636596168
number of nonzeros in L+U 9698925096

ERROR during symbolic factorization: -3

This looks like it really might be something to do with the large size in terms of non-zeros? Even though I'm using the 64-bit interface?

If an intel engineer is reading this, would love to get a response.

Thanks,
Sudha
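A quick way to see which quantities in this run exceed the 32-bit range is to compare them with the largest signed 32-bit integer. The sketch below is plain arithmetic on the numbers in the log. It says nothing about which internal component of the solver fails, and the helper name is hypothetical.

```python
INT32_MAX = 2**31 - 1


def needs_64bit_indices(n_equations, nnz):
    """True if one-based CSR indices for this matrix overflow 32 bits.

    The last row pointer in one-based CSR is nnz + 1, so that is the
    largest index value that has to be representable.
    """
    return n_equations > INT32_MAX or nnz + 1 > INT32_MAX


print(needs_64bit_indices(884736, 361304064))     # False: the earlier matrix
print(needs_64bit_indices(7077888, 2890432511))   # True: this matrix
print(9698925096 > INT32_MAX)                     # True: nonzeros in L+U
```

Both the nonzero count of A and the nonzero count of L+U are beyond the 32-bit range here, so every stage that touches these counts, including the reordering, has to use 64-bit integers for the run to succeed.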

New Contributor I
14,992 Views
Hello!

This failure is expected in the 10.3.0 beta, because the ILP64 version of METIS has been available only since 10.3.0 Gold and 10.2.6. Could you provide us with the log of the failure with MKL 10.2.6? How much swap space does your system have? METIS uses additional memory to reorder the input matrix, so there is probably not enough memory for it.