Where can I find information on how the matrix has to be stored for an out-of-core solution?
Is there another function in MKL that can solve a symmetric linear system out of core?
Thanks
Hi!
PARDISO has an Out-Of-Core (OOC) mode, but it assumes that the input matrix fits in RAM. In this mode PARDISO stores only the LU factors and some working arrays on disk.
BTW, what kind of matrix are you solving, dense or sparse? Please let us know the parameters of your problem.
Best regards,
Sergey
How can Intel claim that PARDISO is an out-of-core solver when the matrix must be in RAM? I have a matrix in skyline (column-wise) storage, and I can write it to disk in any format supported by PARDISO. Does MKL have a real out-of-core solver? Thank you very much for your time.
Hi,
Let me try to clarify this. The regular (in-core) version of PARDISO uses only RAM to solve a system of linear algebraic equations (SLAE) and does not use the hard disk. Very often the input matrix is very sparse but the LU factors are much less sparse. As a result, the factors do not fit in RAM and the problem cannot be solved by the regular version of PARDISO. To handle such systems, we developed a version of PARDISO that stores the LU factors on the hard disk. This version is called Out-Of-Core PARDISO. If you are solving a dense system, please use the LAPACK routines. If you have a sparse matrix that itself does not fit in RAM, you can submit a feature request against OOC PARDISO.
kallog,
if you are really interested in solving very big matrices, as you wrote ("But yea for genuinely denser or bigger matrices MKL can't help."), could you please submit a Feature Request at <https://premier.intel.com/>?
If you do not have an account for this channel, please complete your account registration at https://registrationcenter.intel.com/
--Gennady
I found that MKL_PARDISO_OOC_MAX_CORE_SIZE must exceed the value reported in iparm(15) after phase 11 (reordering and symbolic factorisation only). iparm(15) reports the "peak memory symbolic factorization". In my case, the matrix in CSR format requires 232 MB, whereas iparm(15) reports 878 MB.
As explained above, the LU factors appear to be stored on disk: the size of the ooc_temp files matches 8 bytes/entry * the number of nonzeros reported in the statistics printed with msglvl=1.
For a larger matrix, 248 MB in size, the program crashes at phase 11 with error -2. iparm(15) reports 939 MB:
Peak memory symbolic factorization (MB) = 939
Permanent memory symbolic factorization (MB) = 0
Memory numerical factorization and solution (MB) = 1668
total peak memory solver consumption (MB) = 1668
The program closes with:
ooc_max_core_size got by Env = 2000
The file .\pardiso_ooc.cfg was not opened
*** error PARDISO ( insufficient_memory) error_num= -800
*** error pardiso (memory allocation) STRUC_FI, size to allocate: 362146752 bytes
total memory wanted here: 962126 kbyte
symbolic (max): 962126 symbolic (permanent): 2 real(incl. 1 factor):
================ PARDISO: solving a symm. posit. def. system ================
Summary PARDISO: ( reorder to reorder )
================
Times:
======
Time fulladj: 0.309081 s
Time reorder: 2.700492 s
Time symbfct: 2.213089 s
Time malloc : 0.484451 s
Time total : 5.707339 s total - sum: 0.000227 s
Statistics:
===========
< Parallel Direct Factorization with #processors: > 1
< Numerical Factorization with Level-3 BLAS performance >
< Linear system Ax = b>
#equations: 737658
#non-zeros in A: 20402451
non-zeros in A (%): 0.003749
#right-hand sides: 1
< Factors L and U >
#columns for each panel: 10
#independent subgraphs: 0
< Preprocessing with state of the art partitioning metis>
#supernodes: 101573
size of largest supernode: 2514
number of nonzeros in L 217919528
number of nonzeros in U 1
number of nonzeros in L+U 217919529
I am not sure why the error happens. I use 32-bit Windows Vista. Sysinternals' Process Explorer shows that during assembly the virtual memory size is 2142 MB, with a similar working set. After assembly and deallocation of that memory, just before phase 11 starts, the virtual memory size is still(?) 2139 MB, but the working set is 869 MB. PARDISO is then invoked and stops with the above error.
Do you have any clues for me on how to proceed?
Hi!
Which version of MKL do you use? Could you print iparm(64) after phase=11 and send us the result?
iparm(64) returns 0.
To avoid constructing the matrix multiple times, the program writes the CSR matrix to file. When the program is bootstrapped by reading the matrix from that file, Procexp reports a virtual size of 938 MB and a working set of 827 MB. The program then completes the calculation, using an .lnz file of 1.743E+9 bytes and an .idx file of 98E+6 bytes. The .lnz size neatly matches 8 * 217.9E6 for the nonzeros.
Re-running the program from scratch reproduces the error, so it seems that the problem size is on the edge of feasibility.
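The size check in the post above can be reproduced with a short calculation. The sketch below is in C and assumes, as the post does, that each nonzero of the factors is stored as one 8-byte double-precision value; `factor_bytes` is a hypothetical helper name, not an MKL routine.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes needed to hold the factor values, assuming 8 bytes per nonzero
 * (one double each). Hypothetical helper for illustration only. */
int64_t factor_bytes(int64_t nnz_factors)
{
    assert(nnz_factors >= 0);
    assert(nnz_factors <= INT64_MAX / 8);   /* guard against 64-bit overflow */
    return nnz_factors * 8;
}
```

With the 217919529 nonzeros in L+U reported by the statistics, this gives 1743356232 bytes, in line with the 1.743E+9-byte .lnz file.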
1. in-program, as is done now. This means that part of the memory is occupied by the problem data and cannot be used by PARDISO.
2. solving Ku=f by writing K and f to disk and invoking a stand-alone solver; this maximizes the memory available to PARDISO.
3. doing more out-of-core. This will require more programming effort. Genny Fedorov suggests submitting a feature request (#5).
Switching to a more recent MKL version will give more room: http://software.intel.com/en-us/articles/pardiso-use-half-the-memory-now/
Currently, I only employ strategy 1. Is it possible to predict how much memory is required?
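On predicting the requirement: the statistics printed earlier in the thread are consistent with taking the total peak as the larger of the symbolic peak and the sum of the permanent symbolic memory and the numerical factorization memory (max(939, 0 + 1668) = 1668 MB). A minimal C sketch of that estimate follows. It assumes the three inputs are the values PARDISO reports after phase 11 in iparm(15), iparm(16) and iparm(17), all in the same unit; `pardiso_peak_memory` is a hypothetical helper, not part of MKL.

```c
#include <assert.h>

/* Estimated total peak memory of the solver:
 *   max(symbolic peak, symbolic permanent + numerical factorization/solve).
 * All three arguments must use the same unit (for example MB).
 * Hypothetical helper; the formula is an assumption checked only against
 * the figures printed in this thread. */
long long pardiso_peak_memory(long long symbolic_peak,
                              long long symbolic_permanent,
                              long long numeric_and_solve)
{
    assert(symbolic_peak >= 0);
    assert(symbolic_permanent >= 0);
    assert(numeric_and_solve >= 0);

    long long numeric_total = symbolic_permanent + numeric_and_solve;
    return symbolic_peak > numeric_total ? symbolic_peak : numeric_total;
}
```

These figures describe PARDISO's own consumption only; memory still held by the calling program (strategy 1) comes on top of them.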
In a previous post (#6), the program finished with an error:
*** error pardiso (memory allocation) STRUC_FI, size to allocate: 362146752 bytes
total memory wanted here: 962126 kbyte
symbolic (max): 962126 symbolic (permanent): 2 real(incl. 1 factor)
Do I understand correctly that the program wants 962 MB and only 362 MB is available, in other words, that the program would work given the missing 600 MB of storage? Or is memory allocated several times, as required, during the solution, so that the program might succeed in this allocation only to fail at the next one?
In the first case, I could test the difference between the desired and available memory, check whether the problem fits the stand-alone solver strategy, and tell the user to invoke that. If not, I can tell the user that the model is too large and its size must be reduced.
I read your response with great interest. You wrote: "if you are really interesting to solve very big matrices like you wrote 'But yea for genuinely denser or bigger matrices MKL can't help.', then could you please submit the Feature Request at <https://premier.intel.com/>".
Does this mean that you have a separate out-of-core solver that can be accessed (at extra cost, I guess)? Is it on Intel's product list, or does Intel provide it on a project-specific basis?
I do have some very big matrix problems to solve: currently a 30,000x30,000 dense matrix to invert and a sparse system of order 450,000 to solve. If submitting the request to premier.intel.com can help me solve the problem, I will certainly do it (I do have an account at premier). Please let me know.
Best regards,
Nan Deng
What might this error be? Is it really related to memory? I am able to run the same matrix, and in fact even larger matrices with more than 1.4 x 10^9 elements, in OOC mode with the memory we have (64G); this particular matrix even runs in in-core mode. The difference here is that my inputs (the matrix values, ia, ja, the rhs values, etc.) are memory-mapped. I am able to run smaller matrices with my inputs memory-mapped in the same way (i.e. with the same executable), but at this size I get the memory error below. It really can't be about memory, since with memory-mapped inputs there should be even more memory available, and without the memory mapping I am able to run this and larger matrices, as mentioned. The negative numbers in the message from PARDISO seem to indicate some sort of overflow, though I haven't reached the limits of an int yet (next on my agenda is to move to pardiso_64 and work with a matrix that has more than 2x10^9 non-zeros).
There is close to 62G of free memory on the machine when this happens, so this is not a real memory issue. Is the error message incorrect?
Unsuccessful run, message:
gcc -o pardiso pardiso_sym_c.c -I/home/sudha.rangan.ctr/intel-beta/mkl/include -L/home/sudha.rangan.ctr/intel-beta/mkl/lib/include -L/home/sudha.rangan.ctr/intel-beta/mkl/lib/intel64 -L/home/sudha.rangan.ctr/intel-beta/lib/intel64 -liomp5 -lmkl_solver_lp64 -Wl,--start-group -lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core -Wl,--end-group -lpthread -lm
pardiso_sym_c.c: In function :
pardiso_sym_c.c:91: warning: incompatible implicit declaration of built-in function
pardiso_sym_c.c:92: warning: incompatible implicit declaration of built-in function
[sudha.rangan.ctr@slot04 pardiso]$ !./pardiso
./pardiso matrix512b
You entered matrix512b
Nonzero elements: 361304064 Size (number of equations): 884736
first value = -4422846.000000
first ia index = 0
first ja index = 0
first rhs = -0.121058
a0: -4.422846e+06 a_end: -9.250865e+03
ia 0: 0 ai end: 361303848
ja 0: 0 ja end: 884735
b 0: -0.121058 b end: -0.004208
ooc_max_core_size got by Env = 54000
The file ./pardiso_ooc.cfg was not opened
*** Error in PARDISO ( insufficient_memory) error_num= -206
*** Error in PARDISO memory allocation: FACT_ADR, size to allocate: -1404534784 bytes
total memory wanted here: -1364687 kbyte
symbolic (max): -1364687 symbolic (permanent): 0
real(including 1 factor): 0
Peak Mem needed... 0
ERROR during symbolic factorization: -2
[sudha.rangan.ctr@slot04 pardiso]$ vi pardiso_sym_c.c
I can send or post more information (output of successful runs, code, etc.).
Thanks, any help is appreciated. We do have an older purchased version (2.x) of MKL, but I am currently using an evaluation copy of 10.3.0-beta.
Sudha Rangan
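One observation on the negative size in the log above, offered as an interpretation rather than something confirmed in the thread: 361304064 nonzeros at 8 bytes each is 2890432512 bytes, and that number reduced to a signed 32-bit integer is exactly the -1404534784 printed by PARDISO. The C sketch below reproduces the arithmetic; `value_bytes` and `wrap_to_int32` are hypothetical helpers, and the assumption that a byte count was held in a 32-bit integer is mine.

```c
#include <assert.h>
#include <stdint.h>

/* Byte count of nnz double-precision values, computed in 64 bits. */
int64_t value_bytes(int64_t nnz)
{
    assert(nnz >= 0 && nnz <= INT64_MAX / 8);
    return nnz * 8;
}

/* Reinterpret the low 32 bits of a 64-bit value as a signed 32-bit integer.
 * Written with explicit arithmetic so the result does not depend on
 * implementation-defined narrowing conversions. */
int32_t wrap_to_int32(int64_t value)
{
    uint32_t low = (uint32_t)value;            /* reduces modulo 2^32 */
    if (low <= (uint32_t)INT32_MAX)
        return (int32_t)low;
    /* low lies in [2^31, 2^32): shift it into the negative range. */
    return (int32_t)((int64_t)low - 4294967296LL);
}
```

Calling `wrap_to_int32(value_bytes(361304064))` yields -1404534784, which is the "size to allocate" value in the error message, so the count of nonzeros can be well below the int limit while a byte count derived from it is not.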
I also tried running with an even larger problem size, and I keep getting the -180 error during reordering. I have looked at other threads and saw that this may be caused by linking to incorrect libraries, but I have tried every possible combination and still get this error. I tried both 10.2.6.038 (where I get the -800 insufficient memory error) and 10.3.0 beta. Would anyone have an idea what this might be? Could it be related to the large sizes? Can PARDISO really not handle them? It did handle half these sizes (in numRows and NNZs). Again, my input matrix is memory-mapped, I am using OOC (param[59], C-style indexing, set to 2), and I have 63+GB of memory available. My matrix occupies only about 46G, and since it is memory-mapped, I would expect PARDISO to have enough memory to hold its own copy of the matrix, given that the factors are stored out of core.
You entered matrix4096bf
Nonzero elements: 2890432512 Size (number of equations): 7077888
first value = -4422846.000000
first ia index = 1, 2nd ia index = 217
first ja index = 1
first rhs = -0.121058
a0: -4.422846e+06 a_end: -9.250865e+03
ia 0: 1 ai end: 2890432297
ja 0: 1 ja end: 7077888
b 0: -0.121058 b end: -0.004208
ooc_max_core_size got by Env = 256000
The file ./pardiso_ooc.cfg was not opened
*** Error in PARDISO ( reordering_phase) error_num= -180
*** error PARDISO: reordering, symb. factorization
================ PARDISO: solving a real struct. sym. system ================
Summary PARDISO: ( reorder to reorder )
================
Times:
======
Time fulladj: 283.975595 s
Time reorder: 5.503694 s
Time symbfct: 23.955411 s
Time malloc : 269.677817 s
Time total : 602.334729 s total - sum: 19.222212 s
Statistics:
===========
< Parallel Direct Factorization with #processors: > 8
< Numerical Factorization with BLAS3 and O(n) synchronization >
< Linear system Ax = b>
#equations: 7077888
#non-zeros in A: 2890432511
non-zeros in A (%): 0.005770
#right-hand sides: 1
< Factors L and U >
< Preprocessing with multiple minimum degree, tree height >
< Reduction for efficient parallel factorization >
#columns for each panel: 72
#independent subgraphs: 0
#supernodes: 125067
size of largest supernode: 810
number of nonzeros in L 5062328928
number of nonzeros in U 4636596168
number of nonzeros in L+U 9698925096
ERROR during symbolic factorization: -3
It looks like this really might be related to the large number of non-zeros, even though I'm using the 64-bit interface?
If an Intel engineer is reading this, I would love to get a response.
Thanks,
Sudha
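For reference, the counts in this run can be compared against the signed 32-bit range. The C sketch below is only an illustration with a hypothetical helper, `required_index_bits`; it says nothing about how PARDISO or METIS size their internal arrays.

```c
#include <assert.h>
#include <stdint.h>

/* Returns 32 if both the number of equations and the number of nonzeros
 * fit in a signed 32-bit integer, otherwise 64.
 * Hypothetical helper for illustration only. */
int required_index_bits(int64_t n_equations, int64_t nnz)
{
    assert(n_equations >= 0 && nnz >= 0);
    return (n_equations <= INT32_MAX && nnz <= INT32_MAX) ? 32 : 64;
}
```

For this matrix (7077888 equations, 2890432512 nonzeros) it returns 64, while the earlier matrix512b case (884736 equations, 361304064 nonzeros) still fits in 32 bits.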
This failure is expected in 10.3.0 beta, because the ILP64 version of METIS is available only starting with 10.3.0 Gold and 10.2.6. Could you provide us with the failure log from MKL 10.2.6? How much swap space does your system have? METIS uses additional memory to reorder the input matrix, so there may not be enough memory for it.
