Intel® oneAPI Math Kernel Library
Ask questions and share information with other developers who use Intel® Math Kernel Library.

## Problem with MKL PARDISO performance

Beginner
514 Views

Hello,

I'm using PARDISO to solve a real nonsymmetric problem with a very large matrix
(n = 40 000), about 3% non-zeros (45 986 096), and 1 right-hand side.

I'm quite disappointed with the computation time: on my computer (8 CPUs, 16 432 032 KB total memory, 1595 MHz), it takes 35 min. Does that seem normal to you?
I haven't found any comparison of performance as a function of matrix size...

Maybe my iparm settings are not optimized for my problem:
iparm[0] = 1; /* No solver default */
iparm[1] = 2; /* Fill-in reordering from METIS */
iparm[2] = 8;
iparm[3] = 31; /* CGS */
iparm[4] = 0; /* No user fill-in reducing permutation */
iparm[5] = 0; /* Write solution into x */
iparm[6] = 0; /* Not in use */
iparm[7] = 0; /* Max numbers of iterative refinement steps */
iparm[8] = 0; /* Not in use */
iparm[9] = 13; /* Perturb the pivot elements with 1E-13 */
iparm[10] = 1; /* Use nonsymmetric permutation and scaling MPS */
iparm[11] = 0; /* Not in use */
iparm[12] = 0; /* Not in use */
iparm[13] = 0; /* Output: Number of perturbed pivots */
iparm[14] = 0; /* Not in use */
iparm[15] = 0; /* Not in use */
iparm[16] = 0; /* Not in use */
iparm[17] = -1; /* Output: Number of nonzeros in the factor LU */
iparm[18] = -1; /* Output: Mflops for LU factorization */
iparm[19] = 0; /* Output: Numbers of CG Iterations */
iparm[31] = 1; /* iterative solver*/
iparm[60] = 2; /* Out-of-Core resolution */
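
One detail worth double-checking in a list like the one above is index numbering. As far as I recall, the MKL reference numbers the iparm entries from 1 (Fortran style), while the C array above is indexed from 0, so documented entry k lives at iparm[k - 1]. The sketch below uses hypothetical helpers, `iparm_c_index` and `set_iparm`, which are not part of MKL, to make the shift explicit. If the out-of-core switch is documented as entry 60, as I believe it is, the C index would be 59 rather than 60; please verify this against the reference manual of the installed MKL version.

```c
#include <stddef.h>

/* Hypothetical helper, not an MKL function: convert a one-based iparm
   entry number (Fortran-style documentation) to the zero-based index
   used by a C array. */
static size_t iparm_c_index(size_t documented_entry)
{
    return documented_entry - 1;
}

/* Hypothetical helper: set a documented (one-based) iparm entry
   in a zero-based C array. */
static void set_iparm(int *iparm, size_t documented_entry, int value)
{
    iparm[iparm_c_index(documented_entry)] = value;
}
```

With these helpers, `set_iparm(iparm, 60, 2)` writes to `iparm[59]` and leaves `iparm[60]` untouched.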

Yannick

P.S.: I don't know where to find the PARDISO version (it is not in mkl_pardiso.h).

7 Replies
Moderator
Yannick, your case is pretty big and you are working in Out-Of-Core mode, which is non-threaded, so this time actually looks reasonable for that case. What is the size of the RAM on your system, and what CPU type do you use? Regarding the PARDISO version: in the latest version of MKL, PARDISO prints the version if msglvl == 1, or you can find the MKL version in ..\Documentation\mklsupport.txt. --Gennady
Beginner
Thanks for your reply. The CPUs are: dual-processor, quad-core Intel Xeon E5335, 2x4MB cache, 2.0GHz, 1333MHz; the size of my RAM is 16 432 032 KB; and the package ID of MKL is l_mkl_p_10.0.011. I turned off Out-Of-Core mode, but the performance is the same.
Moderator
OK, thanks. I see you are working on a modest CPU, but the MKL version is pretty old. We have made many improvements since 10.0, especially in OOC mode. Can you evaluate the latest version, 11.0? It's free for 30 days.
Employee
Hi Antonie, Could you provide the PARDISO output with msglvl=1? It could help us understand the reason for the poor performance of your test case. In your iparm I see only one point that looks strange to me: what is the reason for setting iparm[3]=31? With best regards, Alexander Kalinkin
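
For context on the value being questioned: as I understand the PARDISO documentation, this entry is encoded as 10*L + K, where K selects the Krylov method (1 for CGS, 2 for CG) and L sets the stopping tolerance to 10^-L. The sketch below decodes a value under that assumption. The struct and function names are made up for illustration, and the encoding should be checked against the manual for the MKL version in use.

```c
/* Illustrative type, not part of MKL. */
typedef struct {
    int krylov_method;  /* K: assumed 1 = CGS, 2 = CG */
    int tolerance_exp;  /* L: stopping tolerance is assumed to be 10^-L */
    double tolerance;   /* 10^-L as a floating-point value */
} Iparm3Setting;

/* Decode a value of the form 10*L + K (assumed encoding). */
static Iparm3Setting decode_iparm3(int value)
{
    Iparm3Setting s;
    int i;

    s.krylov_method = value % 10;
    s.tolerance_exp = value / 10;

    /* Compute 10^-L by repeated division to avoid linking libm. */
    s.tolerance = 1.0;
    for (i = 0; i < s.tolerance_exp; ++i) {
        s.tolerance /= 10.0;
    }
    return s;
}
```

Under this reading, `decode_iparm3(31)` gives K = 1 and L = 3, i.e. CGS with a stopping tolerance of about 1e-3.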
Beginner
iparm[31] is just a try; I tried iparm[3]=31 and iparm[3]=0. It doesn't have much influence on performance.

    ================ PARDISO: solving a real nonsymmetric system ================
    Summary PARDISO: ( reorder to reorder )
    ================
    Times:
    ======
    Time fulladj: 2.646652 s
    Time reorder: 14.776046 s
    Time symbfct: 5.390644 s
    Time parlist: 0.193419 s
    Time malloc : -0.334824 s
    Time total : 37.119933 s total - sum: 14.447996 s
    Statistics:
    ===========
    < Parallel Direct Factorization with #processors: > 8
    < Hybrid Solver PARDISO with CGS/CG Iteration >
    < Linear system Ax = b>
    #equations: 40000
    #non-zeros in A: 45986096
    non-zeros in A (%): 2.874131
    #right-hand sides: 1
    < Factors L and U >
    #columns for each panel: 72
    #independent subgraphs: 0
    < Preprocessing with state of the art partitioning metis>
    #supernodes: 3069
    size of largest supernode: 19757
    number of nonzeros in L 462436285
    number of nonzeros in U 460178265
    number of nonzeros in L+U 922614550
    Reordering completed ...
    Number of nonzeros in factors = 922614550
    Number of factorization MFLOPS = 13695630

    ================ PARDISO: solving a real nonsymmetric system ================
    Summary PARDISO: ( factorize to factorize )
    ================
    Times:
    ======
    Time A to LU: 0.000000 s
    Time numfct : 2071.496263 s
    Time malloc : -0.000183 s
    Time total : 2071.496571 s total - sum: 0.000491 s
    Statistics:
    ===========
    < Parallel Direct Factorization with #processors: > 8
    < Hybrid Solver PARDISO with CGS/CG Iteration >
    < Linear system Ax = b>
    #equations: 40000
    #non-zeros in A: 45986096
    non-zeros in A (%): 2.874131
    #right-hand sides: 1
    < Factors L and U >
    #columns for each panel: 72
    #independent subgraphs: 0
    < Preprocessing with state of the art partitioning metis>
    #supernodes: 3069
    size of largest supernode: 19757
    number of nonzeros in L 462436285
    number of nonzeros in U 460178265
    number of nonzeros in L+U 922614550
    gflop for the numerical factorization: 13695.630000
    gflop/s for the numerical factorization: 6.611467
    Factorization completed ...

    ================ PARDISO: solving a real nonsymmetric system ================
    Summary PARDISO: ( solve to solve )
    ================
    Times:
    ======
    Time cgs : 6.895049 s cgx iterations 1
    Time malloc : -0.000012 s
    Time total : 6.895151 s total - sum: 0.000113 s
    Statistics:
    ===========
    < Parallel Direct Factorization with #processors: > 8
    < Hybrid Solver PARDISO with CGS/CG Iteration >
    < Linear system Ax = b>
    #equations: 40000
    #non-zeros in A: 45986096
    non-zeros in A (%): 2.874131
    #right-hand sides: 1
    < Factors L and U >
    #columns for each panel: 72
    #independent subgraphs: 0
    < Preprocessing with state of the art partitioning metis>
    #supernodes: 3069
    size of largest supernode: 19757
    number of nonzeros in L 462436285
    number of nonzeros in U 460178265
    number of nonzeros in L+U 922614550
    gflop for the numerical factorization: 13695.630000
    gflop/s for the numerical factorization: 6.611467
    Solve completed ...
    ===========
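
The reported factorization rate can be checked by hand: 13695.63 GFLOP in 2071.496263 s is about 6.61 GFLOP/s. The sketch below reproduces that division and compares it with a rough peak estimate. The peak figure assumes 4 double-precision floating-point operations per cycle per core for this Xeon generation, which is my assumption and not something stated in the thread; the function names are illustrative.

```c
/* Sustained rate: total work (GFLOP) divided by elapsed time (s). */
static double sustained_gflops(double gflop, double seconds)
{
    return gflop / seconds;
}

/* Rough theoretical peak in GFLOP/s.
   flops_per_cycle is an assumption about the CPU, not a measured value. */
static double peak_gflops(int cores, double clock_ghz, double flops_per_cycle)
{
    return (double)cores * clock_ghz * flops_per_cycle;
}

/* Fraction of the assumed peak that was actually sustained. */
static double peak_fraction(double sustained, double peak)
{
    return sustained / peak;
}
```

With the numbers from the log and the hardware description (8 cores at 2.0 GHz), the assumed peak is 64 GFLOP/s, so the factorization would be running at roughly 10% of it.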
Employee
Hi Antonie, Your output confused me a bit: the factorized matrix is almost dense! Could you send this matrix to me (for example, in a private thread) so we can understand how this situation arose? With best regards, Alexander Kalinkin
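
The "almost dense" remark can be quantified from the log: a full 40000 x 40000 matrix has 1.6e9 entries, and L+U holds 922614550 of them, i.e. about 58%. At 8 bytes per double-precision value, that is roughly 7.4 GB for the factor values alone (index arrays excluded), against about 16 GB of RAM. A small sketch of that arithmetic, with illustrative function names and assuming 8-byte doubles:

```c
/* Fraction of a full n x n matrix occupied by the factor entries. */
static double fill_fraction(long long nnz_factors, long long n)
{
    return (double)nnz_factors / ((double)n * (double)n);
}

/* Bytes needed to store the factor values only (no index arrays),
   assuming one double per stored entry. */
static double factor_value_bytes(long long nnz_factors)
{
    return (double)nnz_factors * (double)sizeof(double);
}
```

For comparison, the input matrix A is under 3% dense, so the factorization increases the number of stored entries by a factor of about 20.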
Beginner
The file is 790 MB... How can I send it to you?