Intel® oneAPI Math Kernel Library
Ask questions and share information with other developers who use Intel® Math Kernel Library.

Pardiso problem with large equation systems

Meysam_J_
Beginner
536 Views

Hi guys,

I am facing a problem with PARDISO and would really appreciate a hint.

The system I am trying to solve has 1,016,451 equations and 569,392,326 nonzero elements. My PARDISO settings are as follows:

    iparm[0] = 1;   /* No solver default */
    iparm[1] = 0;   /* Minimum degree ordering */
    iparm[3] = 0;   /* No iterative-direct algorithm */
    iparm[4] = 0;   /* No user fill-in reducing permutation */
    iparm[5] = 0;   /* Write solution into x */
    iparm[7] = 1;   /* Max numbers of iterative refinement steps */
    iparm[9] = 8;   /* Perturb the pivot elements with 1E-8 */
    iparm[10] = 0;  /* Nonsymmetric permutation and scaling MPS switched off */
    iparm[12] = 0;  /* Maximum weighted matching algorithm is switched-off (default for symmetric). Try iparm[12] = 1 in case of inappropriate accuracy */
    iparm[13] = 0;  /* Output: Number of perturbed pivots */
    iparm[17] = -1; /* Output: Number of nonzeros in the factor LU */
    iparm[18] = 0;  /* Disable output: Mflops for LU factorization */
    iparm[19] = 0;  /* Output: Numbers of CG Iterations */
    iparm[26] = 1;  /* PARDISO checks integer arrays ia and ja. */
    iparm[34] = 1;  /* C-style indexing, starts from 0. */

    mtype = 2;      /* Real and symmetric positive definite */
    maxfct = 1;     /* Maximum number of numerical factorizations. */
    mnum = 1;       /* Which factorization to use. */
    msglvl = 1;     /* Print statistical information */

    nrhs = 1;

After reordering, I get the following error:

Percentage of computed non-zeros for LL^T factorization
*** error PARDISO: iterative refinement
 contraction rate is greater than 0.9, interrupt

During factorization, PARDISO returns error = -4 and stops.
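For readers following along, the error value can be translated with a small lookup. This is only a sketch: the helper name `pardiso_error_text` is made up for illustration and is not part of MKL, and the message wording is paraphrased from the error list in the oneMKL PARDISO documentation, so please check it against the version you have installed.

```c
#include <stddef.h>

/* Hypothetical helper (not an MKL function): maps the PARDISO "error"
 * output value to a short description. The texts are paraphrased from
 * the oneMKL PARDISO documentation. */
const char *pardiso_error_text(int error)
{
    switch (error) {
    case 0:   return "no error";
    case -1:  return "input inconsistent";
    case -2:  return "not enough memory";
    case -3:  return "reordering problem";
    case -4:  return "zero pivot, numerical factorization or iterative refinement problem";
    case -5:  return "unclassified (internal) error";
    case -6:  return "reordering failed (matrix types 11 and 13 only)";
    case -7:  return "diagonal matrix is singular";
    case -8:  return "32-bit integer overflow problem";
    case -9:  return "not enough memory for out-of-core mode";
    case -10: return "error opening out-of-core files";
    case -11: return "read/write error with out-of-core files";
    case -12: return "pardiso_64 called from 32-bit library";
    default:  return "unknown error code";
    }
}
```

Calling `pardiso_error_text(error)` right after each `pardiso` call and printing the result makes it easier to see which phase failed and why.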

I have tried running the problem both with the default settings and with iparm[1] set to 2 and 3, but neither solved the problem. Has anyone experienced something like this?

Additional information:

=== PARDISO: solving a symmetric positive definite system ===
The local (internal) PARDISO version is                          : 103911000
0-based array is turned ON
PARDISO double precision computation is turned ON

< Parallel Direct Factorization with number of processors: > 42
< Numerical Factorization with BLAS3 and O(n) synchronization >

=== PARDISO: solving a symmetric positive definite system ===
Single-level factorization algorithm is turned ON




0 Kudos
5 Replies
Gennady_F_Intel
Moderator

It may be a problem with PARDISO. Are you linking the LP64 or the ILP64 libraries?

0 Kudos
Gennady_F_Intel
Moderator

Would you please give us all the iparm[] values and all the output you received with msglvl == 1?

0 Kudos
Meysam_J_
Beginner

Dear Gennady,

Thanks for the quick reply. The complete output I get with msglvl = 1 is as follows.

=== PARDISO: solving a symmetric positive definite system ===
The local (internal) PARDISO version is                          : 103911000
0-based array is turned ON
PARDISO double precision computation is turned ON
Minimum degree algorithm at reorder step is turned ON


Summary: ( reordering phase )
================

Times:
======
Time spent in calculations of symmetric matrix portrait (fulladj): 21.151494 s
Time spent in reordering of the initial matrix (reorder)         : 13.216936 s
Time spent in symbolic factorization (symbfct)                   : 36.151713 s
Time spent in data preparations for factorization (parlist)      : 0.402538 s
Time spent in allocation of internal data structures (malloc)    : 3.019229 s
Time spent in additional calculations                            : 93.399159 s
Total time spent                                                 : 167.341069 s

Statistics:
===========
< Parallel Direct Factorization with number of processors: > 42
< Numerical Factorization with BLAS3 and O(n) synchronization >

< Linear system Ax = b >
             number of equations:           1016451
             number of non-zeros in A:      569392326
             number of non-zeros in A (%): 0.055111

             number of right-hand sides:    1

< Factors L and U >
< Preprocessing with multiple minimum degree, tree height >
< Reduction for efficient parallel factorization >
             number of columns for each panel: 64
             number of independent subgraphs:  0
             number of supernodes:                    20561
             size of largest supernode:               50862
             number of non-zeros in L:                8648670096
             number of non-zeros in U:                1
             number of non-zeros in L+U:              8648670097
=== PARDISO is running in In-Core mode, because iparam(60)=0 ===
Percentage of computed non-zeros for LL^T factorization
*** error PARDISO: iterative refinement
 contraction rate is greater than 0.9, interrupt

=== PARDISO: solving a symmetric positive definite system ===
Single-level factorization algorithm is turned ON


Summary: ( factorization phase )
================

Times:
======
Time spent in copying matrix to internal data structure (A to LU): 0.000000 s
Time spent in factorization step (numfct)                        : 0.000000 s
Time spent in allocation of internal data structures (malloc)    : 0.000561 s
Time spent in additional calculations                            : 0.436247 s
Total time spent                                                 : 0.436808 s

Statistics:
===========
< Parallel Direct Factorization with number of processors: > 42
< Numerical Factorization with BLAS3 and O(n) synchronization >

< Linear system Ax = b >
             number of equations:           1016451
             number of non-zeros in A:      569392326
             number of non-zeros in A (%): 0.055111

             number of right-hand sides:    1

< Factors L and U >
< Preprocessing with multiple minimum degree, tree height >
< Reduction for efficient parallel factorization >
             number of columns for each panel: 64
             number of independent subgraphs:  0
             number of supernodes:                    20561
             size of largest supernode:               50862
             number of non-zeros in L:                8648670096
             number of non-zeros in U:                1
             number of non-zeros in L+U:              8648670097
             gflop   for the numerical factorization: 248732.491817

Then PARDISO stops and returns error = -4.

The iparm settings are as follows:

    for (i = 0; i < 64; i++)
    {
        iparm[i] = 0;
    }

    iparm[0] = 1;   /* No solver default */
    iparm[1] = 0;   /* Minimum degree ordering */
    iparm[3] = 0;   /* No iterative-direct algorithm */
    iparm[4] = 0;   /* No user fill-in reducing permutation */
    iparm[5] = 0;   /* Write solution into x */
    iparm[7] = 1;   /* Max numbers of iterative refinement steps */
    iparm[9] = 8;   /* Perturb the pivot elements with 1E-8 */
    iparm[10] = 0;  /* Nonsymmetric permutation and scaling MPS switched off */
    iparm[12] = 0;  /* Maximum weighted matching algorithm is switched-off (default for symmetric). Try iparm[12] = 1 in case of inappropriate accuracy */
    iparm[13] = 0;  /* Output: Number of perturbed pivots */
    iparm[17] = -1; /* Output: Number of nonzeros in the factor LU */
    iparm[18] = 0;  /* Disable output: Mflops for LU factorization */
    iparm[19] = 0;  /* Output: Numbers of CG Iterations */
    iparm[26] = 1;  /* PARDISO checks integer arrays ia and ja. */
    iparm[34] = 1;  /* C-style indexing, starts from 0. */

    mtype = 2;      /* Real and symmetric positive definite */
    maxfct = 1;     /* Maximum number of numerical factorizations. */
    mnum = 1;       /* Which factorization to use. */
    msglvl = 0;     /* No statistical output; set to 1 to print it */

    nrhs = 1;

 

Regarding ILP64 versus LP64, I am not sure which one I am using. I have a 64-bit system, and in my code (which is written in C) I have declared the integers as int, not MKL_INT. During compilation I use the following in the makefile:

mkl = on

ifeq ($(mkl),on)
     LIB_MKL =   -L$MKLPATH -I$MKLINCLUDE -lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core -liomp5 -lpthread
else

MKLPATH=/opt/intel/mkl/lib/intel64/

MKLINCLUDE=/opt/intel/mkl/include/

I guess this must be ILP64, isn't it?
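A side note on the question above: the link line names `-lmkl_intel_lp64`, which is the LP64 interface layer, where `MKL_INT` is a 32-bit integer, so plain `int` arguments match it. The sketch below only does the arithmetic on the counts reported in the log; the helper names `exceeds_int32` and `doubles_to_gib` are made up for illustration. It shows that the nonzero count of A fits in 32 bits while the reported nonzero count of L does not, and it estimates the storage for the values of L. It does not establish that integer width is the cause of the -4 error.

```c
#include <stdint.h>

/* Returns 1 if the count cannot be stored in a 32-bit signed integer
 * (the width of MKL_INT in the LP64 interface), 0 otherwise. */
int exceeds_int32(int64_t count)
{
    return count > INT32_MAX;
}

/* Storage in GiB needed for `count` double-precision values. */
double doubles_to_gib(int64_t count)
{
    return (double)count * (double)sizeof(double)
           / (1024.0 * 1024.0 * 1024.0);
}

/* Counts copied from the msglvl = 1 output in this thread. */
static const int64_t NNZ_A = 569392326LL;   /* non-zeros in A */
static const int64_t NNZ_L = 8648670096LL;  /* non-zeros in L */
```

With these values, `exceeds_int32(NNZ_A)` is 0 and `exceeds_int32(NNZ_L)` is 1, and `doubles_to_gib(NNZ_L)` comes out at roughly 64 GiB for the factor values alone, so it is worth confirming that the machine has enough memory for in-core mode.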

 

 

 

0 Kudos
Gennady_F_Intel
Moderator

Hello Meysam,

It seems to be a bug in PARDISO related to pivoting. In some cases setting mtype = -2 helps. Please check whether it helps in your case.

Regards, Gennady
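The suggestion above (retry with mtype = -2, real symmetric indefinite, when the positive definite path fails) could be wired up roughly as follows. This is a sketch under assumptions: `solve_fn` is a hypothetical callback that is expected to run a complete PARDISO cycle (release, analysis, factorization, solve) for the given matrix type and return the PARDISO error value, since as far as I know the analysis has to be redone when mtype changes. `demo_solver` is only a stand-in that imitates the behaviour reported in this thread; it does not call MKL.

```c
#include <stddef.h>

/* Hypothetical callback: runs all solver phases for the given mtype
 * and returns the PARDISO error value (0 on success). */
typedef int (*solve_fn)(int mtype, void *ctx);

/* Try mtype = 2 (SPD) first; if that fails with error -4, retry once
 * with mtype = -2 (symmetric indefinite). The matrix type actually
 * used is written to *mtype_used when the pointer is not NULL. */
int solve_with_fallback(solve_fn solve, void *ctx, int *mtype_used)
{
    int mtype = 2;
    int error = solve(mtype, ctx);

    if (error == -4) {
        mtype = -2;
        error = solve(mtype, ctx);
    }
    if (mtype_used != NULL) {
        *mtype_used = mtype;
    }
    return error;
}

/* Stand-in solver for illustration only: fails with -4 for mtype 2
 * and succeeds for any other matrix type. */
int demo_solver(int mtype, void *ctx)
{
    (void)ctx;
    return (mtype == 2) ? -4 : 0;
}
```

In a real program the callback would wrap the actual `pardiso` calls; the fallback is only worth keeping if the indefinite factorization gives an acceptable residual for your matrix.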

 

 

 
