Kostas_S_
Beginner
113 Views

PARDISO consistent crash for INCORE RUN

Hi, I am trying to solve several large 3D solid FE models with PARDISO from MKL 11.2.

Although the out-of-core run is successful, I consistently get segmentation faults in the in-core runs.

This also happens with pardiso_64, and with cpardiso when only one MPI process is used.

With more than one MPI process the run is successful.

The error is reproducible and occurs for almost all big models which I have tried.

Thanks

Kostas

11 Replies
Gennady_F_Intel
Moderator

Hi, have you checked the error code returned by PARDISO? Maybe it is -2 (not enough memory)?
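
As an illustration of the check suggested here, the sketch below maps the `error` value returned by PARDISO to a readable message. It is written in Python only for brevity; the same table applies in C or Fortran. The helper names `describe_pardiso_error` and `check_phase` are hypothetical, and the code list is reproduced from the Intel MKL PARDISO documentation as best recalled, so it should be verified against the reference manual for the installed MKL version.

```python
# Error codes as listed in the Intel MKL PARDISO documentation
# (assumption: reproduced from memory; verify against your MKL version).
PARDISO_ERRORS = {
    0: "no error",
    -1: "input inconsistent",
    -2: "not enough memory",
    -3: "reordering problem",
    -4: "zero pivot, numerical factorization or iterative refinement problem",
    -5: "unclassified (internal) error",
    -6: "reordering failed (matrix types 11 and 13 only)",
    -7: "diagonal matrix is singular",
    -8: "32-bit integer overflow problem",
    -9: "not enough memory for out-of-core mode",
    -10: "error opening out-of-core files",
    -11: "read/write error with out-of-core files",
}


def describe_pardiso_error(error: int) -> str:
    """Return a readable message for the `error` value returned by PARDISO."""
    return PARDISO_ERRORS.get(error, f"unknown error code {error}")


def check_phase(phase: int, error: int) -> None:
    """Raise if a PARDISO phase (11, 22, 33, ...) reported a nonzero error."""
    if error != 0:
        message = describe_pardiso_error(error)
        raise RuntimeError(f"PARDISO phase {phase} failed: {message} ({error})")
```

Calling `check_phase` after every phase makes it clear whether the solver returned an error at all or terminated before it could report one.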

Kostas_S_
Beginner

Unfortunately, no error is returned from either the symbolic phase or the factorization phase. It just crashes inside the factorization phase at about 1 or 2%. Sometimes the crash is followed by a message such as "corrupted double-linked list" or a double-free error from malloc. Of course the required memory is huge (60 or 70 GB), but the machine has plenty.

Gennady_F_Intel
Moderator

Kostas, could you set msglvl = 1 and send us the statistical information that PARDISO prints?

 

Kostas_S_
Beginner

Hi. Please find below reported statistics for two crashing models.

3D solid model (knuckle) 600K 10-node TETRA

=== PARDISO: solving a symmetric indefinite system ===
1-based array indexing is turned ON
PARDISO double precision computation is turned ON
METIS algorithm at reorder step is turned ON
Scaling is turned ON
Matching is turned ON


Summary: ( reordering phase )
================

Times:
======
Time spent in calculations of symmetric matrix portrait (fulladj): 2.234720 s
Time spent in reordering of the initial matrix (reorder)         : 0.014349 s
Time spent in symbolic factorization (symbfct)                   : 5.550181 s
Time spent in data preparations for factorization (parlist)      : 0.195583 s
Time spent in allocation of internal data structures (malloc)    : 40.220556 s
Time spent in additional calculations                            : 7.978722 s
Total time spent                                                 : 56.194111 s

Statistics:
===========
Parallel Direct Factorization is running on 8 OpenMP

< Linear system Ax = b >
             number of equations:           2798127
             number of non-zeros in A:      117286720
             number of non-zeros in A (%): 0.001498

             number of right-hand sides:    1

< Factors L and U >
             number of columns for each panel: 96
             number of independent subgraphs:  0
< Preprocessing with state of the art partitioning metis>
             number of supernodes:                    330907
             size of largest supernode:               26616
             number of non-zeros in L:                6990935225
             number of non-zeros in U:                1
             number of non-zeros in L+U:              6990935226
 
 *** INFORMATION # 3438
 PARDISO solver requires 65123 MB for selected incore execution.
=== PARDISO is running in In-Core mode, because iparam(60)=0 ===
Percentage of computed non-zeros for LL^T factorization
 0 Signal 11 :: SIGSEGV

mixed solid-shell model (carbody) 1.7M 8-node QUADS, 300K 10-node TETRA

=== PARDISO: solving a symmetric indefinite system ===
1-based array indexing is turned ON
PARDISO double precision computation is turned ON
METIS algorithm at reorder step is turned ON
Scaling is turned ON
Matching is turned ON


Summary: ( reordering phase )
================

Times:
======
Time spent in calculations of symmetric matrix portrait (fulladj): 2.548694 s
Time spent in reordering of the initial matrix (reorder)         : 0.030692 s
Time spent in symbolic factorization (symbfct)                   : 9.595601 s
Time spent in data preparations for factorization (parlist)      : 0.240214 s
Time spent in allocation of internal data structures (malloc)    : 74.102451 s
Time spent in additional calculations                            : 15.991493 s
Total time spent                                                 : 102.509145 s

Statistics:
===========
Parallel Direct Factorization is running on 8 OpenMP

< Linear system Ax = b >
             number of equations:           11495787
             number of non-zeros in A:      311214695
             number of non-zeros in A (%): 0.000235

             number of right-hand sides:    1

< Factors L and U >
             number of columns for each panel: 96
             number of independent subgraphs:  0
< Preprocessing with state of the art partitioning metis>
             number of supernodes:                    1278886
             size of largest supernode:               10939
             number of non-zeros in L:                3733796105
             number of non-zeros in U:                1
             number of non-zeros in L+U:              3733796106
 
 *** INFORMATION # 3438
 PARDISO solver requires 47351 MB for selected incore execution.
=== PARDISO is running in In-Core mode, because iparam(60)=0 ===
Percentage of computed non-zeros for LL^T factorization
 0  1 Signal 11 :: SIGSEGV
Signal 11 :: SIGSEGV
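
As a rough cross-check of the two logs above, the following sketch estimates the storage needed for the factor values alone (one 8-byte double per stored non-zero) and compares it with the memory figure PARDISO reports. It also tests whether the non-zero count still fits in a 32-bit signed integer, which is one reason a 64-bit interface such as pardiso_64 can matter for models of this size. The function name `factor_storage_mib` is hypothetical, and the estimate ignores index arrays and workspace, so it is only a lower bound, not a statement about what causes the crash.

```python
INT32_MAX = 2**31 - 1   # largest value a 32-bit signed integer can hold
BYTES_PER_DOUBLE = 8    # the logs report double precision computation


def factor_storage_mib(nnz_factors: int) -> float:
    """Lower bound on factor storage: one double per stored non-zero, in MiB."""
    return nnz_factors * BYTES_PER_DOUBLE / 2**20


# (label, non-zeros in L+U, MB reported by PARDISO), copied from the logs above
models = [
    ("knuckle", 6_990_935_226, 65_123),
    ("carbody", 3_733_796_106, 47_351),
]

for name, nnz, reported_mb in models:
    estimate = factor_storage_mib(nnz)
    print(
        f"{name}: factor values alone ~{estimate:,.0f} MiB, "
        f"PARDISO reports {reported_mb:,} MB, "
        f"nnz exceeds 32-bit range: {nnz > INT32_MAX}"
    )
```

For both models the factor values account for a large part of the reported requirement, and both non-zero counts are above the 32-bit signed integer limit.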

 

Gennady_F_Intel
Moderator

Ok, thanks.

Is that Linux or Windows OS?

What MKL version are you using?

What is the available size of RAM on your system? 

 

Kostas_S_
Beginner

Hi

This is Linux OS. The MKL version is 11.2, but I am not sure which update because I wasn't the one who installed it.

The install directory is /opt/intel/composer_xe_2015.2.164

The system has 192 GB of RAM, but probably only about 130 GB was free at the time of the runs.

Thanks

Gennady_F_Intel
Moderator

OK, we will try to reproduce the behavior on our side and will get back to you soon.

Gennady_F_Intel
Moderator

Hello, 

We checked how this type of task works on our side. We did not see the problem with the in-core version while solving a symmetric indefinite system with 8*10^6 equations. This case requires ~120 GB (15*10^9 non-zeros in L+U). The task finished successfully; see the log below. The MKL version we used is 11.3 beta (the latest version, which we are working on). For your cases, we need your matrix and a reproducer for the problem. You can send us all of this via a private thread.

--Gennady

====================== below is the statistical info from our side ==============

=== PARDISO: solving a symmetric indefinite system ===
1-based array indexing is turned ON
PARDISO double precision computation is turned ON
METIS algorithm at reorder step is turned ON


Summary: ( reordering phase )
================

Times:
======
Time spent in calculations of symmetric matrix portrait (fulladj): 0.389905 s
Time spent in reordering of the initial matrix (reorder)         : 84.442869 s
Time spent in symbolic factorization (symbfct)                   : 51.253236 s
Time spent in data preparations for factorization (parlist)      : 1.419931 s
Time spent in allocation of internal data structures (malloc)    : 3.260579 s
Time spent in additional calculations                            : 4.545302 s
Total time spent                                                 : 145.311822 s

Statistics:
===========
Parallel Direct Factorization is running on 40 OpenMP

< Linear system Ax = b >
             number of equations:           8000000
             number of non-zeros in A:      31880000
             number of non-zeros in A (%): 0.000050

             number of right-hand sides:    1

< Factors L and U >
             number of columns for each panel: 96
             number of independent subgraphs:  0
< Preprocessing with state of the art partitioning metis>
             number of supernodes:                    5219408
             size of largest supernode:               40119
             number of non-zeros in L:                14765464233
             number of non-zeros in U:                1
             number of non-zeros in L+U:              14765464234
 time(reorder)=   145.405552864075     
=== PARDISO is running in In-Core mode, because iparam(60)=0 ===
Percentage of computed non-zeros for LL^T factorization
 0  1  2  3  4  5  6  7  8  9  10  11  12  13  14  15  16  17  18  19  20  21  22  23  24  25  26  27  28  29  30  31  32  33  34  35  36  37  38  39  40  41  42  44  45  46  47  48  49  50  51  52  53  54  55  56  58  59  60  61  62  63  64  65  66  67  68  70  71  72  73  75  76  77  78  79  80  81  83  84  86  87  88  89  90  92  93  94  95  96  97  98  99  100 

=== PARDISO: solving a symmetric indefinite system ===
Single-level factorization algorithm is turned ON


Summary: ( factorization phase )
================

Times:
======
Time spent in copying matrix to internal data structure (A to LU): 0.000000 s
Time spent in factorization step (numfct)                        : 2579.202247 s
Time spent in allocation of internal data structures (malloc)    : 0.000056 s
Time spent in additional calculations                            : 0.000002 s
Total time spent                                                 : 2579.202305 s

Statistics:
===========
Parallel Direct Factorization is running on 40 OpenMP

< Linear system Ax = b >
             number of equations:           8000000
             number of non-zeros in A:      31880000
             number of non-zeros in A (%): 0.000050

             number of right-hand sides:    1

< Factors L and U >
             number of columns for each panel: 96
             number of independent subgraphs:  0
< Preprocessing with state of the art partitioning metis>
             number of supernodes:                    5219408
             size of largest supernode:               40119
             number of non-zeros in L:                14765464233
             number of non-zeros in U:                1
             number of non-zeros in L+U:              14765464234
             gflop   for the numerical factorization: 387596.095197

             gflop/s for the numerical factorization: 150.277511

 time(factor)=   2579.20247197151     

=== PARDISO: solving a symmetric indefinite system ===


Summary: ( solution phase )
================

Times:
======
Time spent in direct solver at solve step (solve)                : 24.682167 s
Time spent in additional calculations                            : 51.274367 s
Total time spent                                                 : 75.956534 s

Statistics:
===========
Parallel Direct Factorization is running on 40 OpenMP

< Linear system Ax = b >
             number of equations:           8000000
             number of non-zeros in A:      31880000
             number of non-zeros in A (%): 0.000050

             number of right-hand sides:    1

< Factors L and U >
             number of columns for each panel: 96
             number of independent subgraphs:  0
< Preprocessing with state of the art partitioning metis>
             number of supernodes:                    5219408
             size of largest supernode:               40119
             number of non-zeros in L:                14765464233
             number of non-zeros in U:                1
             number of non-zeros in L+U:              14765464234

             gflop   for the numerical factorization: 387596.095197

             gflop/s for the numerical factorization: 150.277511

 time(solve)=   75.9567329883575     
         200         200         200   2800.56483387947     
 csr  norm of residual   4.719264874719631E-016
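
The figures quoted for this run can be checked against the log with a few lines of arithmetic. The sketch below uses only numbers copied from the log above; the variable names are illustrative, and the storage estimate assumes one 8-byte double per non-zero in L+U and nothing else.

```python
nnz_factors = 14_765_464_234   # "number of non-zeros in L+U"
gflop = 387_596.095197         # "gflop for the numerical factorization"
numfct_seconds = 2579.202247   # "Time spent in factorization step (numfct)"
reported_rate = 150.277511     # "gflop/s for the numerical factorization"

factor_gb = nnz_factors * 8 / 1e9   # one 8-byte double per stored non-zero
rate = gflop / numfct_seconds       # sustained factorization rate

print(f"factor storage: ~{factor_gb:.0f} GB")
print(f"factorization rate: {rate:.2f} gflop/s (log reports {reported_rate})")
```

The storage estimate is consistent with the "~120 GB" quoted for this case, and the computed rate agrees with the gflop/s value in the log.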

Kostas_S_
Beginner

Thank you for looking into this.

Indeed, I have also performed successful in-core runs with huge models where the requested memory was more than 110 GB.

But I get this segmentation fault with many models at more or less the same point of the factorization phase.

Do you think this may have been fixed in version 11.3 beta? That is not the version I am using.

I could dump the matrix and rhs to a file, but I assume it will be many GB large so transfer would take quite some time.

If that is ok with you, I can prepare the data and you can give me details where I can upload them.

Gennady_F_Intel
Moderator

A couple of issues with the in-core version of PARDISO have been fixed in 11.3 beta, so you can check whether the problem persists with this version. For how to get this version, please refer to the page at the top of this forum: https://software.intel.com/en-us/forums/topic/549590

If the problem still exists with 11.3 beta, please provide us with the smallest matrix that reproduces the issue. You may use the Intel(R) Premier Support channel to submit the issue and upload the matrix there.

--Gennady

Kostas_S_
Beginner

Thanks

I will try this asap and act accordingly
