
Error in PARDISO ( numerical_factorization) error_num= -987

Hello,

I am trying to solve a sparse system with PARDISO, using the evaluation version of the MKL beta
on Windows 7 (64-bit).

As I have to enable out-of-core mode if necessary, I initialize the parameters as follows:

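m_piparm[0] = 1; // No solver default
m_piparm[1] = 2; // iparm(2): fill-in reducing ordering (2 = METIS nested dissection)
m_piparm[9] = 0; // iparm(10): pivoting perturbation
m_piparm[17] = -1; // iparm(18): report the number of non-zeros in the factors
m_piparm[20] = 1; // iparm(21): pivoting for symmetric indefinite matrices
m_piparm[26] = 1; // iparm(27): matrix checker
m_piparm[59] = 1; // out of core if necessary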
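
One way to keep an initialization like this readable is to name the indices and zero the whole array before setting the entries of interest. The following is only a sketch: the enum names, IPARM_SIZE and fill_iparm are hypothetical helpers invented for illustration, not part of MKL, and the meaning attached to each entry should be checked against the manual for the MKL version in use.

```c
#include <string.h>

/* Zero-based C indices for the iparm entries used in this thread.
   The Fortran-style name iparm(k) corresponds to iparm[k - 1] in C.
   All names below are hypothetical, chosen for this sketch only. */
enum {
    IPARM_NO_DEFAULTS   = 0,  /* iparm(1): 1 = use the values supplied here */
    IPARM_REORDERING    = 1,  /* iparm(2): 2 = METIS nested dissection */
    IPARM_PIVOT_PERTURB = 9,  /* iparm(10): pivoting perturbation */
    IPARM_REPORT_NNZ    = 17, /* iparm(18): negative = report non-zeros in the factors */
    IPARM_SYM_PIVOTING  = 20, /* iparm(21): pivoting for symmetric indefinite matrices */
    IPARM_MATRIX_CHECK  = 26, /* iparm(27): 1 = check the ia/ja arrays */
    IPARM_OOC_MODE      = 59  /* iparm(60): 0 = in-core, 1 = OOC if needed, 2 = always OOC */
};

#define IPARM_SIZE 64

/* Zero every entry, then set only the ones discussed in the post. */
static void fill_iparm(int iparm[IPARM_SIZE], int pivot_perturb, int ooc_mode)
{
    memset(iparm, 0, IPARM_SIZE * sizeof(int));
    iparm[IPARM_NO_DEFAULTS]   = 1;
    iparm[IPARM_REORDERING]    = 2;
    iparm[IPARM_PIVOT_PERTURB] = pivot_perturb;
    iparm[IPARM_REPORT_NNZ]    = -1;
    iparm[IPARM_SYM_PIVOTING]  = 1;
    iparm[IPARM_MATRIX_CHECK]  = 1;
    iparm[IPARM_OOC_MODE]      = ooc_mode;
}
```

Calling fill_iparm(m_piparm, 0, 1) would reproduce the values listed above; switching to pure out-of-core mode then means changing only the last argument.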


Here is the trace of the pardiso run. Any help is appreciated, and if necessary
I could dump the sparse symmetric matrix in a file and make it available.

Best regards,

Andreas Fabri


=== PARDISO is running in Out-Of-Core mode, because iparam(60)=1 and there is not enough RAM for In-Core ===


================ PARDISO: solving a symm. posit. def. system ================


Summary PARDISO: ( reorder to reorder )
================

Times:
======
Time fulladj: 1.618750 s
Time reorder: 48.901887 s
Time symbfct: 6.202610 s
Time malloc : 1.084790 s
Time total : 85.589953 s total - sum: 27.781916 s

Statistics:
===========
< Parallel Direct Factorization with #processors: > 1
< Numerical Factorization with Level-3 BLAS performance >

< Linear system Ax = b>
#equations: 2797565
#non-zeros in A: 23286826
non-zeros in A (): 0.000298

#right-hand sides: 1

< Factors L and U >
#columns for each panel: 128
#independent subgraphs: 0
< Preprocessing with state of the art partitioning metis>
#supernodes: 1300322
size of largest supernode: 3421
number of nonzeros in L 604905508
number of nonzeros in U 1
number of nonzeros in L+U 604905509
Percentage of computed non-zeros for LL^T factorization
0 %
1 %
.
.
44 %
Fseek failed
*** Error in PARDISO ( numerical_factorization) error_num= -987
PARDISO Internationalization error! Message -987 is unknown

================ PARDISO: solving a symm. posit. def. system ================


Summary PARDISO: ( factorize to factorize )
================

Times:
======
Time A to LU: 0.000000 s
Factorization: Time for writing to files : 0.000000
Factorization: Time for reading from files : 0.000000
Time numfct : 0.000000 s
Time malloc : 0.053992 s
Time total : 105.836084 s total - sum: 105.782091 s

Statistics:
===========
< Parallel Direct Factorization with #processors: > 1
< Numerical Factorization with Level-3 BLAS performance >

< Linear system Ax = b>
#equations: 2797565
#non-zeros in A: 23286826
non-zeros in A (): 0.000298

#right-hand sides: 1

< Factors L and U >
#columns for each panel: 128
#independent subgraphs: 0
< Preprocessing with state of the art partitioning metis>
#supernodes: 1300322
size of largest supernode: 3421
number of nonzeros in L 604905508
number of nonzeros in U 1
number of nonzeros in L+U 604905509
gflop for the numerical factorization: 886.436031


The error code is : -4
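
For reading return values such as the -4 above, a small lookup helper can be handy. This is a sketch only: pardiso_error_text is a hypothetical name, and the descriptions paraphrase the MKL PARDISO documentation as I recall it, so they should be verified against the manual for the installed version. Note that -987 is an internal error_num printed in the log, not one of the documented return codes.

```c
#include <stddef.h>

/* Map a PARDISO `error` return value to a short description.
   Hypothetical helper; texts paraphrase the MKL documentation and
   may differ between MKL versions. */
static const char *pardiso_error_text(int error)
{
    switch (error) {
    case 0:   return "no error";
    case -1:  return "input inconsistent";
    case -2:  return "not enough memory";
    case -3:  return "reordering problem";
    case -4:  return "zero pivot, numerical factorization or iterative refinement problem";
    case -5:  return "unclassified (internal) error";
    case -6:  return "preordering failed (matrix types 11 and 13 only)";
    case -7:  return "diagonal matrix is singular";
    case -8:  return "32-bit integer overflow problem";
    case -9:  return "not enough memory for OOC";
    case -10: return "problems with opening OOC temporary files";
    case -11: return "read/write problems with the OOC data file";
    default:  return "unknown error code";
    }
}
```

A caller would typically print pardiso_error_text(error) next to the numeric value after each PARDISO phase.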
Black Belt

This is just a guess (I have no experience with huge matrices):

An out-of-core solver needs to write and read large temporary files, so the 'fseek error' suggests that you look at the possibility that the program ran out of disk space while processing the temporary files.

Hi,

This problem can occur when a zero or negative diagonal element appears during the LL^T decomposition. Try changing mtype = 2 to mtype = -2; that may resolve the problem.
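
A quick sanity check along these lines is to scan the stored diagonal of the input matrix before factorizing. The sketch below assumes the upper triangle is stored row by row in ia/ja/a with 1-based indices; count_bad_diagonals is a hypothetical helper, not an MKL function. A positive diagonal is only a necessary condition for positive definiteness, and the pivots met during factorization are not the original diagonal entries, so a clean result here does not rule out the failure described above.

```c
#include <stddef.h>

/* Count rows whose diagonal entry is missing, zero, or negative.
   Assumption: symmetric storage, upper triangle in CSR form, with
   1-based ia/ja (Fortran-style indexing). Hypothetical helper. */
static size_t count_bad_diagonals(int n, const int *ia, const int *ja,
                                  const double *a)
{
    size_t bad = 0;
    for (int row = 1; row <= n; ++row) {
        int found_positive = 0;
        for (int k = ia[row - 1]; k < ia[row]; ++k) {
            /* Position k is 1-based, so the C array element is k - 1. */
            if (ja[k - 1] == row && a[k - 1] > 0.0) {
                found_positive = 1;
                break;
            }
        }
        if (!found_positive)
            ++bad;
    }
    return bad;
}
```

If the function returns a non-zero count, mtype = 2 (symmetric positive definite) is certainly the wrong choice for that matrix.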
With best regards,
Alexander Kalinkin

Switching to mtype=-2 did not help. Here is the output.
Since you are from Intel, the error_num -987 should help
you to help me, shouldn't it?


best regards,

andreas


The file .\pardiso_ooc.cfg was not opened

=== PARDISO is running in Out-Of-Core mode, because iparam(60)=1 and there is not enough RAM for In-Core ===


================ PARDISO: solving a symmetric indef. system ================


Summary PARDISO: ( reorder to reorder )
================

Times:
======
Time fulladj: 1.662469 s
Time reorder: 49.211687 s
Time symbfct: 6.262312 s
Time malloc : 1.055497 s
Time total : 86.830331 s total - sum: 28.638366 s

Statistics:
===========
< Parallel Direct Factorization with #processors: > 1
< Numerical Factorization with Level-3 BLAS performance >

< Linear system Ax = b>
#equations: 2796570
#non-zeros in A: 23279108
non-zeros in A (): 0.000298

#right-hand sides: 1

< Factors L and U >
#columns for each panel: 128
#independent subgraphs: 0
< Preprocessing with state of the art partitioning metis>
#supernodes: 1300079
size of largest supernode: 3576
number of nonzeros in L 588215272
number of nonzeros in U 1
number of nonzeros in L+U 588215273
Percentage of computed non-zeros for LL^T factorization

Hi Andreas

Error -987 is an internal error that should not appear in normal operation. Could you check your matrix by setting iparm(27) = 1 in Fortran (iparm[26] in C), and check the amount of free space on your hard disk (you need around 8 GB free)? If everything is correct, could you send a test case (an example with the matrix that crashed) so that we can investigate the problem?
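
Independently of the built-in matrix checker, the index arrays can be validated by hand before calling the solver. The sketch below is an assumption about what a reasonable check looks like for symmetric upper-triangular storage with 1-based indices; check_csr_upper and its return codes are hypothetical and do not reproduce what MKL does internally.

```c
/* Validate 1-based CSR index arrays for a symmetric matrix stored as
   its upper triangle. Hypothetical helper. Returns 0 if the arrays look
   consistent, otherwise a code for the first failed check:
   1 = bad size or first row pointer, 2 = decreasing row pointers,
   3 = column outside the upper triangle or the matrix,
   4 = columns not strictly increasing within a row. */
static int check_csr_upper(int n, const int *ia, const int *ja)
{
    if (n <= 0 || ia[0] != 1)
        return 1;
    for (int row = 1; row <= n; ++row) {
        int begin = ia[row - 1];
        int end = ia[row];
        if (end < begin)
            return 2;
        for (int k = begin; k < end; ++k) {
            int col = ja[k - 1];
            if (col < row || col > n)
                return 3;
            if (k > begin && ja[k - 2] >= col)
                return 4;
        }
    }
    return 0;
}
```

Running this on the ia and ja arrays right before the reordering phase gives a cheap first answer to whether the input structure itself is the problem.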
With best regards,
Alexander Kalinkin
Moderator

Andreas, how much free space is available on your system?
With nnz ~ 588215272, the factors will require ~5 GB of available space.
--Gennady

Hello,

I have 83 GB available, so disk space should not be the problem.

I had also already set iparm[26]. For completeness, here are the other parameters I've set.
Could you verify that they are correct? I find it rather error-prone that when I only want
to change one parameter (such as out-of-core), I must figure out the defaults for all the others.

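m_piparm[0] = 1; // No solver default
m_piparm[1] = 2; // iparm(2): fill-in reducing ordering (2 = METIS nested dissection)
m_piparm[9] = 8; // iparm(10): pivoting perturbation
m_piparm[17] = -1; // iparm(18): report the number of non-zeros in the factors
m_piparm[20] = 1; // iparm(21): pivoting for symmetric indefinite matrices
m_piparm[26] = 1; // iparm(27): matrix checker
m_piparm[59] = 1; // out of core if necessary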


Do you have any standard file format that I should use for storing the system?
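
Nobody in this thread names a required format. As one possibility, the Matrix Market coordinate format is a widely used plain-text layout for sparse matrices. The sketch below writes a symmetric matrix given as PARDISO-style upper-triangle CSR arrays with 1-based indices; write_matrix_market is a hypothetical helper, and the choice of format is an assumption rather than something the repliers asked for. Matrix Market stores the lower triangle of a symmetric matrix, so each upper-triangle entry (row, col) is written with its indices swapped.

```c
#include <stdio.h>

/* Write a symmetric matrix in Matrix Market coordinate format.
   Input: upper triangle in CSR form with 1-based ia/ja (assumption).
   Hypothetical helper. Returns 0 on success, -1 on a write error. */
static int write_matrix_market(FILE *out, int n, const int *ia,
                               const int *ja, const double *a)
{
    int nnz = ia[n] - ia[0];

    if (fprintf(out, "%%%%MatrixMarket matrix coordinate real symmetric\n") < 0)
        return -1;
    if (fprintf(out, "%d %d %d\n", n, n, nnz) < 0)
        return -1;
    for (int row = 1; row <= n; ++row) {
        for (int k = ia[row - 1]; k < ia[row]; ++k) {
            /* Swap indices: upper-triangle (row, col) becomes
               lower-triangle (col, row) in the file. */
            if (fprintf(out, "%d %d %.17g\n", ja[k - 1], row, a[k - 1]) < 0)
                return -1;
        }
    }
    return 0;
}
```

The %.17g conversion is used so that double values survive a write and read round trip without loss.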

Best regards,

andreas
Moderator

Andreas,
Which MKL beta version are you evaluating?
Could you check how it works in pure OOC mode (iparm[59] == 2) instead of the hybrid mode you are using?
--Gennady


I downloaded w_mkl_10.3.0.055.exe

Concerning the temporary file, in which directory does it go?
I ask because I am wondering what happens when the virus scanner
(Norton) tries to check it.

andreas
Moderator

By default, OOC PARDISO stores its data in the current directory.
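
The location can be changed. A log later in this thread shows ooc_path, ooc_max_core_size and ooc_keep_file being picked up from the environment; to my knowledge the corresponding variables are MKL_PARDISO_OOC_PATH, MKL_PARDISO_OOC_MAX_CORE_SIZE and MKL_PARDISO_OOC_KEEP_FILE, but please confirm the names and the meaning of their values in the MKL manual. The sketch below sets them from inside the program; set_env and configure_pardiso_ooc are hypothetical helpers, and whether MKL sees variables set in-process (rather than in the shell before launch) is an assumption that should be tested.

```c
#define _POSIX_C_SOURCE 200112L  /* for setenv on POSIX systems */
#include <assert.h>
#include <stdlib.h>

/* Portable wrapper around the platform call that sets a variable. */
static int set_env(const char *name, const char *value)
{
#ifdef _WIN32
    return _putenv_s(name, value);
#else
    return setenv(name, value, 1);
#endif
}

/* Hypothetical helper: set the OOC-related variables before the first
   PARDISO call. The values are passed through as strings unchanged.
   Returns 0 on success, -1 if any variable could not be set. */
static int configure_pardiso_ooc(const char *path_prefix,
                                 const char *max_core_size,
                                 const char *keep_file)
{
    assert(path_prefix != NULL && max_core_size != NULL && keep_file != NULL);
    if (set_env("MKL_PARDISO_OOC_PATH", path_prefix) != 0)
        return -1;
    if (set_env("MKL_PARDISO_OOC_MAX_CORE_SIZE", max_core_size) != 0)
        return -1;
    if (set_env("MKL_PARDISO_OOC_KEEP_FILE", keep_file) != 0)
        return -1;
    return 0;
}
```

Pointing the path at a directory excluded from virus scanning would be one way to test the concern raised above.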
Moderator

Thierry,
do you see the same problem with OOC or hybrid mode?
--Gennady

I tested the two modes (iparam[59]=1 and iparam[59]=2) without success.

I am using MKL 10.2.5.035 with Visual Studio 2008 on Windows 7 x64

Thierry
Moderator

And did you get a similar error (-987)?


Here is the log :

ooc_path got by Env = C:\Dev\OptimTopo\Code\ooc_file
ooc_max_core_size got by Env = 3000
ooc_keep_file got by Env = 1

=== PARDISO is running in Out-Of-Core mode, because iparam(60)=2 ===

Percentage of computed non-zeros for LL^T factorization
0 %
1 %
2 %
...
40 %
41 %
42 %
Fseek failed
*** Error in PARDISO ( numerical_factorization) error_num= -987
*** Error in PARDISO: zero pivot

================ PARDISO: solving a real struct. sym. system ================


Summary PARDISO: ( reorder to factorize )
================

Times:
======
Time fulladj: 0.134167 s
Time reorder: 4.507111 s
Time symbfct: 2.230421 s
Time parlist: 2.000479 s
Time A to LU: 0.000000 s
Factorization: Time for writing to files : 0.000000
Factorization: Time for reading from files : 0.000000
Time numfct : 0.000000 s
Time malloc : 10.436600 s
Time total : 294.919680 s total - sum: 275.610902 s

Statistics:
===========
< Parallel Direct Factorization with #processors: > 4
< Numerical Factorization with BLAS3 and O(n) synchronization >

< Linear system Ax = b>
#equations: 408483
#non-zeros in A: 31756329
non-zeros in A (): 0.019032

#right-hand sides: 1

< Factors L and U >
#columns for each panel: 96
#independent subgraphs: 0
< Preprocessing with state of the art partitioning metis>
#supernodes: 39349
size of largest supernode: 9840
number of nonzeros in L 626636223
number of nonzeros in U 605550762
number of nonzeros in L+U 1232186985
gflop for the numerical factorization: 5644.826505


ERROR during symbolic and numerical factorization: -4*** Error in PARDISO (read/write OOC data file) error_num= 0



I tried MKL 10.3 Beta and I had the same error.

With iparam[27]=0, I got :

=== PARDISO is running in Out-Of-Core mode, because iparam(60)=2 ===

Percentage of computed non-zeros for LL^T factorization
0 %
1 %
2 %
...
40 %
41 %
42 %
Fseek failed
*** Error in PARDISO ( numerical_factorization) error_num= -987
PARDISO Internationalization error! Message -987 is unknown



With iparam[27]=1, I got :

=== PARDISO is running in Out-Of-Core mode, because iparam(60)=2 ===

Percentage of computed non-zeros for LL^T factorization
0 %
1 %
2 %
...
83 %
84 %
85 %
Fseek failed
Fseek failed
Fseek failed
*** Error in PARDISO ( numerical_factorization) error_num= -987
PARDISO Internationalization error! Message -987 is unknown


This can perhaps help you...
Moderator

Hello guys, because this problem is completely unknown to us and our internal tests do not reproduce it,
I can only ask you to send us this information.
This will at least allow us to significantly speed up the investigation of this error.
--Gennady

You can download my matrix (ia, ja and a arrays) here : http://lesommer.free.fr/matrix_ed_lesommer.zip
I know that my matrix has zero elements.
Moderator

Thanks, we will check and let you know of any update.
New Contributor I

Hello,

We downloaded the matrix and successfully factorized it with MKL 10.2.5 (see log below).

Maybe the problem is the free space on the hard disk. The number of nonzeros in the LU factors is 1 232 186 985. To store them on the hard disk, MKL OOC PARDISO requires about 10 GB of free space (1 232 186 985 * 8 bytes is about 9.9 GB).
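
The estimate above is simply the number of factor nonzeros times 8 bytes per double-precision value. A minimal sketch of that arithmetic follows; factor_bytes is a hypothetical helper, it uses 64-bit integers because the product does not fit in 32 bits, and it does not model any extra space the out-of-core files may need beyond the raw values.

```c
#include <stdint.h>

/* Raw storage for the factor values: one entry of `bytes_per_value`
   bytes per nonzero in L+U. Hypothetical helper; ignores any file
   overhead. */
static uint64_t factor_bytes(uint64_t nnz_factors, unsigned bytes_per_value)
{
    return nnz_factors * (uint64_t)bytes_per_value;
}

/* Convert a byte count to decimal gigabytes (10^9 bytes). */
static double bytes_to_gb(uint64_t bytes)
{
    return (double)bytes / 1e9;
}
```

For the 1 232 186 985 nonzeros reported in the log, the raw product is 9 857 495 880 bytes, roughly 9.9 GB; for the 588 215 273 nonzeros of the earlier symmetric indefinite run it is 4 705 722 184 bytes, roughly 4.7 GB.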

How much free space is there on the hard disk? Also, please print out iparam[63]. It is an internal parameter that can help us identify the version of MKL PARDISO.

************************************ ooc_max_core_size got by Env = 3000

The file .\pardiso_ooc.cfg was not opened

=== PARDISO is running in Out-Of-Core mode, because iparam(60)=2 ===

Percentage of computed non-zeros for LL^T factorization
0 %
1 %
2 %
3 %
...
98 %
99 %
100 %

================ PARDISO: solving a real struct. sym. system ================


Summary PARDISO: ( reorder to factorize )
================

Times:
======
Time fulladj: 0.115263 s
Time reorder: 3.636191 s
Time symbfct: 3.471022 s
Time parlist: 0.321256 s
Time A to LU: 0.000000 s
Factorization: Time for writing to files : 0.000000
Factorization: Time for reading from files : 0.000000
Time numfct : 428.636476 s
Time malloc : 0.586887 s
Time total : 440.670663 s total - sum: 3.903568 s

Statistics:
===========
< Parallel Direct Factorization with #processors: > 4
< Numerical Factorization with BLAS3 and O(n) synchronization >

< Linear system Ax = b>
#equations: 408483
#non-zeros in A: 31756329
non-zeros in A (): 0.019032

#right-hand sides: 1

< Factors L and U >
#columns for each panel: 96
#independent subgraphs: 0
< Preprocessing with state of the art partitioning metis>
#supernodes: 39349
size of largest supernode: 9840
number of nonzeros in L 626636223
number of nonzeros in U 605550762
number of nonzeros in L+U 1232186985
gflop for the numerical factorization: 5644.826505
gflop/s for the numerical factorization: 13.169263


Hello,

Free space on the hard disk is not the problem. I have 100 GB free.

I think I found the problem. It depends on which threading library is linked:
With mkl_intel_thread.lib => OK
With mkl_intel_thread_dll.lib => Error -987

Now it works for me with versions 10.2.5, 10.2.6 and 10.3.0 beta.

Thierry