Intel® oneAPI Math Kernel Library

Error in PARDISO ( numerical_factorization) error_num= -987

Andreas_Fabri__Geome
Hello,

I am trying to solve a sparse system with PARDISO, using the evaluation version of the MKL beta
on 64-bit Windows 7.

Because I need out-of-core mode to be used when necessary, I initialize the parameters as follows:

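m_piparm[0] = 1;   // No solver default: all values are supplied by the caller
m_piparm[1] = 2;   // iparm(2): fill-in reducing ordering, METIS nested dissection
m_piparm[9] = 0;   // iparm(10): pivoting perturbation
m_piparm[17] = -1; // iparm(18): report the number of non-zeros in the factors
m_piparm[20] = 1;  // iparm(21): 1x1 and 2x2 Bunch-Kaufman pivoting
m_piparm[26] = 1;  // iparm(27): check the input matrix
m_piparm[59] = 1;  // iparm(60): out of core if necessary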
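The solver log reports this last setting as iparam(60), while the C code sets index 59, because the documentation numbers the entries from 1 and C arrays start at 0. A minimal sketch of a helper that keeps the two numberings apart is shown here. The names `iparm_index` and `fill_iparm_for_ooc` are hypothetical, plain `int` assumes the LP64 interface where `MKL_INT` is `int`, and zeroing the entries not listed in the post is an assumption of this sketch rather than a statement about the solver's defaults.

```c
#include <assert.h>

#define IPARM_SIZE 64 /* PARDISO takes an iparm array of 64 integers */

/* Map the 1-based number used in the documentation and in the solver log,
   e.g. iparm(60), to the 0-based C array index, e.g. iparm[59]. */
static int iparm_index(int doc_number)
{
    assert(doc_number >= 1 && doc_number <= IPARM_SIZE);
    return doc_number - 1;
}

/* Hypothetical helper: fill iparm with the values listed in the post,
   addressed by their documented numbers. Entries not mentioned in the
   post are zeroed here, which is an assumption of this sketch. */
static void fill_iparm_for_ooc(int iparm[IPARM_SIZE])
{
    for (int i = 0; i < IPARM_SIZE; ++i)
        iparm[i] = 0;
    iparm[iparm_index(1)]  = 1;  /* no solver defaults */
    iparm[iparm_index(2)]  = 2;
    iparm[iparm_index(10)] = 0;
    iparm[iparm_index(18)] = -1;
    iparm[iparm_index(21)] = 1;
    iparm[iparm_index(27)] = 1;
    iparm[iparm_index(60)] = 1;  /* out of core if necessary */
}
```

Writing `iparm[iparm_index(60)]` makes it harder to confuse iparm(27) with iparm[27], an ambiguity that comes up in later posts.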


Here is the trace of the PARDISO run. Any help is appreciated, and if necessary
I could dump the sparse symmetric matrix to a file and make it available.

Best regards,

Andreas Fabri


=== PARDISO is running in Out-Of-Core mode, because iparam(60)=1 and there is not enough RAM for In-Core ===


================ PARDISO: solving a symm. posit. def. system ================


Summary PARDISO: ( reorder to reorder )
================

Times:
======
Time fulladj: 1.618750 s
Time reorder: 48.901887 s
Time symbfct: 6.202610 s
Time malloc : 1.084790 s
Time total : 85.589953 s total - sum: 27.781916 s

Statistics:
===========
< Parallel Direct Factorization with #processors: > 1
< Numerical Factorization with Level-3 BLAS performance >

< Linear system Ax = b>
#equations: 2797565
#non-zeros in A: 23286826
non-zeros in A (%): 0.000298

#right-hand sides: 1

< Factors L and U >
#columns for each panel: 128
#independent subgraphs: 0
< Preprocessing with state of the art partitioning metis>
#supernodes: 1300322
size of largest supernode: 3421
number of nonzeros in L 604905508
number of nonzeros in U 1
number of nonzeros in L+U 604905509
Percentage of computed non-zeros for LL^T factorization
0 %
1 %
.
.
44 %
Fseek failed
*** Error in PARDISO ( numerical_factorization) error_num= -987
PARDISO Internationalization error! Message -987 is unknown

================ PARDISO: solving a symm. posit. def. system ================


Summary PARDISO: ( factorize to factorize )
================

Times:
======
Time A to LU: 0.000000 s
Factorization: Time for writing to files : 0.000000
Factorization: Time for reading from files : 0.000000
Time numfct : 0.000000 s
Time malloc : 0.053992 s
Time total : 105.836084 s total - sum: 105.782091 s

Statistics:
===========
< Parallel Direct Factorization with #processors: > 1
< Numerical Factorization with Level-3 BLAS performance >

< Linear system Ax = b>
#equations: 2797565
#non-zeros in A: 23286826
non-zeros in A (%): 0.000298

#right-hand sides: 1

< Factors L and U >
#columns for each panel: 128
#independent subgraphs: 0
< Preprocessing with state of the art partitioning metis>
#supernodes: 1300322
size of largest supernode: 3421
number of nonzeros in L 604905508
number of nonzeros in U 1
number of nonzeros in L+U 604905509
gflop for the numerical factorization: 886.436031


The error code is : -4

mecej4
Honored Contributor III
This is just a guess, as I have no experience with huge matrices:

An out-of-core solver needs to write and read large temporary files, so the 'Fseek failed' message suggests checking whether the program ran out of disk space while processing those files.
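One rough way to examine the 'Fseek failed' line is to estimate how much factor data had been produced when the run stopped. The sketch below assumes 8 bytes per value (double precision), progress proportional to bytes written, and nothing held back in RAM. `bytes_at_progress` is a hypothetical helper, not an MKL function.

```c
#include <assert.h>
#include <stdint.h>

/* Estimated bytes of factor values produced once `percent` of the
   non-zeros have been computed. Assumes 8 bytes per value and ignores
   index data and any part of the factors kept in memory. */
static uint64_t bytes_at_progress(uint64_t nnz_factors, unsigned percent)
{
    assert(percent <= 100);
    return nnz_factors * 8u / 100u * percent;
}
```

For the first log (604905508 non-zeros in L, last progress line 44 %), this gives roughly 2.13e9 bytes at 44 % and 2.18e9 bytes at 45 %. That range contains 2^31 bytes, the largest offset a 32-bit `long` can hold, and the C `fseek` function takes a `long` offset, which is 32 bits wide on 64-bit Windows. Whether this is related to the failure is not established in this thread.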

Alexander_K_Intel2
Hi,

This problem can occur when a zero or negative diagonal element appears during the LL^T decomposition. Try changing mtype = 2 to mtype = -2; that may resolve the problem.
With best regards,
Alexander Kalinkin

Andreas_Fabri__Geome
Switching to mtype=-2 did not help. Here is the output.
As you are from Intel, the error_num -987 should help
you to help me, shouldn't it?


best regards,

andreas


The file .\pardiso_ooc.cfg was not opened

=== PARDISO is running in Out-Of-Core mode, because iparam(60)=1 and there is not enough RAM for In-Core ===


================ PARDISO: solving a symmetric indef. system ================


Summary PARDISO: ( reorder to reorder )
================

Times:
======
Time fulladj: 1.662469 s
Time reorder: 49.211687 s
Time symbfct: 6.262312 s
Time malloc : 1.055497 s
Time total : 86.830331 s total - sum: 28.638366 s

Statistics:
===========
< Parallel Direct Factorization with #processors: > 1
< Numerical Factorization with Level-3 BLAS performance >

< Linear system Ax = b>
#equations: 2796570
#non-zeros in A: 23279108
non-zeros in A (%): 0.000298

#right-hand sides: 1

< Factors L and U >
#columns for each panel: 128
#independent subgraphs: 0
< Preprocessing with state of the art partitioning metis>
#supernodes: 1300079
size of largest supernode: 3576
number of nonzeros in L 588215272
number of nonzeros in U 1
number of nonzeros in L+U 588215273
Percentage of computed non-zeros for LL^T factorization

Alexander_K_Intel2
Hi Andreas

Error -987 is an internal error that should not appear in normal operation. Could you check your matrix by setting iparm(27) = 1 in Fortran (iparm[26] in C), and check the amount of free space on the hard disk (you need around 8 GB free)? If everything is correct, could you send a test case (an example with the matrix that crashed) so that we can investigate the problem?
With best regards,
Alexander Kalinkin

Gennady_F_Intel
Moderator
Andreas, how much free space is available on your system?
With nnz ≈ 588215272, the factors will require about 5 GB of free space.
--Gennady

Andreas_Fabri__Geome
Hello,

I have 83 GB available, so disk space should not be the problem.

I had also already set iparm[26]. For completeness, here are the other parameters I've set.
Could you verify that they are correct? I find it rather error-prone that, when I only want
to change one parameter (such as out of core), I must figure out the default for all the others.

m_piparm[0] = 1; // No solver default
m_piparm[1] = 2;
m_piparm[9] = 8; // iparm(10)- pivoting perturbation.
m_piparm[17] = -1;
m_piparm[20] = 1;
m_piparm[26] = 1;
m_piparm[59] = 1; // out of core if necessary


Do you have any standard file format that I should use for storing the system?

Best regards,

andreas

Gennady_F_Intel
Moderator
Andreas,
Which MKL beta version are you evaluating?
Could you check how it works in pure OOC mode (iparm[59] == 2) instead of the hybrid mode you are using?
--Gennady

Andreas_Fabri__Geome

I downloaded w_mkl_10.3.0.055.exe

Concerning the temporary file, in which directory does it go?
I ask because I am wondering what happens when the virus scanner
(Norton) tries to check it.

andreas

Gennady_F_Intel
Moderator
By default, OOC PARDISO stores its data in the current directory.
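The location can be changed. Log lines of the form "ooc_path got by Env" elsewhere in this thread show the solver picking up its out-of-core settings from the environment. A minimal sketch of setting them from inside the program is given here, assuming the documented variable names `MKL_PARDISO_OOC_PATH`, `MKL_PARDISO_OOC_MAX_CORE_SIZE` and `MKL_PARDISO_OOC_KEEP_FILE`, and assuming they are set before the first PARDISO call. `set_env` and `configure_ooc` are hypothetical helpers.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L /* for setenv on POSIX systems */
#endif
#include <assert.h>
#include <stdlib.h>

/* Set one environment variable for the current process.
   Returns 0 on success. */
static int set_env(const char *name, const char *value)
{
    assert(name != NULL && value != NULL);
#ifdef _WIN32
    return _putenv_s(name, value);
#else
    return setenv(name, value, 1);
#endif
}

/* Hypothetical helper: path_prefix is a directory plus file name prefix
   for the OOC files, max_core_mb is the in-core limit in megabytes,
   keep_file is "0" or "1". Returns 0 on success, -1 otherwise. */
static int configure_ooc(const char *path_prefix,
                         const char *max_core_mb,
                         const char *keep_file)
{
    if (set_env("MKL_PARDISO_OOC_PATH", path_prefix) != 0)
        return -1;
    if (set_env("MKL_PARDISO_OOC_MAX_CORE_SIZE", max_core_mb) != 0)
        return -1;
    if (set_env("MKL_PARDISO_OOC_KEEP_FILE", keep_file) != 0)
        return -1;
    return 0;
}
```

A call such as `configure_ooc("D:\\scratch\\ooc_file", "3000", "1")` would point the files at a scratch directory (the path is only an illustration), for example one excluded from virus scanning. Setting the same variables in the shell before launching the program avoids any question of when the library reads them.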

Gennady_F_Intel
Moderator
Thierry,
is it the same problem with OOC or hybrid mode?
--Gennady

Thierry_LE_SOMMER__E
I tested the two modes (iparam[59]=1 and iparam[59]=2) without success.

I am using MKL 10.2.5.035 with Visual Studio 2008 on Windows 7 x64

Thierry

Gennady_F_Intel
Moderator
Well, and did you get the same error, -987?


Thierry_LE_SOMMER__E
Here is the log :

ooc_path got by Env = C:\Dev\OptimTopo\Code\ooc_file
ooc_max_core_size got by Env = 3000
ooc_keep_file got by Env = 1

=== PARDISO is running in Out-Of-Core mode, because iparam(60)=2 ===

Percentage of computed non-zeros for LL^T factorization
0 %
1 %
2 %
...
40 %
41 %
42 %
Fseek failed
*** Error in PARDISO ( numerical_factorization) error_num= -987
*** Error in PARDISO: zero pivot

================ PARDISO: solving a real struct. sym. system ================


Summary PARDISO: ( reorder to factorize )
================

Times:
======
Time fulladj: 0.134167 s
Time reorder: 4.507111 s
Time symbfct: 2.230421 s
Time parlist: 2.000479 s
Time A to LU: 0.000000 s
Factorization: Time for writing to files : 0.000000
Factorization: Time for reading from files : 0.000000
Time numfct : 0.000000 s
Time malloc : 10.436600 s
Time total : 294.919680 s total - sum: 275.610902 s

Statistics:
===========
< Parallel Direct Factorization with #processors: > 4
< Numerical Factorization with BLAS3 and O(n) synchronization >

< Linear system Ax = b>
#equations: 408483
#non-zeros in A: 31756329
non-zeros in A (%): 0.019032

#right-hand sides: 1

< Factors L and U >
#columns for each panel: 96
#independent subgraphs: 0
< Preprocessing with state of the art partitioning metis>
#supernodes: 39349
size of largest supernode: 9840
number of nonzeros in L 626636223
number of nonzeros in U 605550762
number of nonzeros in L+U 1232186985
gflop for the numerical factorization: 5644.826505


ERROR during symbolic and numerical factorization: -4*** Error in PARDISO (read/write OOC data file) error_num= 0



Thierry_LE_SOMMER__E
I tried MKL 10.3 Beta and got the same error.

With iparam[27]=0, I got :

=== PARDISO is running in Out-Of-Core mode, because iparam(60)=2 ===

Percentage of computed non-zeros for LL^T factorization
0 %
1 %
2 %
...
40 %
41 %
42 %
Fseek failed
*** Error in PARDISO ( numerical_factorization) error_num= -987
PARDISO Internationalization error! Message -987 is unknown



With iparam[27]=1, I got :

=== PARDISO is running in Out-Of-Core mode, because iparam(60)=2 ===

Percentage of computed non-zeros for LL^T factorization
0 %
1 %
2 %
...
83 %
84 %
85 %
Fseek failed
Fseek failed
Fseek failed
*** Error in PARDISO ( numerical_factorization) error_num= -987
PARDISO Internationalization error! Message -987 is unknown


This can perhaps help you...

Gennady_F_Intel
Moderator
Hello guys, because this problem is completely unknown to us and our internal tests do not reproduce it,
I can only ask you to send us this information.
At the least, it will allow us to speed up the investigation of this error significantly.
--Gennady

Thierry_LE_SOMMER__E
You can download my matrix (ia, ja and a arrays) here : http://lesommer.free.fr/matrix_ed_lesommer.zip
I know that my matrix has zero elements.

Gennady_F_Intel
Moderator
Thanks, we will check and let you know of any update.

Sergey_Solovev__Inte
New Contributor I

Hello,

We downloaded the matrix and successfully factorized it with MKL 10.2.5 (see log below).

Maybe the problem is free space on the hard disk. The number of non-zeros in the LU factors is 1 232 186 985. To store them on disk, MKL OOC PARDISO requires about 12 GB of free space (the values alone take 1 232 186 985 × 8 bytes, roughly 9.9 GB).

How much free space is on the hard disk? Also, please print out iparam[63]. It is an internal parameter that can help us identify the version of MKL PARDISO.
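The storage estimates quoted in this thread come from multiplying the number of non-zeros in the factors by the size of one value. A minimal sketch of that arithmetic follows, assuming 8-byte double-precision (or 4-byte single-precision) values and ignoring index arrays and other overhead. The helper names are hypothetical, not MKL functions.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes needed for the numerical values of the factors alone. */
static uint64_t factor_value_bytes(uint64_t nnz_factors,
                                   unsigned bytes_per_value)
{
    assert(bytes_per_value == 4 || bytes_per_value == 8);
    return nnz_factors * bytes_per_value;
}

/* Returns 1 if free_bytes covers the estimate plus a safety margin
   given in percent of the estimate, 0 otherwise. */
static int enough_disk(uint64_t free_bytes, uint64_t needed,
                       unsigned margin_percent)
{
    return free_bytes >= needed + needed / 100u * margin_percent;
}
```

For 1232186985 non-zeros in L+U this gives 9857495880 bytes, and for 588215272 non-zeros it gives 4705722176 bytes, so the rounded figures in the posts include some headroom on top of the raw values.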

************************************ ooc_max_core_size got by Env = 3000

The file .\pardiso_ooc.cfg was not opened

=== PARDISO is running in Out-Of-Core mode, because iparam(60)=2 ===

Percentage of computed non-zeros for LL^T factorization
0 %
1 %
2 %
3 %
...
98 %
99 %
100 %

================ PARDISO: solving a real struct. sym. system ================

Summary PARDISO: ( reorder to factorize )
================

Times:
======
Time fulladj: 0.115263 s
Time reorder: 3.636191 s
Time symbfct: 3.471022 s
Time parlist: 0.321256 s
Time A to LU: 0.000000 s
Factorization: Time for writing to files : 0.000000
Factorization: Time for reading from files : 0.000000
Time numfct : 428.636476 s
Time malloc : 0.586887 s
Time total : 440.670663 s total - sum: 3.903568 s

Statistics:
===========
< Parallel Direct Factorization with #processors: > 4
< Numerical Factorization with BLAS3 and O(n) synchronization >

< Linear system Ax = b>
#equations: 408483
#non-zeros in A: 31756329
non-zeros in A (%): 0.019032

#right-hand sides: 1

< Factors L and U >
#columns for each panel: 96
#independent subgraphs: 0
< Preprocessing with state of the art partitioning metis>
#supernodes: 39349
size of largest supernode: 9840
number of nonzeros in L 626636223
number of nonzeros in U 605550762
number of nonzeros in L+U 1232186985
gflop for the numerical factorization: 5644.826505
gflop/s for the numerical factorization: 13.169263


Thierry_LE_SOMMER__E
Hello,

Free space on the hard disk is not the problem. I have 100 GB free.

I think I found the problem. It depends on which threading library I link against:
With mkl_intel_thread.lib (static) => OK
With mkl_intel_thread_dll.lib (dynamic) => Error -987

Now it works for me with versions 10.2.5, 10.2.6 and 10.3.0 beta.

Thierry

Gennady_F_Intel
Moderator
Thierry,
Could you please clarify how you linked the application when error_num = -987 was encountered?
Then we will try to reproduce the problem on our side as well.
--Gennady