Intel® oneAPI Math Kernel Library
Ask questions and share information with other developers who use Intel® Math Kernel Library.

Pardiso - error with user ordering

damienhocking
New Contributor I

I've been using Pardiso for a while now and I've started testing other ordering techniques. One that I really like is QAMD for quasi-dense rows, because it's very fast. I'm getting errors from Pardiso with a user-generated ordering on an unsymmetric matrix. The problem is scalable: Pardiso doesn't fail on the smaller matrices, but it does on the larger ones. It fails in the numeric factorisation and reports error -20. This is the 10.2 update 4 Windows release of Intel Pardiso that comes with the 11.1.060 compiler.

I have a few questions.

1) According to the docs, to use your own ordering, you set iparm(5) = 1 and set the perm array before a call to Pardiso with phase = 11. Is that all you need to do? I couldn't find anything about what to do with iparm(2) in this case, so I assumed Pardiso would ignore it.

2) Pardiso wants the perm array as a row permutation, correct? The reordering code I'm using returns a column permutation, so I send it the transpose of A to get an equivalent row ordering.

3) If I'm not doing anything wrong in my calling parameters, I might need to submit the matrix and the permutation vector. What format should they be in?

Damien

Gennady_F_Intel
Moderator

Damien,

It would be more useful (and the fastest way to find the cause of the problem) if you submitted the whole test case (source code + input data).

error == -20? I couldn't find that error number in the documentation for 10.2 update 4.

--Gennady

Konstantin_A_Intel

Hello Damien,

Let me answer your questions:

1) Yes, you've done everything correctly. When a user permutation is supplied, the iparm(2) parameter is simply ignored.

2) PARDISO uses only symmetric permutations (even for unsymmetric matrices), which means the permutation is applied to both columns and rows.

3) As Gennady said, please send the reference data.

Thank you,

Konstantin
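
To make the point about symmetric permutations concrete, here is a minimal sketch in plain Python (not PARDISO code). It assumes a 0-based permutation with the convention "new index i takes old index perm[i]"; the helper name `symmetric_permute` is made up for this illustration. The same permutation is applied to rows and columns, so diagonal entries stay on the diagonal.

```python
def symmetric_permute(a, perm):
    """Return B with B[i][j] = A[perm[i]][perm[j]].

    Assumption for this sketch: perm is 0-based and means
    "new index i takes old index perm[i]".
    """
    n = len(a)
    return [[a[perm[i]][perm[j]] for j in range(n)] for i in range(n)]


# Small unsymmetric example; entry "rc" sits at row r, column c (1-based labels).
a = [[11, 12, 13],
     [21, 22, 23],
     [31, 32, 33]]
perm = [2, 0, 1]

b = symmetric_permute(a, perm)
for row in b:
    print(row)

# The diagonal of B is a reordering of the diagonal of A.
print([b[i][i] for i in range(3)])
```

A pure row permutation or a pure column permutation would move diagonal entries off the diagonal, which is why an ordering produced for columns only cannot be passed along unchanged as if it were a row ordering.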

damienhocking
New Contributor I

I'll put together a stand-alone example and attach it. Thanks.

-20 was a typo. It's -2.

Damien

damienhocking
New Contributor I

Here's the data and a small test driver. The symbolic phase dies on memory allocation.

Gennady_F_Intel
Moderator

Damien, error == -2 means not enough memory. If you are using PARDISO's in-core mode, please try solving this task in out-of-core mode (set iparm[59] = 2, and set MKL_PARDISO_OOC_MAX_CORE_SIZE to no more than the RAM available on your system).

--Gennady

Gennady_F_Intel
Moderator

Damien,

I checked your problem on my local system: a Core 2 Duo with 2 GB of RAM, of which only 1.1 GB was available. Because solving your task (nequations = 800004, nnz = 2096005) requires much more memory than 1.1 GB, PARDISO was launched in out-of-core mode with MKL_PARDISO_OOC_MAX_CORE_SIZE=1000.
PARDISO solved the task successfully. See the output below:
--Gennady
+++++++++++++++++++++++++++++++++++++
The file .\pardiso_ooc.cfg was not opened
=== PARDISO is running in Out-Of-Core mode, because iparam(60)=2 ===
================ PARDISO: solving a real nonsymmetric system ================
Summary PARDISO: ( reorder to reorder )
================
Times:
======
Time fulladj: 0.232677 s
Time reorder: 0.008773 s
Time symbfct: 5.178061 s
Time malloc : 3.468740 s
Time total : 186.823628 s total - sum: 177.935377 s
Statistics:
===========
1
#equations: 800004
#non-zeros in A: 2096005
non-zeros in A (): 0.000327
#right-hand sides: 1
#columns for each panel: 128
#independent subgraphs: 0
#supernodes: 784130
size of largest supernode: 8000
number of nonzeros in L 153841611
number of nonzeros in U 151017799
number of nonzeros in L+U 304859410
Reordering completed ... Percentage of computed non-zeros for LL^T factorization
0 %
1 %
2 %
3 %
4 %
5 %
6 %
7 %
................................
98 %
99 %
100 %
================ PARDISO: solving a real nonsymmetric system ================
Summary PARDISO: ( factorize to factorize )
================
Times:
======
Time A to LU: 0.000000 s
Factorization: Time for writing to files : 141.430262
Factorization: Time for reading from files : 3901.309578
Time numfct : 6795.462132 s
Time malloc : 6.698081 s
Time total : 6802.782343 s total - sum: 0.622130 s
Statistics:
===========
1
#equations: 800004
#non-zeros in A: 2096005
non-zeros in A (): 0.000327
#right-hand sides: 1
#columns for each panel: 128
#independent subgraphs: 0
#supernodes: 784130
size of largest supernode: 8000
number of nonzeros in L 153841611
number of nonzeros in U 151017799
number of nonzeros in L+U 304859410
gflop for the numerical factorization: 3072.194687
gflop/s for the numerical factorization: 0.452095

damienhocking
New Contributor I

Wow. 150 *million* nonzeroes in the L and U factors? On a matrix with 2 million nonzeroes? I think that's the largest amount of fill-in I've ever seen on any problem.

I got that ordering from a call to the MUMPS solver. Using the same ordering, MUMPS factorises the matrix in about 0.4 seconds on one processor. There must be something about that ordering that Pardiso doesn't like, or I'm putting the ordering in the wrong way. Pardiso factorises this problem quite quickly with its own internal orderings. MUMPS has AMD and METIS available, so I'll extract those orderings from Pardiso and MUMPS and compare them. Something's definitely not right here. If I discover anything useful I'll post back.
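
For a sense of scale, the fill-in can be worked out from the counts in the PARDISO log above. This is only a back-of-the-envelope sketch: it assumes 8 bytes per stored factor entry (double precision) and ignores index arrays and workspace, so the real memory requirement is higher.

```python
# Counts copied from the PARDISO statistics in the log above.
nnz_a = 2_096_005        # non-zeros in A
nnz_lu = 304_859_410     # non-zeros in L+U

# Assumption for this sketch: one 8-byte double per factor entry,
# with no allowance for index arrays or solver workspace.
bytes_per_value = 8

fill_ratio = nnz_lu / nnz_a
factor_gib = nnz_lu * bytes_per_value / 2**30

print(f"fill-in ratio: {fill_ratio:.1f}x")
print(f"factor values alone: {factor_gib:.2f} GiB")
```

Under these assumptions the factors hold roughly 145 times as many entries as A, and their values alone need more than 2 GiB, which is consistent with the in-core run failing on a machine with about 1.1 GB free.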

damienhocking
New Contributor I

I tested out the other orderings. They're similar but not the same, which makes sense because they're different codebases. Often they start out the same and then diverge. MUMPS returns a symmetric permutation, where perm(i) is the position of variable i in the pivot order. Is that what Pardiso accepts?

damienhocking
New Contributor I

Allllllllrightythen. I found the problem. I'll skip the boring details of how I got to this. Depending on the implementation of the ordering code, the meaning of the symmetric permutation you get back can be different. For example,

perm[1] = 4
perm[2] = 87
perm[3] = 2
...

can mean two things. It can mean that variable (or column) 1 permutes to column 4, variable 2 to column 87, variable 3 to column 2, and so on. Or it can mean the inverse: permuted column 1 holds old column 4, permuted column 2 holds old column 87, permuted column 3 holds old column 2, and so on. This was the difference. MUMPS is returning the equivalent of the second (inverse) convention. Once I flipped that around, Pardiso could factorise the system in under 1 second.

Problem solved. Thank you Gennady and Konstantin for the help.
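
For anyone hitting the same issue, "flipping" the permutation means computing its inverse. The sketch below is plain Python, not the code used in this thread; the helper name `invert_permutation` is made up, and it keeps the 1-based numbering of the example above. Which of the two conventions a given solver expects should be checked against that solver's documentation.

```python
def invert_permutation(perm):
    """Return the inverse of a 1-based permutation stored in a Python list.

    If perm[k] = p (1-based), then inverse[p] = k (1-based).
    """
    n = len(perm)
    if sorted(perm) != list(range(1, n + 1)):
        raise ValueError("perm is not a permutation of 1..n")
    inverse = [0] * n
    for k, p in enumerate(perm, start=1):
        inverse[p - 1] = k  # Python lists are 0-based, so shift by one
    return inverse


# Small made-up example: perm[1] = 4, perm[2] = 1, perm[3] = 3, perm[4] = 2.
perm = [4, 1, 3, 2]
inv = invert_permutation(perm)
print(inv)

# Inverting twice gives back the original permutation.
print(invert_permutation(inv) == perm)
```

The validity check is cheap insurance: a vector with a repeated or out-of-range entry is not a permutation at all, and is worth ruling out before suspecting a convention mismatch.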
