Intel® oneAPI Math Kernel Library
Ask questions and share information with other developers who use Intel® Math Kernel Library.

PARDISO segmentation fault

Koshkarev_A_
Beginner
2,955 Views

idbc wrote after 80% of LL' factorization:

Program received signal SIGSEGV
mkl_blas_mc_sgem2vu_odd () in /mnt/storage/opt/intel/composer_xe_2013_sp1.0.080/mkl/lib/intel64/libmkl_mc.so

in the attachment there is matrix with the program and makefile to reproduce this fault.

Matrix is CSR 3-array-variation 1-based (Upper triangle part of hermitian matrix) with about 22 000 000 nonzeros and 64000x64000 size

The same program with smaller size worked, max size tested 17280x17280.

The program executed on the: MACHTYPE=x86_64-suse-linux; HP DL580 G5 with 4x Intel Xeon 7350

0 Kudos
1 Solution
Kirill_V_Intel
Employee
1,519 Views

Hi John!

Of course I have a personal bias but I believe you would get a more decent support in case you start using Intel oneMKL PARDISO.

I am pretty sure that you will not have an issue like you described if you follow the described ways (like how to compile and link your code with oneMKL, e.g. from here https://software.intel.com/content/www/us/en/develop/tools/oneapi/components/onemkl/link-line-adviso...). Even if smth goes wrong, on this forum you'd get an answer about what is wrong or how to do it properly.

Best,
Kirill

View solution in original post

46 Replies
Koshkarev_A_
Beginner
1,864 Views

edit: 640000x640000 matrix size

Gennady_F_Intel
Moderator
1,864 Views

thanks for the issue. we will check it on our side.

Koshkarev_A_
Beginner
1,864 Views

found that these parameters runs ok (mtype = -4):

    iparm[0]  = 1;            // No solver default
    iparm[1]  = 2;            // Fill-in reordering from METIS
    // Numbers of processors, value of OMP_NUM_THREADS
    iparm[2]  = 0;
    iparm[3]  = 0;            // No iterative-direct algorithm
    iparm[4]  = 0;            // No user fill-in reducing permutation
    iparm[5]  = 0;            // Write solution into x
    iparm[6]  = 0;            // Not in use
    iparm[7]  = 2;            // Max numbers of iterative refinement steps
    iparm[8]  = 0;            // Not in use
    iparm[9]  = 8;            // Perturb the pivot elements with 1E-8 (default for symmetric indefinite)
    iparm[10] = 0;            // disable scaling (default for symmetric indefinite)
    iparm[11] = 0;            // Conjugate transposed/transpose solve == non-transposed
    iparm[12] = 0;            // Maximum weighted matching algorithm is switched-off (default for symmetric indefinite)
    iparm[13] = 0;            // Output: Number of perturbed pivots
    iparm[14] = 0;            // Not in use
    iparm[15] = 0;            // Not in use
    iparm[16] = 0;            // Not in use
    iparm[17] = -1;            // Output: Number of nonzeros in the factor LU
    iparm[18] = -1;            // Output: Mflops for LU factorization
    iparm[19] = 0;            // Output: Numbers of CG Iterations
//user defined:
    iparm[20] = 1;            
    iparm[23] = 0;            // uses a two-level factorization algorithm. This algorithm generally improves scalability in case of parallel factorization on many threads (more than eight).
    iparm[26] = 0;            // matrix check
    iparm[59] = 0;            // OOC Mode is off

    maxfct = 1;            // Maximum number of numerical factorizations.
    mnum = 1;            // Which factorization to use.

 

 

but segmentation fault appears with these (mtype = -4):

    iparm[0] = 1;            /* No solver default */
    iparm[1] = 3;            /* Fill-in reordering from METIS */ // edited: OpenMP version 
    /* Numbers of processors, value of OMP_NUM_THREADS */
    iparm[2] = 1;
    iparm[3] = 0;            /* No iterative-direct algorithm */
    iparm[4] = 0;            /* No user fill-in reducing permutation */
    iparm[5] = 0;            /* Write solution into x */
    iparm[6] = 0;            /* Not in use */
    iparm[7] = 2;            /* Max numbers of iterative refinement steps */
    iparm[8] = 0;            /* Not in use */
    iparm[9] = 13;            /* Perturb the pivot elements with 1E-13 */
    iparm[10] = 1;            /* Use nonsymmetric permutation and scaling MPS */
    iparm[11] = 0;            /* Conjugate transposed/transpose solve == non-transposed*/
    iparm[12] = 0;            /* Maximum weighted matching algorithm is switched-on (default for non-symmetric) */
    iparm[13] = 0;            /* Output: Number of perturbed pivots */
    iparm[14] = 0;            /* Not in use */
    iparm[15] = 0;            /* Not in use */
    iparm[16] = 0;            /* Not in use */
    iparm[17] = -1;            /* Output: Number of nonzeros in the factor LU */
    iparm[18] = -1;            /* Output: Mflops for LU factorization */
    iparm[19] = 0;            /* Output: Numbers of CG Iterations */
    iparm[26] = 1;            // matrix checker is on
//    iparm[59] = 2;            // turn on the OOC Mode

    maxfct = 1;            /* Maximum number of numerical factorizations.  */
    mnum = 1;            /* Which factorization to use. */

 

Gennady_F_Intel
Moderator
1,864 Views

number of non-zeros in L:                5009690816
number of non-zeros in U:                1
number of non-zeros in L+U:              5009690817

You use in-core version for the case where for factorizations you have to have  memory available  at least sizeof(MKL_Complex16) * number of non-zeros in L+U .  ( 5*10^9 * 16 bytes >= 80 Gb ).

Do you have enough RAM of that system? Please check it first,  

I'd recommend to try OOC version. iparm[59] == 2

--Gennady

 

Koshkarev_A_
Beginner
1,864 Views

of course I use it on the server with RAM enough (132Gb, 2Tb) and during execution on the graph memory consumption does not exceed 80Gb.

I guess that the problem is in using "nonsymmetric permutation and scaling MPS" (iparm[10]=1) for big Hermitian matrices (smaller runs ok).

Gennady_F_Intel
Moderator
1,864 Views

but if I am not mistaken, you used LP64 interfaces which may not addres arrays with more than 231-1 elements... therefore ILP64 interfaces has to be used for such sort of probem. Did you try that?

Gennady_F_Intel
Moderator
1,864 Views

in any case, what would be output , in the case if iparm[14] = 1;    iparm[15] = 1;    iparm[16] = 1;      ?

 

Koshkarev_A_
Beginner
1,864 Views

in the case if

    iparm[0] = 1;            /* No solver default */

    iparm[1] = 3;            /* Fill-in reordering from METIS */ // edited: OpenMP version 
    /* Numbers of processors, value of OMP_NUM_THREADS */
    iparm[2] = 1;
    iparm[3] = 0;            /* No iterative-direct algorithm */
    iparm[4] = 0;            /* No user fill-in reducing permutation */
    iparm[5] = 0;            /* Write solution into x */
    iparm[6] = 0;            /* Not in use */
    iparm[7] = 2;            /* Max numbers of iterative refinement steps */
    iparm[8] = 0;            /* Not in use */
    iparm[9] = 13;            /* Perturb the pivot elements with 1E-13 */
    iparm[10] = 1;            /* Use nonsymmetric permutation and scaling MPS */
    iparm[11] = 0;            /* Conjugate transposed/transpose solve == non-transposed*/
    iparm[12] = 0;            /* Maximum weighted matching algorithm is switched-on (default for non-symmetric) */
    iparm[13] = 0;            /* Output: Number of perturbed pivots */
    iparm[14] = 1;            /* Not in use */
    iparm[15] = 1;            /* Not in use */
    iparm[16] = 1;            /* Not in use */
    iparm[17] = -1;            /* Output: Number of nonzeros in the factor LU */
    iparm[18] = -1;            /* Output: Mflops for LU factorization */
    iparm[19] = 0;            /* Output: Numbers of CG Iterations */
    iparm[26] = 1;            // matrix checker is on
//    iparm[59] = 2;            // turn on the OOC Mode

    maxfct = 1;            /* Maximum number of numerical factorizations.  */
    mnum = 1;            /* Which factorization to use. */

 

Intel(R) Debugger for applications running on Intel(R) 64, Version 13.0, Build [80.483.23]
------------------
object file name: ./main.out
Reading symbols from /mnt/storage/home/aakoshkarev/anton/Magneto/_results/intel_lp64_parallel_intel64_so/main.out...done.
(idb) run
Starting program: /mnt/storage/home/aakoshkarev/anton/Magneto/_results/intel_lp64_parallel_intel64_so/main.out
[New Thread 14487 (LWP 14487)]
[New Thread 15058 (LWP 15058)]
[New Thread 15059 (LWP 15059)]
[New Thread 15060 (LWP 15060)]
[New Thread 15084 (LWP 15084)]
[New Thread 15085 (LWP 15085)]
[New Thread 15086 (LWP 15086)]
[New Thread 15087 (LWP 15087)]
[New Thread 15088 (LWP 15088)]
[New Thread 15089 (LWP 15089)]
[New Thread 15121 (LWP 15121)]
[New Thread 15122 (LWP 15122)]
[New Thread 15123 (LWP 15123)]
[New Thread 15124 (LWP 15124)]
[New Thread 15149 (LWP 15149)]
[New Thread 15150 (LWP 15150)]

 Matrix A
 (  0.98,  0.00) (  0.19,  0.09)
 (  0.19, -0.09) ( -0.98,  0.00)

 Eigenvalues
 (  1.00, -0.00) ( -1.00,  0.00)

 Right eigenvectors
 (  0.99,  0.00) (  0.09,  0.04)
 ( -0.09,  0.04) (  0.99,  0.00)


Energy1 = <psi+|H1|psi+> = (0.000033 -0.000000)

maxcurrent = 23551999

<psi+|psi-> = (0.000000, -0.000000)

<psi+|s_z|psi+> = (0.999984, 0.000000)

<psi+|s_z|psi-> = (-0.000000, 0.000000)

<psi-|s_z|psi+> = (-0.000000, -0.000000)

<psi-|s_z|psi-> = (-0.999984, 0.000000)

=== PARDISO: solving a Hermitian indefinite system ===
The local (internal) PARDISO version is                          : 103911000
1-based array indexing is turned ON
PARDISO double precision computation is turned ON
METIS algorithm at reorder step is turned ON
Scaling is turned ON


Summary: ( reordering phase )
================

Times:
======
Time spent in calculations of symmetric matrix portrait (fulladj): 0.397580 s
Time spent in reordering of the initial matrix (reorder)         : 19.229663 s
Time spent in symbolic factorization (symbfct)                   : 20.043214 s
Time spent in data preparations for factorization (parlist)      : 0.484405 s
Time spent in allocation of internal data structures (malloc)    : 2.318397 s
Time spent in additional calculations                            : 4.486993 s
Total time spent                                                 : 46.960252 s

Statistics:
===========
< Parallel Direct Factorization with number of processors: > 15
< Numerical Factorization with BLAS3 and O(n) synchronization >

< Linear system Ax = b >
             number of equations:           640000
             number of non-zeros in A:      23552000
             number of non-zeros in A (%): 0.005750

             number of right-hand sides:    2

< Factors L and U >
             number of columns for each panel: 96
             number of independent subgraphs:  0
< Preprocessing with state of the art partitioning metis>
             number of supernodes:                    181459
             size of largest supernode:               46400
             number of non-zeros in L:                5035922220
             number of non-zeros in U:                1
             number of non-zeros in L+U:              5035922221

Reordering completed ...
Number of nonzeros in factors  = 740954925
Number of factorization MFLOPS = 515273800=== PARDISO is running in In-Core mode, because iparam(60)=0 ===
Percentage of computed non-zeros for LL^T factorization
 0 %  1 %  2 %  3 %  4 %  5 %  6 %  7 %  8 %  9 %  10 %  11 %  12 %  13 %  14 %  15 %  16 %  17 %  18 %  19 %  20 %  21 %  22 %  23 %  24 %  25 %  26 %  27 %  28 %  29 %  30 %  31 %  32 %  %  34 %  35 %  36 %  37 %  38 %  39 %  40 %  41 %  42 %  43 %  44 %  45 %  46 %  47 %  49 %  50 %  51 %  52 %  53 %  54 %  55 %  56 %  57 %  58 %  59 %  60 %  61 %  62 %  63 %  64 %  65 %  %  68 %  69 %  70 %  71 %  72 %  74 %  75 %  76 %  77 %  78 %  80 %  81 %  82 %  83 %  84 %  85 %  86 %  87 %  88 %  89 %  90 %  91 %  92 %  93 %  94 %  95 %  96 %  97 %  98 %  99 %  100 %

=== PARDISO: solving a Hermitian indefinite system ===
Single-level factorization algorithm is turned ON


Summary: ( factorization phase )
================

Times:
======
Time spent in copying matrix to internal data structure (A to LU): 0.000000 s
Time spent in factorization step (numfct)                        : 9888.516431 s
Time spent in allocation of internal data structures (malloc)    : 0.002765 s
Time spent in additional calculations                            : 0.000020 s
Total time spent                                                 : 9888.519216 s

Statistics:
===========
< Parallel Direct Factorization with number of processors: > 15
< Numerical Factorization with BLAS3 and O(n) synchronization >

< Linear system Ax = b >
             number of equations:           640000
             number of non-zeros in A:      23552000
             number of non-zeros in A (%): 0.005750

             number of right-hand sides:    2

< Factors L and U >
             number of columns for each panel: 96
             number of independent subgraphs:  0
< Preprocessing with state of the art partitioning metis>
             number of supernodes:                    181459
             size of largest supernode:               46400
             number of non-zeros in L:                5035922220
             number of non-zeros in U:                1
             number of non-zeros in L+U:              5035922221
             gflop   for the numerical factorization: 515273.800873

             gflop/s for the numerical factorization: 52.108302


Factorization completed ...

=== PARDISO: solving a Hermitian indefinite system ===


Summary: ( solution phase )
================

Times:
======
Time spent in direct solver at solve step (solve)                : 145.461488 s
Time spent in additional calculations                            : 292.539900 s
Total time spent                                                 : 438.001388 s

Statistics:
===========
< Parallel Direct Factorization with number of processors: > 15
< Numerical Factorization with BLAS3 and O(n) synchronization >

< Linear system Ax = b >
             number of equations:           640000
             number of non-zeros in A:      23552000
             number of non-zeros in A (%): 0.005750

             number of right-hand sides:    2

< Factors L and U >
             number of columns for each panel: 96
             number of independent subgraphs:  0
< Preprocessing with state of the art partitioning metis>
             number of supernodes:                    181459
             size of largest supernode:               46400
             number of non-zeros in L:                5035922220
             number of non-zeros in U:                1
             number of non-zeros in L+U:              5035922221
             gflop   for the numerical factorization: 515273.800873

             gflop/s for the numerical factorization: 52.108302


[Thread 15059 (LWP 15059) exited] with exit status 0
[Thread 15060 (LWP 15060) exited] with exit status 0
[Thread 15084 (LWP 15084) exited] with exit status 0
[Thread 15085 (LWP 15085) exited] with exit status 0
[Thread 15086 (LWP 15086) exited] with exit status 0
[Thread 15087 (LWP 15087) exited] with exit status 0
[Thread 15088 (LWP 15088) exited] with exit status 0
[Thread 15089 (LWP 15089) exited] with exit status 0
[Thread 15121 (LWP 15121) exited] with exit status 0
[Thread 15122 (LWP 15122) exited] with exit status 0
[Thread 15123 (LWP 15123) exited] with exit status 0
[Thread 15124 (LWP 15124) exited] with exit status 0
[Thread 15149 (LWP 15149) exited] with exit status 0
[Thread 15150 (LWP 15150) exited] with exit status 0
[Thread 15058 (LWP 15058) exited] with exit status 0
Program exited normally.
(idb) =>> PBS: job killed: ncpus 3891.0 exceeded limit 15 (burst)
Terminated

Koshkarev_A_
Beginner
1,864 Views

ILP64 just run with iparm's that caused fault just run ok, running now the same with LP64 to make sure

Koshkarev_A_
Beginner
1,878 Views

LP64 runs ok too, just cant stand why

John48
Beginner
1,748 Views

Hi,

I also get a Segmentation error with PARDISO please see the bottom part of the output from running the simple symmetrical case:

< Parallel Direct Factorization with #cores: > 1
< and #nodes: > 1
< Numerical Factorization with Level-3 BLAS performance >

< Linear system Ax = b>
#equations: 8

#non-zeros in A: 18
non-zeros in A (%): 28.125000
#right-hand sides: 1

< Factors L and U >
#columns for each panel: 80
# of independent subgraphs: 0
< preprocessing with state of the art partitioning metis>
#supernodes: 5
size of largest supernode: 4
number of nonzeros in L 29
number of nonzeros in U 1
number of nonzeros in L+U 30
number of perturbed pivots 0
number of nodes in solve 8
Gflop for the numerical factorization: 0.000000
Gflop/s for the numerical factorization: 0.000067

Factorization completed ...
./runfile: line 14: 23676 Segmentation fault (core dumped) ./pardiso_sym

This is my compilation command:
gcc -g -o pardiso_sym pardiso_sym.c -L. -lpardiso600-GNU800-X86-64 -llapack -lrefblas -lgfortran -fopenmp -lpthread -lstdc++ -lm

Please help

Kirill_V_Intel
Employee
1,252 Views

Hi John!

Please become aware that you were not using Intel(R) oneMKL PARDISO. You have used PARDISO Project which is not Intel's product. Back 15+ years ago it was the same, but today these two are completely different software products although similarity of the API is still present (yet there are difference.

Hence your question is not for this forum. 

Of course, if you try PARDISO from MKL/oneMKL and encounter any issue, don't hesitate to ask here.

I hope this helps.

Best,
Kirill

 

John48
Beginner
1,188 Views

Hi Krill,

 

Thank you for your reply.

 

I know that I am not using MKL but I was just hoping that someone might help as I have run up against a brick wall with this – PARDISO have not been much help.

 

Best regards,

John

mecej4
Black Belt
1,182 Views

From the program output, it appears that the seg-fault occurred after Pardiso-6 completed the factorization of the supplied sparse matrix, and that the cause of the seg-fault is one of the lines in your code that comes (in logical order) after the factorization call.

Since you have not shown the source code, nobody can help you at this point.

John48
Beginner
1,174 Views

Hi Black Belt

Thank you for your reply.

The code is PARDISO's downloadable pardiso_sym - the simplest initial test program.  I have assumed this must be bug free and that the problem(s) must be with my linking but I can let you have a copy if you would like me to?

Many thanks,

John

 

mecej4
Black Belt
1,159 Views

[NOTE: This post is not about the Pardiso solver that is contained in MKL. It is about the Pardiso solver from https://pardiso-project.org . ]

As of now I do not have access to a Linux system, so here is what I can show to suggest that there is no problem in the Pardiso-6 libraries or the pardiso_sym.f example.

Gfortran frowns on passing scalar arguments where array arguments are expected, even dummy arguments that will never be accessed in a particular call. This is easily remedied by changing the declarations of IDUM and DUM to IDUM(1) and DUM(1).

I have the Pardiso-6 libraries on my Windows10 - x64 system. I built the pardiso_sym example using

  • the library for Intel Fortran, libpardiso600-WIN-X86-64.dll, with Ifort 2021.1.2, 
  • the library for Mingw, libpardiso600-WIN-X86-64-MINGW.dll, with Gfortran 10.2 for Cygwin 64 on Windows 10.
  • the same Mingw library, but using Gcc 10.2 for Cygwin 64 on Windows 10, using the unmodified C example file, pardiso_sym.c .

Both Fortran versions ran to completion normally, ending the output with the eight diagonal elements of the inverse of the matrix A. I show below a part of the output, and this differs from what you showed but the outputs from Ifort and Gfortran-Cygwin agree (ignore the Gflops results, they cannot be expected to agree). (I glanced only briefly at the C program output, which was a bit different, but I believe that the b vector has different values than in the Fortran example. The C program also ran to completion without any access violations.)

 

 

 

 

Statistics:
===========
 < Parallel Direct Factorization with #cores: >              1
 <                                and #nodes: >              1
 < Numerical Factorization with Level-3 BLAS performance >

 < Linear system Ax = b>
             #equations:                                     8
             #non-zeros in A:                                18
             non-zeros in A (%):                             28.125000
             #right-hand sides:                              1

< Factors L and U >
             #columns for each panel:                        80
             # of independent subgraphs:                     0
 < preprocessing with state of the art partitioning metis>
             #supernodes:                                    4
             size of largest supernode:                      4
             number of nonzeros in L                         31
             number of nonzeros in U                         1
             number of nonzeros in L+U                       32
             number of perturbed pivots                      1
             number of nodes in solve                        8
             Gflop   for the numerical factorization:        0.000000
             Gflop/s for the numerical factorization:        0.000520
 Log of determinant is     15.506631679957248
 Factorization completed ...
[PARDISO] ref_it: 1. rel res   0.372E-07 rel err   0.834E-07

 

 

 

 

Try compiling and linking with the Gfortran options -g -fbacktrace, and then running the resulting program. Doing so may give you more information on the problem that you ran into.

John48
Beginner
1,146 Views

Hi Black Belt,

Thank you for your reply but I get:

gcc: error: fbacktrace: No such file or directory

if I use -g fbacktrace (or -g -fbacktrace).

Is there a way I can get the code to you?

Regards,

John

mecej4
Black Belt
1,123 Views

You can zip the code (plus any data files, header files), and attach the zip file to your reply. Or, you can upload the zip file to your Dropbox, Onedrive or GoogleDrive, give public access to the uploaded file, and provide a link to it in your reply.

If you do not wish to make the files available to everyone, let me know, and I shall send you a private mail on this server, and you can attach the file to your (private) reply to my message.

John48
Beginner
1,113 Views

Hi Black Belt

Please see below.

Many thanks,

John

/* -------------------------------------------------------------------- */
/* Example program to show the use of the "PARDISO" routine */
/* on symmetric linear systems */
/* -------------------------------------------------------------------- */
/* This program can be downloaded from the following site: */
/* http://www.pardiso-project.org */
/* */
/* (C) Olaf Schenk, Institute of Computational Science */
/* Universita della Svizzera italiana, Lugano, Switzerland. */
/* Email: olaf.schenk@usi.ch */
/* -------------------------------------------------------------------- */

#include <stdio.h>
#include <stdlib.h>
#include <math.h>

 

/* PARDISO prototype. */
void pardisoinit (void *, int *, int *, int *, double *, int *);
void pardiso (void *, int *, int *, int *, int *, int *,
double *, int *, int *, int *, int *, int *,
int *, double *, double *, int *, double *);
void pardiso_chkmatrix (int *, int *, double *, int *, int *, int *);
void pardiso_chkvec (int *, int *, double *, int *);
void pardiso_printstats (int *, int *, double *, int *, int *, int *,
double *, int *);


int main( void )
{
/* Matrix data. */
int n = 8;
int ia[ 9] = { 0, 4, 7, 9, 11, 14, 16, 17, 18 };
int ja[18] = { 0, 2, 5, 6,
1, 2, 4,
2, 7,
3, 6,
4, 5, 6,
5, 7,
6,
7 };
double a[18] = { 7.0, 1.0, 2.0, 7.0,
-4.0, 8.0, 2.0,
1.0, 5.0,
7.0, 9.0,
5.0, 1.0, 5.0,
0.0, 5.0,
11.0,
5.0 };


int nnz = ia[n];
int mtype = -2; /* Real symmetric matrix */

/* RHS and solution vectors. */
double b[8], x[8];
int nrhs = 1; /* Number of right hand sides. */

/* Internal solver memory pointer pt, */
/* 32-bit: int pt[64]; 64-bit: long int pt[64] */
/* or void *pt[64] should be OK on both architectures */
void *pt[64];

/* Pardiso control parameters. */
int iparm[64];
double dparm[64];
int maxfct, mnum, phase, error, msglvl, solver;

/* Number of processors. */
int num_procs;

/* Auxiliary variables. */
char *var;
int i, k;

double ddum; /* Double dummy */
int idum; /* Integer dummy. */


/* -------------------------------------------------------------------- */
/* .. Setup Pardiso control parameters. */
/* -------------------------------------------------------------------- */

error = 0;
solver=0;/* use sparse direct solver */
pardisoinit (pt, &mtype, &solver, iparm, dparm, &error);

if (error != 0)
{
if (error == -10 )
printf("No license file found \n");
if (error == -11 )
printf("License is expired \n");
if (error == -12 )
printf("Wrong username or hostname \n");
return 1;
}
else
printf("[PARDISO]: License check was successful ... \n");

/* Numbers of processors, value of OMP_NUM_THREADS */
var = getenv("OMP_NUM_THREADS");
if(var != NULL)
sscanf( var, "%d", &num_procs );
else {
printf("Set environment OMP_NUM_THREADS to 1");
exit(1);
}
iparm[2] = num_procs;

maxfct = 1; /* Maximum number of numerical factorizations. */
mnum = 1; /* Which factorization to use. */

msglvl = 1; /* Print statistical information */
error = 0; /* Initialize error flag */

/* -------------------------------------------------------------------- */
/* .. Convert matrix from 0-based C-notation to Fortran 1-based */
/* notation. */
/* -------------------------------------------------------------------- */
for (i = 0; i < n+1; i++) {
ia[i] += 1;
}
for (i = 0; i < nnz; i++) {
ja[i] += 1;
}

/* Set right hand side to i. */
for (i = 0; i < n; i++) {
b[i] = i;
}

/* -------------------------------------------------------------------- */
/* .. pardiso_chk_matrix(...) */
/* Checks the consistency of the given matrix. */
/* Use this functionality only for debugging purposes */
/* -------------------------------------------------------------------- */

pardiso_chkmatrix (&mtype, &n, a, ia, ja, &error);
if (error != 0) {
printf("\nERROR in consistency of matrix: %d", error);
exit(1);
}

/* -------------------------------------------------------------------- */
/* .. pardiso_chkvec(...) */
/* Checks the given vectors for infinite and NaN values */
/* Input parameters (see PARDISO user manual for a description): */
/* Use this functionality only for debugging purposes */
/* -------------------------------------------------------------------- */

pardiso_chkvec (&n, &nrhs, b, &error);
if (error != 0) {
printf("\nERROR in right hand side: %d", error);
exit(1);
}

/* -------------------------------------------------------------------- */
/* .. pardiso_printstats(...) */
/* prints information on the matrix to STDOUT. */
/* Use this functionality only for debugging purposes */
/* -------------------------------------------------------------------- */

pardiso_printstats (&mtype, &n, a, ia, ja, &nrhs, b, &error);
if (error != 0) {
printf("\nERROR right hand side: %d", error);
exit(1);
}

/* -------------------------------------------------------------------- */
/* .. Reordering and Symbolic Factorization. This step also allocates */
/* all memory that is necessary for the factorization. */
/* -------------------------------------------------------------------- */
phase = 11;

pardiso (pt, &maxfct, &mnum, &mtype, &phase,
&n, a, ia, ja, &idum, &nrhs,
iparm, &msglvl, &ddum, &ddum, &error, dparm);

if (error != 0) {
printf("\nERROR during symbolic factorization: %d", error);
exit(1);
}
printf("\nReordering completed ... ");
printf("\nNumber of nonzeros in factors = %d", iparm[17]);
printf("\nNumber of factorization MFLOPS = %d", iparm[18]);

/* -------------------------------------------------------------------- */
/* .. Numerical factorization. */
/* -------------------------------------------------------------------- */
phase = 22;
iparm[32] = 1; /* compute determinant */

pardiso (pt, &maxfct, &mnum, &mtype, &phase,
&n, a, ia, ja, &idum, &nrhs,
iparm, &msglvl, &ddum, &ddum, &error, dparm);

if (error != 0) {
printf("\nERROR during numerical factorization: %d", error);
exit(2);
}
printf("\nFactorization completed ...\n ");

/* -------------------------------------------------------------------- */
/* .. Back substitution and iterative refinement. */
/* -------------------------------------------------------------------- */
phase = 33;

iparm[7] = 1; /* Max numbers of iterative refinement steps. */

pardiso (pt, &maxfct, &mnum, &mtype, &phase,
&n, a, ia, ja, &idum, &nrhs,
iparm, &msglvl, b, x, &error, dparm);

if (error != 0) {
printf("\nERROR during solution: %d", error);
exit(3);
}

printf("\nSolve completed ... ");
printf("\nThe solution of the system is: ");
for (i = 0; i < n; i++) {
printf("\n x [%d] = % f", i, x[i] );
}
printf ("\n\n");


/* -------------------------------------------------------------------- */
/* ... Inverse factorization. */
/* -------------------------------------------------------------------- */

if (solver == 0)
{
printf("\nCompute Diagonal Elements of the inverse of A ... \n");
phase = -22;
iparm[35] = 1; /* no not overwrite internal factor L */

pardiso (pt, &maxfct, &mnum, &mtype, &phase, &n, a, ia, ja, &idum, &nrhs,
iparm, &msglvl, b, x, &error, dparm);

/* print diagonal elements */
for (k = 0; k < n; k++)
{
int j = ia[k]-1;
printf ("Diagonal element of A^{-1} = %d %d %32.24e\n", k, ja[j]-1, a[j]);
}

}


/* -------------------------------------------------------------------- */
/* .. Convert matrix back to 0-based C-notation. */
/* -------------------------------------------------------------------- */
for (i = 0; i < n+1; i++) {
ia[i] -= 1;
}
for (i = 0; i < nnz; i++) {
ja[i] -= 1;
}

/* -------------------------------------------------------------------- */
/* .. Termination and release of memory. */
/* -------------------------------------------------------------------- */
phase = -1; /* Release internal memory. */

pardiso (pt, &maxfct, &mnum, &mtype, &phase,
&n, &ddum, ia, ja, &idum, &nrhs,
iparm, &msglvl, &ddum, &ddum, &error, dparm);

return 0;
}

mecej4
Black Belt
1,172 Views

John48, I am sorry for causing some confusion by using the Fortran example pardiso_sym.f instead of using the C example pardiso_sym.c in obtaining the results that I wrote about earlier.

Here I have attached a Zip files containing the following files:

  1. The source file j4.c, which is what you posted in-line in your latest message, and is the same as pardiso_sym.c with leading blanks striped off,
  2. The results of running the program,  file ICCP5.txt, after compiling j4.c using Intel C 2021.1.2 and linking with the PARDISO-5 library for 64-bit Windows,  libpardiso500-WIN-X86-64,
  3. The results file ICCP6.txt, same as 2. but using the PARDISO-6 library, libpardiso600-WIN-X86-64
  4. The results file GCCP6.txt, same as 3 but using GCC-10.2 for Cygwin-64 on Windows and libpardiso600-WIN-X86-64-MINGW. 

A you can see from the results, all three results agree quite well, if you ignore the timing and Gflops outputs, which are expected to be affected by the compiler and compiler-options used.

Reply