Intel® oneAPI Math Kernel Library
Ask questions and share information with other developers who use Intel® Math Kernel Library.

PARDISO segmentation fault

Koshkarev_A_
Beginner
7,508 Views

idbc reported the following after about 80% of the LL' factorization:

Program received signal SIGSEGV
mkl_blas_mc_sgem2vu_odd () in /mnt/storage/opt/intel/composer_xe_2013_sp1.0.080/mkl/lib/intel64/libmkl_mc.so

The attachment contains the matrix, the program, and a makefile to reproduce this fault.

The matrix is stored in the 1-based three-array variation of CSR (upper triangular part of a Hermitian matrix). It has about 22,000,000 nonzeros and is 64000x64000 in size.
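
To make the storage format concrete, here is a minimal C sketch, not taken from the attached program, that checks a 1-based three-array CSR description of an upper triangle before it is handed to a solver. The function name check_upper_csr_1based and its return codes are made up for illustration. The rules it checks (row pointers starting at 1, an explicitly stored diagonal entry in every row, strictly increasing column indices, no entries below the diagonal) are my reading of the usual PARDISO input requirements for symmetric and Hermitian matrices, so treat them as assumptions.

```c
#include <assert.h>
#include <stddef.h>

/* Check a 1-based, three-array CSR description of the upper triangle of an
 * n x n matrix. Returns 0 if the structure is consistent, otherwise a
 * positive code identifying the first problem found.
 * The name and the return codes are illustrative, not part of PARDISO. */
int check_upper_csr_1based(int n, const int *ia, const int *ja)
{
    assert(ia != NULL && ja != NULL);

    if (n <= 0 || ia[0] != 1)
        return 1;                       /* row pointers must start at 1 */

    for (int row = 1; row <= n; ++row) {
        int begin = ia[row - 1];
        int end   = ia[row];            /* one past the last entry of this row */

        if (end <= begin)
            return 2;                   /* empty row: diagonal entry is missing */
        if (ja[begin - 1] != row)
            return 3;                   /* first stored entry must be the diagonal */

        for (int k = begin; k < end; ++k) {
            int col = ja[k - 1];
            if (col < row || col > n)
                return 4;               /* below the diagonal or outside the matrix */
            if (k > begin && col <= ja[k - 2])
                return 5;               /* columns must be strictly increasing */
        }
    }
    return 0;
}
```

For the 3x3 matrix with rows (4 1 0), (1 5 2), (0 2 6), the upper triangle is stored as ia = {1, 3, 5, 6}, ja = {1, 2, 2, 3, 3}, a = {4, 1, 5, 2, 6}, and the check returns 0.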

The same program worked with smaller matrices; the largest size tested successfully was 17280x17280.

The program was run on: MACHTYPE=x86_64-suse-linux; HP DL580 G5 with 4x Intel Xeon 7350.

0 Kudos
46 Replies
John48
Beginner
2,540 Views

Hi Black Belt,

Thank you for this; presumably, your work proves that I have compiler or linking (as I suspect) issues?

Did you use the same matrix problem as in the code I sent you as the results are not the same as in the manual - please see attached pages 43-46.

Many thanks again,

John

 

0 Kudos
mecej4
Honored Contributor III
2,513 Views

John48 wrote: "Did you use the same matrix problem as in the code I sent you as the results are not the same as in the manual - please see attached pages 43-46."

I did, but there are inconsistencies in the matrix and vector values between the source code listings in the manual and the file pardiso_sym.c from the Pardiso-project.org site. Check the element in row 5, col 6, i.e., A(5,6) (= +1 or -1?), and the values in the array b (0 to 7, or 1 to 8?).

0 Kudos
John48
Beginner
2,481 Views

Thank you Black Belt for all of your help.  

I am trying the MKL approach.

Best regards,

John

0 Kudos
John48
Beginner
2,432 Views

Hi Black Belt,

If I run j4.c I get the output below which, by looking at your output files, is not the same as when you run it.

Please let me have any ideas you may have.

Many thanks,

John

with e.g. export PARDISOLICMESSAGE=1
***************************************************************************
[PARDISO]: License check was successful ...
[PARDISO]: Matrix type : real symmetric
[PARDISO]: Matrix dimension : 8
[PARDISO]: Matrix non-zeros : 18
[PARDISO]: Abs. coeff. range: min 0.00e+00 max 1.10e+01
[PARDISO]: RHS no. 1: min 0.00e+00 max 7.00e+00

================ PARDISO: solving a symmetric indef. system ================


Summary PARDISO 6.0.0: ( reorder to reorder )
=======================

Times:
======

Time fulladj: 0.000013 s
Time reorder: 0.000180 s
Time symbfct: 0.000066 s
Time parlist: 0.000009 s
Time malloc : -0.000285 s
Time total : 0.000793 s total - sum: 0.000810 s

Statistics:
===========
< Parallel Direct Factorization with #cores: > 1
< and #nodes: > 1
< Numerical Factorization with Level-3 BLAS performance >

< Linear system Ax = b>
#equations: 8
#non-zeros in A: 18
non-zeros in A (%): 28.125000
#right-hand sides: 1

< Factors L and U >
#columns for each panel: 80
# of independent subgraphs: 0
< preprocessing with state of the art partitioning metis>
#supernodes: 5
size of largest supernode: 4
number of nonzeros in L 29
number of nonzeros in U 1
number of nonzeros in L+U 30
number of perturbed pivots 0
number of nodes in solve 8
Gflop for the numerical factorization: 0.000000

Reordering completed ...
Number of nonzeros in factors = 30
Number of factorization MFLOPS = 0
================ PARDISO: solving a symmetric indef. system ================


Summary PARDISO 6.0.0: ( factorize to factorize )
=======================

Times:
======

Time A to LU: 0.000001 s
Time numfct : 0.000069 s
Time malloc : -0.000661 s
Time total : 0.000774 s total - sum: 0.001365 s

Statistics:
===========
< Parallel Direct Factorization with #cores: > 1
< and #nodes: > 1
< Numerical Factorization with Level-3 BLAS performance >

< Linear system Ax = b>
#equations: 8
#non-zeros in A: 18
non-zeros in A (%): 28.125000
#right-hand sides: 1

< Factors L and U >
#columns for each panel: 80
# of independent subgraphs: 0
< preprocessing with state of the art partitioning metis>
#supernodes: 5
size of largest supernode: 4
number of nonzeros in L 29
number of nonzeros in U 1
number of nonzeros in L+U 30
number of perturbed pivots 0
number of nodes in solve 8
Gflop for the numerical factorization: 0.000000
Gflop/s for the numerical factorization: 0.001076

Factorization completed ...
./runfile: line 14: 2802 Segmentation fault (core dumped) ./pardiso_sym
root@DESKTOP-8HR6ET2:/mnt/c/PARDISO# cp j4.c pardiso_sym.c

0 Kudos
mecej4
Honored Contributor III
2,423 Views

The discrepancy is caused by setting the R.H.S. vector equal to [0, 1, 2, ..., 7] in j4.c instead of [1, 2, 3, ..., 8]; the latter is what you can see on p.43 of the Pardiso 7.2 manual.

0 Kudos
John48
Beginner
2,416 Views

Hi Black Belt,

Thank you for this, but I was under the assumption that j4.c was the code you used for your runs? Perhaps I have misunderstood you?

I put b[i] = i+1; in the code but it made no difference to my result. It seems that the main matrix is causing the problems?

Regards,

John

0 Kudos
mecej4
Honored Contributor III
2,407 Views

Please attach a zip of the actual source file that you are using, and state the version of Pardiso that you are using. This thread is now rather long, and we now have too many versions of the code that you and I could have in mind, causing confusion.

Unlike you, I did not see access violations, only different results because of the different RHS vectors.

0 Kudos
John48
Beginner
2,392 Views

Hi Black Belt,

Please see attached which should be the j4.c you sent me but with b[i]=i+1 as discussed.

Regards,

John

0 Kudos
mecej4
Honored Contributor III
2,382 Views

As I wrote previously, you have to set A(5,6) (in C notation, A[12]) to -1, rather than +1, to make the code match the manual. On line 47 of the source (counting starting with 1, not 0!), change "1" to "-1".
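
For anyone who wants to confirm which position in the value array holds a given matrix entry, here is a small C sketch. csr_value_index is a hypothetical helper written for this post, not a PARDISO routine; it assumes the usual three-array CSR layout and takes the index base (0 or 1) as a parameter because the driver may use either numbering at different points.

```c
#include <assert.h>
#include <stddef.h>

/* Return the 0-based position in the C value array a[] that holds entry
 * (row, col) of a matrix stored in three-array CSR, or -1 if that entry is
 * not stored. `base` is 0 or 1 and says how ia[] and ja[] are numbered;
 * row and col are always given 1-based, as in the A(5,6) notation.
 * csr_value_index is a hypothetical helper, not a PARDISO routine. */
int csr_value_index(int n, const int *ia, const int *ja, int base,
                    int row, int col)
{
    assert(ia != NULL && ja != NULL);
    if (row < 1 || row > n || col < 1 || col > n)
        return -1;

    int begin = ia[row - 1] - base;     /* 0-based start of this row's entries */
    int end   = ia[row] - base;         /* 0-based one-past-the-end */
    for (int k = begin; k < end; ++k) {
        if (ja[k] - base + 1 == col)    /* convert stored column to 1-based */
            return k;
    }
    return -1;
}
```

Called with the driver's own ia and ja arrays as csr_value_index(8, ia, ja, base, 5, 6), it should return 12 if the layout is as I described, which is the element whose sign has to be changed.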

0 Kudos
John48
Beginner
2,368 Views

Hi Black Belt

OK thanks; my output now agrees with the manual but not with yours.

It also still segfaults as before, at the solve-to-solve stage. Could this be a linking problem?

Thank you for your help and patience.

Regards,

John

 

0 Kudos
John48
Beginner
2,359 Views

Hi Black Belt

Would you mind please letting me have your program which we know works?

Thank you for your help again.

Regards,

John

0 Kudos
mecej4
Honored Contributor III
2,387 Views

Program source and results from Pardiso 6 are attached.

0 Kudos
John48
Beginner
2,379 Views

Can't see an attachment.

John

0 Kudos
John48
Beginner
2,356 Views

Thanks Black Belt but the job still crashes at the same place.

The following is my compile command:

gcc -g -o pardiso_sym pardiso_sym.c -L. -lpardiso600-GNU800-X86-64 -llapack -lrefblas -lgfortran -fopenmp -lpthread -lstdc++ -lm

Do you think the problem could be due to the one zero diagonal element?

Best regards,

John

 

0 Kudos
mecej4
Honored Contributor III
2,348 Views

No, I do not think that the matrix entries lead to any problem at all, because the same problem runs fine for me on Windows using two different versions of Lugano Pardiso (5 and 6), as well as the Pardiso in MKL, using drivers written in Fortran as well as C.

I do not have access to a Linux system currently, so I cannot try and see what the problems may be when the same program is compiled and run on Linux using gcc.

0 Kudos
John48
Beginner
2,333 Views

Thanks again Black Belt.

It would be good if I could have outputs of the values in the matrices during your run to compare with mine.  Can this be done easily?

Regards,

John

0 Kudos
mecej4
Honored Contributor III
2,324 Views

I already gave you the output printed by the program. If you want to insert additional printf statements, you can do so. However, I think that you are wasting your time doing so, if you want to find what is going wrong and causing an access violation. Nor do I want to become your remote debugger.

Instead, use GDB or another debugger, and find out the line in the program that is causing the access violation.

0 Kudos
John48
Beginner
2,310 Views

Hi Black Belt,

Thank you for all you have done. Your work has shown that my problems are most probably due to linking, and I will investigate your output further.

I did try the MKL option, but the software would not download.

Best regards,

John

 

0 Kudos
Kirill_V_Intel
Employee
2,573 Views

Hi John,

So is there any reason you don't want to try oneMKL PARDISO?   

Best,
Kirill

0 Kudos
John48
Beginner
2,536 Views

Hi Kirill,

I could try if you think it would help.

Regards,

John

0 Kudos
Kirill_V_Intel
Employee
6,073 Views

Hi John!

Of course I have a personal bias, but I believe you would get better support if you started using Intel oneMKL PARDISO.

I am pretty sure that you will not have an issue like the one you described if you follow the documented procedures (for example, how to compile and link your code with oneMKL, e.g. from here https://software.intel.com/content/www/us/en/develop/tools/oneapi/components/onemkl/link-line-advisor.html). Even if something goes wrong, on this forum you'd get an answer about what is wrong or how to do it properly.

Best,
Kirill

0 Kudos