cluster_sparse_solver discrepancy

asd__asdqwe · ‎11-07-2014

Hello,

I'm trying to solve a general system with CPARDISO. With two processes, there is no issue if I don't pass the coefficient array during the solution phase. With only one process, I get a segmentation fault. Could you give me some insight into this issue, please? Thank you in advance.

$ mpicxx -cxx=icpc cl_solver_unsym_complex_c.cpp -lmkl_intel_thread -lmkl_core -lmkl_intel_lp64 -liomp5 -std=c++11
$ mpirun -np 1 ./a.out
$ echo $?
11
$ mpirun -np 2 ./a.out
$ echo $?
0
$ mpicxx -cxx=icpc cl_solver_unsym_complex_c.cpp -lmkl_intel_thread -lmkl_core -lmkl_intel_lp64 -liomp5 -std=c++11 -DNSEGFAULT
$ mpirun -np 1 ./a.out
$ echo $?
0
$ mpirun -np 2 ./a.out
$ echo $?
0

Gennady_F_Intel · ‎11-07-2014

Which version of MKL are you using? Please look at mklsupport.txt and give the Package ID.

How about 4 or 8 processes?

asd__asdqwe · ‎11-07-2014

Here are the headers from mkl.h (I don't know where the file mklsupport.txt is).

#define __INTEL_MKL_BUILD_DATE 20140723

#define __INTEL_MKL__ 11
#define __INTEL_MKL_MINOR__ 2
#define __INTEL_MKL_UPDATE__ 0

I tried a larger problem: there was no segfault on 4 or 8 cores, but this time it also segfaulted with 2 cores (and with 1, just as before).

I edited my answer and I attached a larger test case: no problem with 4 or 8 cores. Segfault with 1 and 2 cores when NSEGFAULT is not defined. Segfault with 2 cores even when NSEGFAULT is defined. Thank you for looking.

Roman_A_Intel · ‎11-11-2014

Hello qweasd,

You always have to supply the matrix values for the solution step (phase=33) when iterative refinement is requested (iparm[7] != 0), and also for matrix types -2, -4, 6, 11, 13, 1, and 3, because in these cases zero pivots can appear during numerical factorization, and iterative refinement is then turned on.

The matrix you provided has zero pivots, so the segfault occurs when PARDISO tries to read the matrix values to run the iterative refinement step.
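As a rough illustration of the rule above, the check can be written as a small helper. This is only a sketch: `values_needed_at_solve` is a hypothetical name, not an MKL function, and the list of matrix types is copied from this reply rather than taken from the library.

```cpp
#include <algorithm>
#include <iterator>

// Matrix types named in this reply as prone to zero (perturbed) pivots.
static const int kPivotProneTypes[] = {-2, -4, 6, 11, 13, 1, 3};

// Hypothetical helper (not part of MKL): returns true when, according to the
// rule stated in this reply, the coefficient array should be passed at phase 33.
// `iparm` is the caller's usual 64-entry parameter array (C, zero-based).
bool values_needed_at_solve(int mtype, const int* iparm) {
    // Refinement explicitly requested by the user.
    const bool refinement_requested = (iparm[7] != 0);
    // Refinement may be switched on automatically for these matrix types.
    const bool pivot_prone =
        std::find(std::begin(kPivotProneTypes), std::end(kPivotProneTypes), mtype)
        != std::end(kPivotProneTypes);
    return refinement_requested || pivot_prone;
}
```

For example, for mtype 13 the helper returns true even when iparm[7] is 0, so the values array should not be left out of the phase=33 call.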

Regards,

Roman

asd__asdqwe · ‎11-11-2014

Hello Roman,

In my case, iparm[7] == 0, and the output error integer is equal to 0, not -4, after factorization. So is it really possible to force no iterative refinement? How can I know whether PARDISO will need to perform iterative refinement even if iparm[7] is set to 0 by the user? Also, could you comment on the fact that iterative refinement seems more likely to be performed when fewer processors are used during numerical factorization?

Thank you.

Roman_A_Intel · ‎11-12-2014

Please see my comments below.

1) To prevent PARDISO from performing iterative refinement automatically, you can use iparm[20] = 2 (Note from documentation: Apply 1x1 diagonal pivoting during the factorization process. Using this value is the same as using iparm(21) = 0 except that the solve step does not automatically make iterative refinements when perturbed pivots are obtained during numerical factorization. The number of iterations is limited to the number of iterative refinements specified by iparm(8) (0 by default).)

or iparm[20] = 3 (Note from documentation: Apply 1x1 and 2x2 Bunch-Kaufman pivoting during the factorization process. Bunch-Kaufman pivoting is available for matrices of mtype=-2, mtype=-4, or mtype=6. Using this value is the same as using iparm(21) = 1 except that the solve step does not automatically make iterative refinements when perturbed pivots are obtained during numerical factorization. The number of iterations is limited to the number of iterative refinements specified by iparm(8) (0 by default).)

2) If CPARDISO is used with the distributed input format (iparm[39] != 0) and the number of MPI processes is greater than 1, CPARDISO takes the matrix values for iterative refinement from its internal structure, so you do not need to supply the matrix values for the solve step. With 1 MPI process, you have to provide the matrix values at the solve step, because CPARDISO does not keep them in its internal structure.
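The two points above can be sketched in code as follows. Note that the quoted documentation uses Fortran-style, one-based indices, so iparm(21) there is iparm[20] in C and iparm(8) is iparm[7]. The helper names (`c_index`, `disable_automatic_refinement`, `solver_keeps_values`) are hypothetical and only illustrate the settings discussed here; whether value 2 or 3 is appropriate for a given matrix type should be checked against the documentation.

```cpp
#include <cstddef>

// Maps a one-based documentation index iparm(k) to the C index iparm[k - 1].
constexpr std::size_t c_index(std::size_t fortran_index) {
    return fortran_index - 1;
}

// Hypothetical helper (not part of MKL): applies the settings from point 1
// to a caller-owned 64-entry iparm array.
void disable_automatic_refinement(int* iparm) {
    iparm[c_index(8)] = 0;   // iparm(8): no user-requested refinement steps
    iparm[c_index(21)] = 2;  // iparm(21): 1x1 pivoting, no automatic refinement
}

// Hypothetical predicate mirroring point 2: per this reply, the values are
// taken from the solver's internal structure only with distributed input
// and more than one MPI process.
bool solver_keeps_values(const int* iparm, int mpi_processes) {
    return iparm[c_index(40)] != 0 && mpi_processes > 1;
}
```

If `solver_keeps_values` returns false, the coefficient array still has to be passed at the solve step whenever refinement can occur.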

Regards,

Roman

asd__asdqwe · ‎11-12-2014

Thank you for the in-depth answer! One last question: in the second test case (attached here https://software.intel.com/en-us/forums/topic/535078#comment-1804141), whether or not I compile with -DNSEGFAULT, mpirun -np 2 ./a.out always segfaults during the solution phase. Could you comment on that, please?

Thanks again!

Roman_A_Intel · ‎11-12-2014

I have reproduced the problem in the second test. We will investigate it more deeply. Thanks for this test case.

Regards,

Roman

Roman_A_Intel · ‎11-16-2014

Hi,

This problem was resolved in MKL 11.2 Update 1. Please see this topic for details: Intel® Math Kernel Library 11.2 Update 1 is now available (https://software.intel.com/en-us/forums/topic/535441)

Regards,

Roman

asd__asdqwe · ‎11-17-2014

Hello,

I've downloaded the new update and I still get the exact same segmentation fault on the 2nd example. What should I do?

asd__asdqwe · ‎12-01-2014

Could you confirm that you can reproduce the segmentation fault with the 2nd example, please? Thank you.

Roman_A_Intel · ‎12-01-2014

Hi,

Sorry for the long delay. We see the issue. It does not depend on the number of threads and appears only occasionally, which is why I did not see it in my previous reply. We are currently working on it, and I will inform you when it is resolved.

Regards,

Roman

Gennady_F_Intel · ‎02-09-2015

Please check the issue with the latest 11.2 Update 2, which was released last week, and let us know the results. Thanks.