Mansouri__Nima
Beginner

Issue in cluster_sparse_solver examples

Hello,

I have an issue when using the Parallel Direct Sparse Solver for Clusters. I modified a sample test case (cl_solver_sym_sp_0_based_c.c) from "mkl/example/examples_cluster_c.tgz", which solves a system of linear equations (AX = b) with a sparse symmetric matrix A.

When I run the original file with different numbers of cores, the code PASSES the test. But when I modify just one element of the A matrix (REPLACING THE DIAGONAL ELEMENT IN THE 2nd ROW, from -4.0 to 0.0), the code fails whenever I use multiple cores; it works with 1 core. (I attached the modified file to this post.)

The original matrix is:

    float a[18] = { 7.0, /*0*/ 1.0, /*0*/ /*0*/  2.0,  7.0, /*0*/
                               -4.0, 8.0, /*0*/ 2.0,  /*0*/ /*0*/ /*0*/
                                       1.0, /*0*/ /*0*/ /*0*/ /*0*/ 5.0,
                                             7.0,  /*0*/ /*0*/ 9.0,  /*0*/
                                                    5.0,  1.0,  5.0,  /*0*/
                                                           -1.0, /*0*/ 5.0,
                                                                  11.0, /*0*/
                                                                           5.0 };

The modified matrix is:

    float a[18] = { 7.0, /*0*/ 1.0, /*0*/ /*0*/  2.0,  7.0, /*0*/
                               -0.0, 8.0, /*0*/ 2.0,  /*0*/ /*0*/ /*0*/
                                       1.0, /*0*/ /*0*/ /*0*/ /*0*/ 5.0,
                                             7.0,  /*0*/ /*0*/ 9.0,  /*0*/
                                                    5.0,  1.0,  5.0,  /*0*/
                                                           -1.0, /*0*/ 5.0,
                                                                  11.0, /*0*/
                                                                           5.0 };

I have only been using these routines for a couple of days, so I may be missing something.

Thank you for your help.

Nima Mansouri

3 Replies
Ying_H_Intel
Employee

Hi Nima , 

The problem looks familiar. What is your MKL version? We will check it and get back to you soon.

https://software.intel.com/en-us/articles/intel-math-kernel-library-intel-mkl-2018-bug-fixes-list

Thanks

Ying 

Mansouri__Nima
Beginner

Hi Ying,

Thank you for your response. Here are the details (machine, compiler, MPI) I used to run the test cases:

- Installed Intel libraries: 18.0.0.128

- Operating system: Linux 4.4.0-130-generic #156-Ubuntu x86_64 GNU/Linux

- Intel compiler

- Open MPI 2.1.1

Thank you!

Nima


Hi Nima,

Thanks for reporting this issue. In the case of a zero element on the diagonal, cluster_sparse_solver perturbs the pivot during factorization, which results in a modified factorized matrix. The automatic iterative refinement at the solve step improves the residual in most cases, but for some matrices (exactly this case) it does not. This issue sometimes appears on systems with a small matrix. However, it is not an implementation bug: the bottleneck is the sparsity pattern of the matrix and the impossibility of applying a full pivoting algorithm during factorization. Moreover, MKL PARDISO and the MKL direct solver for clusters use different reordering techniques, which lead to different patterns of the factorized matrix and different numbers of pivots.
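As a sketch of what you can tune (parameter indices below use the standard 0-based C iparm layout shared by MKL PARDISO and cluster_sparse_solver; please verify them against the Developer Reference for your MKL version):

```c
/* Fragment only, not a complete program: adjust iparm before the
   phase-13 (analysis + factorization + solve) call. */
MKL_INT iparm[64];
/* ... initialize iparm as in the shipped example ... */
iparm[7] = 10;  /* allow up to 10 iterative refinement steps at the solve stage */
iparm[9] = 13;  /* pivot perturbation threshold: perturb pivots smaller than 10^-13 */
/* ... call cluster_sparse_solver(...) ... */
/* After factorization, iparm[13] (output) reports how many pivots were
   perturbed; a nonzero value explains a degraded residual. */
```

Checking the perturbed-pivot count and increasing the refinement steps is usually the first thing to try when a zero or tiny diagonal entry appears in a symmetric indefinite system.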

Thanks,

Alex
