Intel® oneAPI Math Kernel Library

running "cl_solver_unsym_c.c"

Arrigoni__Viviana

Hi. 

I need to run the code in examples/cluster_sparse_solverc/source of the Intel installation directory with different input data.
Just as in the example, I am using the CSR sparse format. The arrays a, ja and ia are read from files that I generated, and they are as follows:

a : double array of the 52 nonzero values.

ja : 1 2 3 12 1 2 3 9 1 2 3 4 5 4 5 6 12 4 5 6 8 2 4 5 6 11 7 8 9 1 7 8 9 2 7 8 9 11 12 5 6 10 11 12 1 10 11 12 8 10 11 12

ja holds the column index of each entry of a, counting from 1.

ia : 1 5 9 14 18 22 27 30 34 40 45 49 53

The full matrix is 12x12, and ia[i] points to the first element of the i-th row. The last entry, ia[12] = 53, is the number of elements of a and ja plus 1. (For reference, the consistency conditions I expect these arrays to satisfy are sketched in the code below.)
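This is only a sketch of those checks, assuming 1-based indexing with n = 12 and 52 nonzeros; it is not the code from the example:

/* Sanity check for 1-based CSR arrays: ia must start at 1, end at nnz + 1,
   be non-decreasing, and every column index must lie in [1, n].
   Returns 1 if the arrays are consistent, 0 otherwise. */
int check_csr(int n, int nnz, const int *ia, const int *ja)
{
    if (ia[0] != 1 || ia[n] != nnz + 1) return 0;
    for (int i = 0; i < n; i++) {
        if (ia[i] > ia[i + 1]) return 0;
        for (int k = ia[i] - 1; k < ia[i + 1] - 1; k++)
            if (ja[k] < 1 || ja[k] > n) return 0;
    }
    return 1;
}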

When I execute it, I get the following message, which appears at phase 22:

 

*** Error in PARDISO  (incorrect input matrix  ) error_num= 21

*** Input check: i=12, ia(i)=49, ia(i+1)=53 are incompatible

ERROR during symbolic factorization: -1

 

I don't see where the mistake is here. Why are those values of ia incompatible? Apart from reading these files, the only thing I changed is n, which is now 12.
Thank you in advance.

mecej4
Honored Contributor III

Using your verbal description, I modified the example solverc\pardiso_unsym_c.c with your matrix data, and it ran with no error. (I do not have a cluster, and I have not installed MPI.)

Please provide your source file as an attachment (after zipping).

Arrigoni__Viviana

Here's the source file together with the data files. 
 

mecej4
Honored Contributor III

Here is a version of your program with all MPI-related lines removed. It ran without any error messages. If you print the solution x and the values appear reasonable, you may proceed to investigate why the MPI version fails. As I said earlier, I do not have MPI installed.
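If it helps, beyond eyeballing the values you can also compute the relative residual yourself. A minimal sketch, assuming the 1-based CSR arrays a, ia and ja, a right-hand side b and a computed solution x (nothing here is taken verbatim from the example):

#include <math.h>

/* Relative residual ||A*x - b|| / ||b|| for a 1-based CSR matrix. */
double relative_residual(int n, const double *a, const int *ia, const int *ja,
                         const double *x, const double *b)
{
    double rr = 0.0, bb = 0.0;
    for (int i = 0; i < n; i++) {
        double axi = 0.0;
        for (int k = ia[i] - 1; k < ia[i + 1] - 1; k++)
            axi += a[k] * x[ja[k] - 1];   /* shift 1-based column index to 0-based */
        rr += (axi - b[i]) * (axi - b[i]);
        bb += b[i] * b[i];
    }
    return sqrt(rr) / sqrt(bb);
}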

Gennady_F_Intel
Moderator

Viviana, I checked your case with the latest MKL 2019 Update 3, running 8 MPI processes under Intel MPI. Everything works well.

Here is the log I see on my side:

$ mpirun -n 8 ./a.out

=== CPARDISO: solving a real nonsymmetric system ===
1-based array indexing is turned ON
CPARDISO double precision computation is turned ON
METIS algorithm at reorder step is turned ON
Scaling is turned ON
Matching is turned ON


Summary: ( reordering phase )
================

Times:
======
Time spent in calculations of symmetric matrix portrait (fulladj): 0.000008 s
Time spent in reordering of the initial matrix (reorder)         : 0.000116 s
Time spent in symbolic factorization (symbfct)                   : 0.000081 s
Time spent in data preparations for factorization (parlist)      : 0.000002 s
Time spent in allocation of internal data structures (malloc)    : 0.000222 s
Time spent in additional calculations                            : 0.000023 s
Total time spent                                                 : 0.000452 s

Statistics:
===========
Parallel Direct Factorization is running on 8 MPI and 1 OpenMP per MPI process

< Linear system Ax = b >
             number of equations:           12
             number of non-zeros in A:      52
             number of non-zeros in A (%): 36.111111

             number of right-hand sides:    1

< Factors L and U >
             number of columns for each panel: 128
             number of independent subgraphs:  0
< Preprocessing with state of the art partitioning metis>
             number of supernodes:                    5
             size of largest supernode:               7
             number of non-zeros in L:                77
             number of non-zeros in U:                21
             number of non-zeros in L+U:              98
phase 11 ok

Reordering completed ... phase 11 ok
phase 11 ok
phase 11 ok
phase 11 ok
phase 11 ok
phase 11 ok
phase 11 ok

Percentage of computed non-zeros for LL^T factorization
 10 %  98 %  100 %

=== CPARDISO: solving a real nonsymmetric system ===
Single-level factorization algorithm is turned ON


Summary: ( factorization phase )
================

Times:
======
Time spent in copying matrix to internal data structure (A to LU): 0.000103 s
Time spent in factorization step (numfct)                        : 0.012607 s
Time spent in allocation of internal data structures (malloc)    : 0.000018 s
Time spent in additional calculations                            : 0.000000 s
Total time spent                                                 : 0.012728 s

Statistics:
===========
Parallel Direct Factorization is running on 8 MPI and 1 OpenMP per MPI process

< Linear system Ax = b >
             number of equations:           12
             number of non-zeros in A:      52
             number of non-zeros in A (%): 36.111111

             number of right-hand sides:    1

< Factors L and U >
             number of columns for each panel: 128
             number of independent subgraphs:  0
< Preprocessing with state of the art partitioning metis>
             number of supernodes:                    5
             size of largest supernode:               7
             number of non-zeros in L:                77
             number of non-zeros in U:                21
             number of non-zeros in L+U:              98
             gflop   for the numerical factorization: 0.000000

             gflop/s for the numerical factorization: 0.000036

phase 22 ok

Factorization completed ...
Solving system...phase 22 ok
phase 22 ok
phase 22 ok
phase 22 ok
phase 22 ok
phase 22 ok
phase 22 ok

=== CPARDISO: solving a real nonsymmetric system ===


Summary: ( solution phase )
================

Times:
======
Time spent in direct solver at solve step (solve)                : 0.001186 s
Time spent in additional calculations                            : 0.002096 s
Total time spent                                                 : 0.003282 s

Statistics:
===========
Parallel Direct Factorization is running on 8 MPI and 1 OpenMP per MPI process

< Linear system Ax = b >
             number of equations:           12
             number of non-zeros in A:      52
             number of non-zeros in A (%): 36.111111

             number of right-hand sides:    1

< Factors L and U >
             number of columns for each panel: 128
             number of independent subgraphs:  0
< Preprocessing with state of the art partitioning metis>
             number of supernodes:                    5
             size of largest supernode:               7
             number of non-zeros in L:                77
             number of non-zeros in U:                21
             number of non-zeros in L+U:              98
             gflop   for the numerical factorization: 0.000000

             gflop/s for the numerical factorization: 0.000036


The solution of the system is:
 x [0] =  0.210793
 x [1] =  0.381213
 x [2] =  0.290548
 x [3] =  0.575017
 x [4] =  0.235506
 x [5] =  0.193867
 x [6] =  1.235496
 x [7] = -0.137621
 x [8] = -0.151299
 x [9] =  0.544910
 x [10] =  0.138011
 x [11] =  0.460036
Relative residual = 8.479468e-17

 TEST PASSED
 

 

Gennady_F_Intel
Moderator

And here is how we compile it:

make
mpicc -I/opt/intel/compilers_and_libraries_2019/linux/mkl/include sparse_linsys_cluster.cpp   \
-Wl,--start-group \
/opt/intel/compilers_and_libraries_2019.3.199/linux/mkl/lib/intel64/libmkl_intel_lp64.a \
/opt/intel/compilers_and_libraries_2019.3.199/linux/mkl/lib/intel64/libmkl_intel_thread.a \
/opt/intel/compilers_and_libraries_2019.3.199/linux/mkl/lib/intel64/libmkl_core.a \
/opt/intel/compilers_and_libraries_2019.3.199/linux/mkl/lib/intel64/libmkl_blacs_intelmpi_lp64.a \
-Wl,--end-group -liomp5 -lpthread -lm -ldl
 

Arrigoni__Viviana

Thank you everybody for your help.
I need the program to run on a multi-processor system and to exploit the cluster, so it has to include all the MPI calls.
Also, I need to compare its results with other programs that I wrote, which I have always compiled and tested with the Intel mpiicc compiler.

In the version that I uploaded, the processes read the data from files. I printed the values they read and they are all correct, yet I get the error message shown in the first post of this discussion. If I run the example as it is, with the proposed data, the program works correctly.

So I tried initializing the arrays a, ja and ia directly in the code, with the same data that are in the files, and in that case it also works correctly (meaning that the CSR format I am providing is correct and should not be "incompatible"). However, I need to run it with much bigger data sizes, and I cannot simply copy and paste them into the code: they must be read from files. What goes wrong when a, ia and ja are read from files?

This is how I compile it:

mpiicc -std=c99  -DMKL_ILP64 -I${MKLROOT}/include sparse_linsys_cluster.c -o sp_ls_cl  -L${MKLROOT}/lib/intel64 -lmkl_intel_ilp64 -lmkl_intel_thread -lmkl_core -lmkl_blacs_intelmpi_ilp64 -liomp5 -lpthread -lm -ldl

Gennady_F_Intel
Moderator

I see the same results with the ILP64 API linked:

mpicc -I/opt/intel/compilers_and_libraries_2019/linux/mkl/include -DMKL_ILP64 sparse_linsys_cluster.cpp -o a_ilp64.out  \
        -Wl,--start-group \
        /opt/intel/compilers_and_libraries_2019.3.199/linux/mkl/lib/intel64/libmkl_intel_ilp64.a \
        /opt/intel/compilers_and_libraries_2019.3.199/linux/mkl/lib/intel64/libmkl_intel_thread.a \
        /opt/intel/compilers_and_libraries_2019.3.199/linux/mkl/lib/intel64/libmkl_core.a \
        /opt/intel/compilers_and_libraries_2019.3.199/linux/mkl/lib/intel64/libmkl_blacs_intelmpi_ilp64.a \
        -Wl,--end-group -liomp5 -lpthread -lm -ldl
 

mpirun -n 8 ./a_ilp64.out

=== CPARDISO: solving a real nonsymmetric system ===
1-based array indexing is turned ON
CPARDISO double precision computation is turned ON
METIS algorithm at reorder step is turned ON
Scaling is turned ON
Matching is turned ON


Summary: ( reordering phase )
================

.........................

........................

 TEST PASSED
 

Gennady_F_Intel
Moderator

MKL 2019 update 3

Arrigoni__Viviana

I think that MKL 2018 is installed on the cluster I am using. I have tried compiling with mpicc, with the same linking options and flags that you used, both dynamically and statically, and I get some linking errors.
I really don't see what the problem is with reading the arrays from files, since it works for you. I have contacted the support team of the cluster I am using; hopefully they will help me with this very specific issue. Thank you very much.

Arrigoni__Viviana

In case other people face the same problem: I solved it. It was enough to allocate the arrays a, ia and ja dynamically (a sketch of the pattern is below).
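For anyone who hits the same thing, the working pattern is roughly the following. This is a simplified sketch, not my exact code; the file names are placeholders, and MKL_INT becomes 64-bit when the program is built with -DMKL_ILP64:

#include <stdio.h>
#include <stdlib.h>
#include "mkl.h"

/* Allocate the CSR arrays on the heap and fill them from text files.
   MKL_INT is 64-bit under -DMKL_ILP64, so integer values are parsed
   into a long long first. Error checking is omitted for brevity. */
static void read_csr(MKL_INT n, MKL_INT nnz,
                     double **a, MKL_INT **ia, MKL_INT **ja)
{
    *a  = (double  *) malloc(nnz * sizeof(double));
    *ja = (MKL_INT *) malloc(nnz * sizeof(MKL_INT));
    *ia = (MKL_INT *) malloc((n + 1) * sizeof(MKL_INT));

    FILE *fa  = fopen("a.txt",  "r");   /* placeholder file names */
    FILE *fja = fopen("ja.txt", "r");
    FILE *fia = fopen("ia.txt", "r");
    for (MKL_INT k = 0; k < nnz; k++)
        fscanf(fa, "%lf", &(*a)[k]);
    for (MKL_INT k = 0; k < nnz; k++) {
        long long t; fscanf(fja, "%lld", &t); (*ja)[k] = (MKL_INT) t;
    }
    for (MKL_INT k = 0; k <= n; k++) {
        long long t; fscanf(fia, "%lld", &t); (*ia)[k] = (MKL_INT) t;
    }
    fclose(fa); fclose(fja); fclose(fia);
}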
