Still trying to solve very large sparse systems of equations with `mkl pardiso`, single node, multithreaded version. The solver behaves really well for small systems, but not so well for larger ones. In our case, scalability is essential.
With a large enough system of linear equations (79999 x 79999, 100528321 non-zeros), the solver returns a vector of `-nan`s without reporting any errors. The expected result is provided (`expected-result.txt`).
The example code is provided below. The data is available here: https://www.dropbox.com/s/jcvieffrkb7ivag/data.tar.gz?dl=0. It is a `tar.gz` archive containing the binary matrix representation that the example code reads.
There is an additional strange behaviour of the solver: the `allocation of internal data structures` step in the factorization phase dominates the runtime. It occupies a single thread (100% user time) and takes the majority of the executable's time. This behaviour is similar to that described here: https://software.intel.com/en-us/forums/intel-math-kernel-library/topic/596445. I believe the core of the issue is within `parMETIS`, not `pardiso`.
The code is linked against MKL `11.2.3`, release `composer_xe_2015.3.187`.
Thank you for your assistance. I would be happy to provide additional details.
You are right, this is not a `METIS` issue. The situation with 11.3 and the minimum-degree algorithm is the same (aside from being significantly slower in the reordering step).
Please see the log attached.
Just as an aside, the executable, should you want to run it, takes two command line arguments: N (the matrix size) and NZ (the number of non-zero entries). The parameters for the provided input are: `./solver 79999 100528321`
I see two problems here:
1. This matrix is inconsistent with the CSR representation. I checked this case with the pardiso() API (not pardiso_64()) with the matrix checker enabled. Here is the log I see on my side:
Major version: 11
Minor version: 3
Update version: 1
Product status: Product
Platform: Intel(R) 64 architecture
Processor optimization: Intel(R) Advanced Vector Extensions (Intel(R) AVX) enabled processors
*** Error in PARDISO (incorrect input matrix ) error_num= 21
*** Input check: i=79998, ia=50263452, ia[i+1]=0 are incompatible
ERROR during symbolic factorization: 4294967295
2. The second problem is on our side: the matrix checker does not work properly when it is called from pardiso_64(). This problem has been escalated.
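Since the matrix checker is not reliable through pardiso_64() at the moment, the row-pointer inconsistency reported in the log above (`ia[i+1]=0` at `i=79998`) can be checked by hand before calling the solver. The following is only a minimal sketch and is not part of MKL: the function name `csr_first_bad_row` and its parameters are hypothetical, and it assumes the row pointers are stored as 64-bit integers in an array of `n + 1` entries. One possible reading of the log is that the final entry `ia[n]` was never filled in, which this check would report as row `n - 1`.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not an MKL function): scans a CSR row-pointer
 * array and returns the index of the first inconsistent entry, or -1
 * if the array looks consistent.
 *
 *   ia   - row pointers, n + 1 entries
 *   n    - number of rows
 *   nnz  - number of stored non-zeros
 *   base - 1 for one-based (Fortran-style) indexing, 0 for zero-based
 */
long long csr_first_bad_row(const long long *ia, long long n,
                            long long nnz, long long base)
{
    /* The first row must start at the index base. */
    if (ia[0] != base)
        return 0;

    /* Row pointers must be non-decreasing: row i spans ia[i]..ia[i+1]-1. */
    for (long long i = 0; i < n; ++i) {
        if (ia[i + 1] < ia[i])
            return i;
    }

    /* The last entry must point one past the final stored non-zero. */
    if (ia[n] != base + nnz)
        return n;

    return -1;
}
```

Calling this on the arrays right after they are read from the binary file, with `n = 79999` and the number of values actually stored, would show whether the reader fills all `n + 1` row pointers. It does not validate the column indices, which the real matrix checker also inspects.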