pardiso transpose solve vs non-transpose solve

j_s · ‎04-11-2018

Hello,

I am wondering what differences to expect between loading in a CSR3 matrix and doing a solve, and loading in a CSC3 matrix and doing a transpose solve. The ia and ja matrices are swapped appropriately in calling the solver. This is for an unsymmetric matrix, mtype=11.

In essence the 2 matrices have identical entries, except one is transposed. I am seeing numerical differences in the convergence. Using CSC3, all of my examples converge. Using CSR3, one fails.

Is it reasonable to expect the same results. This would then lead to issues with my CSR3 implementation.

Alexander_K_Intel2 · ‎04-12-2018

Hi,

As i understand you see significant difference between transpose solve of matrix and non-transpose solve of transpose matrix? If so then can you provide example of this matrix

Thanks,

Alex

j_s · ‎04-12-2018

I can. Is there a matrix dumping routine I can use?

j_s · ‎04-12-2018

Attached is the first iteration for the simulation that fails.

csc.py contains the transposed CSC matrix

csr.py contains the CSR matrix

foo.py imports the 2 files and compares them.

below are the options used to the solver

iparm[0] = 1; /* No solver default */
// iparm[1] = 0; /* Fill-in reordering from METIS */
iparm[1] = 2; /* Fill-in reordering from METIS */
iparm[3] = 0; /* No iterative-direct algorithm */
iparm[4] = 0; /* No user fill-in reducing permutation */
iparm[5] = 0; /* Write solution into x */
iparm[6] = 0; /* Not in use */
iparm[7] = 0; /* Max numbers of iterative refinement steps */
iparm[8] = 0; /* Not in use */
iparm[9] = 13; /* Perturb the pivot elements with 1E-13 */
iparm[10] = 1; /* Use nonsymmetric permutation and scaling MPS */
iparm[11] = 0; /* Conjugate transposed/transpose solve */
iparm[12] = 1; /* Maximum weighted matching algorithm is switched-on (default for non-symmetric) */
iparm[13] = 0; /* Output: Number of perturbed pivots */
iparm[14] = 0; /* Not in use */
iparm[15] = 0; /* Not in use */
iparm[16] = 0; /* Not in use */
iparm[17] = -1; /* Output: Number of nonzeros in the factor LU */
iparm[18] = -1; /* Output: Mflops for LU factorization */
iparm[19] = 0; /* Output: Numbers of CG Iterations */
iparm[26] = 0; /* matrix checker */
iparm[27] = 0; /* double precision */
iparm[34] = 1; /* 0 based solve */

mtype = 11; /* Real unsymmetric matrix */
maxfct = 1; /* Maximum number of numerical factorizations. */
mnum = 1; /* Which factorization to use. */
msglvl = 0; /* Print statistical information */
error = 0; /* Initialize error flag */
phase = 0;
ddum = 0.0;
idum = 0;
n = numeqn;
nrhs = 1;

j_s · ‎04-20-2018

Hello Alex,

Did you have a chance to look at my example?

mecej4 · ‎04-23-2018

I think that the large size of the matrix and the fact that you embedded the matrices in python source code both tend to discourage people from looking at the issue.

I can give you some findings that may help. I have checked that the two files csr.py and csc.py do contain one matrix each, and that one matrix is the transpose of the other (all entries are such that their relative difference is < 10^-15).

I do not know what parameters you gave Pardiso or what the RHS vector b was in your work; I solved the equations using b = 1, and found that the the first ten and last ten of the resulting x values ranged in magnitude from 10¹⁵ to 10³⁹. The solutions from (i) the data in csr.py and (ii) the data in csc.py (with transposition of the matrix before calling Pardiso) agreed to about 4 significant digits. This indicates that you may have a scaling problem or, worse, a matrix that is poorly conditioned. These remarks need to be revised after running with the correct b vector instead of all unit values, as I did.

j_s · ‎04-27-2018

Thanks for looking at the matrix. I did not know what format to submit the matrix in to the forum, so I wrote this out from C++ to python.

This simulation is solving a problem in a cylindrical coordinate system, meaning that vertices toward the outside of the structure are much more heavily weighted than the interior. Most of my other test cases are in a regular coordinate system, and have no problems with the CSR matrix. I am still curious as to why the transpose solve of the CSC works, but the solve of the CSR solve does not.

Are there any settings I should try in my iparm array above?