Pardiso and DSS forward solve problem/question

Atte_Moilanen · ‎09-01-2010

Hi!

I wonder if anyone can help with or verify real problem with the following, which has me stumped and questioning my sanity. Question is about what could be (hope not)an error in the Pardiso and DSS forward solve routine, seen e.g. in the following problem:

Consider system Ax=b, where the numerical values are

3 1 0 1
1 2 2 x = 2
0 2 3 3

This system has solution (0.66667, -1, 1.66667), which is correctly found by Pardiso and DSS.

However, something funny appears to happen with the forward solve mode: According to the MKL 10.2 manual, Pardiso and DSS do for sparse symmetric positive definite systems an A=LL^T Cholesky factorization. The correct solution for L (verified by Mathematica, and which can be verified manually as well), is

sqrt(3) 0 0
L = 1/sqrt(3) sqrt(5/3) 0
0 2*sqrt(3/5) sqrt(3/5)

The pardiso and DSS step 331 forward substitution should solve Ly=b. Fromabove one sees by substitution that sqrt(3)x1=1 => x1 = 1/sqrt(3), and so on, giving solution for y, y=(0.57735, 1.29099, 1.29099).

Instead of giving this result, Pardiso and DSS both give (0.57735, -0.57735, 1.73205).

From several things we tried, I should note the following:
- Iterative refinement was turned off
-IF the upper right hand corner element is entered as zero (logically present making it a full upper triangle, but with one element with value zero), both pardiso and DSS return the correct result. This strongly suggests that DSS operates wrong with the sparse structure.
- Result did not seem to depend on whether full solve has been done before fwd solve or not.
- It occurs also with Pardiso default parameters
- It occurred for another researcher who independently implemented the problem using DSS, suggesting that at the very least documentation is misleading
- The same error occurred with larger matrixes as well, which first prompted us to try verify operation with the small matrix above. (The full solve works fine.)
- The computer is pretty typical 2-processor Dell laptop, running 64bit Windows 7, MKL 10.2, and the Intel C++ compiler under QT creator.

I have pasted the relevant bit of code below. It is effectively the symmetric positive definite matrix sample code from ref manual section C, with the matrix replaced by the matrix above, and solve put to fwd solve without iterative refinement.

A solution to this problem or even verification of error condition would be greatly appreciated. We really need both forward and backward solutions working in our application, and it would be great to be able to proceed. We cannot work with dense Cholesky as the matrix dimension may go up to millions.

Additionally, I'd add a vote to earlier mails asking for Cholesky factors being made visible from Pardiso/DSS. The decomposition has several uses in computation. At the very least, Pardiso could report the determinant, which can be computed from the diagonal of the L matrix (DSS does this fine, = 3 for the matrix above).

Kind regards - thanks for any help - and sorry for the slightly wordy description above!

Atte Moilanen, University of Helsinki, Finland

------------------ the relevant function -------------------------

#include
#include
#include
#include

#include "mkl_pardiso.h"
#include "mkl_types.h"
#include "mkl_dss.h"

/***************************************************************/
int MKL_DSS_tst() {
#define NROWS 3
#define NCOLS 3
#define NNONZEROS 5
#define NRHS 1
#if defined(MKL_ILP64)
#define MKL_INT long long
#else
#define MKL_INT int
#endif
static const MKL_INT nRows = NROWS ;
static const MKL_INT nCols = NCOLS ;
static const MKL_INT nNonZeros = NNONZEROS ;
static const MKL_INT nRhs = NRHS ;
static _INTEGER_t rowIndex[NROWS+1] = { 1, 3, 5, 6};
static _INTEGER_t columns[NNONZEROS] = { 1, 2, 2, 3, 3 };
static _DOUBLE_PRECISION_t values[NNONZEROS] = { 3,1,2,2,3 };
static _DOUBLE_PRECISION_t rhs[NCOLS] = { 1, 2, 3};
MKL_INT i;
/* Allocate storage for the solver handle and the right-hand side. */
_DOUBLE_PRECISION_t solValues[NROWS];
_MKL_DSS_HANDLE_t handle;
_INTEGER_t error;
_CHARACTER_STR_t statIn[] = "determinant";
_DOUBLE_PRECISION_t statOut[5];
MKL_INT opt = MKL_DSS_DEFAULTS;
MKL_INT sol_opt = MKL_DSS_FORWARD_SOLVE+MKL_DSS_REFINEMENT_OFF;
MKL_INT sym = MKL_DSS_SYMMETRIC;
MKL_INT type = MKL_DSS_POSITIVE_DEFINITE;
/* ---------------------*/
/* Initialize the solver */
/* ---------------------*/
error = dss_create(handle, opt );
if ( error != MKL_DSS_SUCCESS ) goto printError;
/* -------------------------------------------*/
/* Define the non-zero structure of the matrix */
/* -------------------------------------------*/
error = dss_define_structure(
handle, sym, rowIndex, nRows, nCols,
columns, nNonZeros );
if ( error != MKL_DSS_SUCCESS ) goto printError;
/* ------------------*/
/* Reorder the matrix */
/* ------------------*/
error = dss_reorder(handle, opt, 0);
if ( error != MKL_DSS_SUCCESS ) goto printError;
/* ------------------*/
/* Factor the matrix */
/* ------------------*/
error = dss_factor_real( handle, type, values );
if ( error != MKL_DSS_SUCCESS ) goto printError;
/* ------------------------*/
/* Get the solution vector */
/* ------------------------*/
error = dss_solve_real( handle, sol_opt, rhs, nRhs, solValues );
if ( error != MKL_DSS_SUCCESS ) goto printError;
/* ------------------------*/
/* Get the determinant (not for a diagonal matrix) */
/*--------------------------*/
if ( nRows < nNonZeros ) {
error = dss_statistics(handle, opt, statIn, statOut);
if ( error != MKL_DSS_SUCCESS ) goto printError;
/*-------------------------*/
/* print determinant*/
/*-------------------------*/
printf(" determinant power is %g \\n", statOut[0]);
printf(" determinant base is %g \\n", statOut[1]);
printf(" Determinant is %g \\n", (pow(10.0,statOut[0]))*statOut[1]);
}
/* --------------------------*/
/* Deallocate solver storage */
/* --------------------------*/
error = dss_delete( handle, opt );
if ( error != MKL_DSS_SUCCESS ) goto printError;
/* ----------------------*/
/* Print solution vector */
/* ----------------------*/
printf(" Solution array: ");
for(i = 0; i< nCols; i++)
printf(" %g", solValues );
printf("\\n");
exit(0);
printError:
printf("Solver returned error code %d\\n", error);
exit(1);
}

Alexander_K_Intel2 · ‎09-01-2010

Hi,

Obviously PARDISO and DSS prepare Cholesky decomposition not of initial matrix A but of matrix PAP^T, where P - permutation matrix. That could be a reason why the result is the same for all package butintermediate solutions are different. If you want to turn off reordering you need to change iparam[4] to 1 and fill perm array.

With best regards,

Alexander Kalinkin

Atte_Moilanen · ‎09-02-2010

Hi!

And thanks for clarifying the problem; it is indeed possible to get the correct answer to the problem above by providing an user-defined permutation vector 1,2,3,... to the DSS_reorder routine.

Nevertheless, I dare suggest that just as Pardiso/DSS decodes the internal permutation before outputting the final L^Tx=y solution, the permutation should be decoded before outputting the forward solution Ly=bas well, which is useful in itself. Note also, that in the small example above, the wrong solution (using internal permutation) does not have the correct elements permuted, it actually has different elements (0.57735, -0.57735, 1.73205) vs (0.57, 1.29, 1.29), which is a bit odd, and which prevents decode from the DSS permutation, which one can access.

It seems that giving the permuation yourself has two potential drawbacks
- if no permutation perm={1,2,3,...n} is given, then performance is lost because fill-in reducing permutations are not done
- one could probably also compute a good permutation in advance (using e.g. DSS_reorder with MKL_DSS_GET_ORDER)andutilize that permutation in ones own routines, with the expense of complicating code.

Overall, would be great if Pardiso/DSS did the same re permuations for both forward and backward solution phases.

Anyways - many thanks for solving the underlying issue so quickly!

Atte

mecej4 · ‎09-02-2010

I support Atte's comments on the design of the user interfaces to DSS and PARDISO.

If the documented calling protocol allows asking for intermediate results (forward elimination results), the routines should work correctly whether or not permutations are performed. In particular, when the permutation vector is generated and maintained internally by the subroutine and remains inaccessible to the user, it is the responsibility of the subroutine to use the internally generated permutation throughout the entire task.

If, for some reason related to implementation details, internal generation of a permutation is incompatible with producing a half-solution (forward elimination), that fact should be stated in the documentation.

Atte: I am curious as to why you want to bother with the forward and backward solution steps. If there is a use for the half-solution beyond simply checking as you did, I'd like to learn about it.

Atte_Moilanen · ‎09-02-2010

Hi!

What we need this for is the following: We have a heavy numerical Bayesian estimation problem which involves dealing with spatially correlated phenomena.

Inone part of the computations, we need to generate random vectors in the dimension of up to millions, andthe correlation stucture of this vector comes from the canonical multinormal distribution, described by a large symmetric positive definite (and sparse) matrix.

The algorithm for generating from the CMNR distribution requires the use of the sparse Cholesky decomposition, and in particular, both L^Tx=b and Lx=b forms of the solution. Thus,we have an application that requires both phases of the sparse Cholesky solve, and also the determinant of the Cholesky decomposition, which can becomputed from the diagonal elements of the LLT factorization. At this point it looks like the DSS will do it, but Pardiso not, as I can't find the determinant from pardiso, and the trick of dropping the permutation (above mail) has so far worked for DSS but not Pardiso.

Anyways, we have a real interest in solving this; not just trying to be a nuissance to the support team!

Cheers,

Atte

Alexander_K_Intel2 · ‎09-02-2010

Hi Atte,

Let me clarify my idea: for example let one want to solve next systen of equations by Cholesky decomposition:

4 1 x_1 1

1 2 x_2 = 1

If you prepare LLT decomposition without reordering you get next system:

2 0 | 2 0.5 | x_1 1

0.5 sqrt(3)/2 | 0 sqrt(3)/2 | x_2 = 1

but if change first and second string/column you get next system:

sqrt(2) 0 | sqrt(2) sqrt(0.5) | x_2 1

sqrt(0.5) sqrt(7)/2 | 0 sqrt(7)/2 | x_1 = 1

It clear that finil solution will be the same (the systems are equal) but intermidiate are different.

So the intermidiate solution depend from way of permutation, thats why I asked to provide PARDISO owner's permutation.

About diagonal elements from LU decomposition: That could be done by next trick:

1. change type of system from 2 to -2 (symetric nonpositive define) than PARDISO prepare LDL^T decomposition of permutated matrix.

2. solve only diagonal step in solving step (332 phase).

3. The i-th component of solution is inverse diagonal element.

With best regards,

Alexander Kalinkin

Sergey_K_Intel1 · ‎09-03-2010

Hi Atte,

There is another workaround for your problem. As we know, PARDISO can return the internal permutation vector. So we just need to call PARDISO for phase=11 with iparm(5)=2. DSS can also return the internal permutation if dss_reorder is called with the option MKL_DSS_GET_ORDER. Thenwe can form another compressed sparse row matrixP*A*P^T where P is formedwith the help of this permutation vector. The next step is toredefinethe permuation arrayperm(i)=i andcallPARDISO for solving newsystemP*A*P^T x =P*f.

If we do all mentioned above, PARDISO consumes the same amount memory andsystem Lxor L^Tx are solved correctly. Sure this way has got computational overhead for reordering initial matrix, solution vector and right hand side, butthe overheadshould be small compared to factorization and solution. As I mentioned above memory consumption for PARDISO will be the same.

Hope it helps
All the best
Sergey

Daniel_K_1 · ‎10-24-2013

Can you please tell me where in this forum I can ask question about how to use intel mkl under qt creator with mingw???

mecej4 · ‎10-27-2013

Daniel,

Is there a reason to believe that calling MKL routines from Qt-creator with MingW is different from calling any library routines whatsoever from Qt-creator with MingW? If not, your question should be addressed to a Qt or MingW forum.