Single precision for Complex Matrices

programmer85 · ‎05-13-2010

I made a comment to your article:
http://software.intel.com/en-us/articles/single-precision-real-and-complex-data-support-in-pardiso/
... but it was deleted. I have also write about them in :
http://software.intel.com/en-us/forums/showthread.php?t=67789
but nowone replied.

In this case I am creating a new thread:

I am trying to use single precision for complex matrices.

I did not have any problems with programming double pressision using doublecomplex structure (iparm(28) = 0). But when I tried to use single precision it did not work. Below I will attach my code - maybe you find mistakes

void setPardisoCf_9(PardisoCf_struct * ps, Matrix_MKLVector * mkl_vec, int* ria, int* rja, double * raRe, double * raI, int varN, int varNN ){
int i;

ps->n = varN;

ps->fa = (doublecomplex*)malloc(sizeof(doublecomplex*)*2*varNN);
ps->fx = (doublecomplex*)malloc(sizeof(doublecomplex*)*2*varN);
ps->fb = (doublecomplex*)malloc(sizeof(doublecomplex*)*2*varN);
ps->ria = (int*)malloc(sizeof(int)*varNN);
ps->rja = (int*)malloc(sizeof(int)*(varN+1));

ps->mtype = 6;// complex symmetic
ps->nrhs = 1;

/* -------------------------------------------------------------------- */
/* .. Setup Pardiso control parameters. */
/* -------------------------------------------------------------------- */
for ( i = 0; i < 64; i++) {
ps->iparm = 0;
}
ps->iparm[0] = 1; /* No solver default */
ps->iparm[1] = 2; /* Fill-in reordering from METIS */

/* Numbers of processors, value of OMP_NUM_THREADS */
ps->iparm[2] = mkl_get_max_threads();
ps->iparm[3] = 0; /* No iterative-direct algorithm */
ps->iparm[4] = 0; /* No user fill-in reducing permutation */
ps->iparm[5] = 0; /* Write solution into x */
ps->iparm[6] = 0; /* Not in use */
ps->iparm[7] = 2; /* Max numbers of iterative refinement steps */
ps->iparm[8] = 0; /* Not in use */
ps->iparm[9] = 8; /* Perturb the pivot elements with 1E-13 */
ps->iparm[10] = 1; /* Use nonsymmetric permutation and scaling MPS */
ps->iparm[11] = 0; /* Not in use */
ps->iparm[12] = 0; /* 1 Maximum weighted matching algorithm is switched-off (default for symmetric). Try iparm[12] = 1 in case of inappropriate accuracy */
ps->iparm[13] = 0; /* Output: Number of perturbed pivots */
ps->iparm[14] = 0; /* Not in use */
ps->iparm[15] = 0; /* Not in use */
ps->iparm[16] = 0; /* Not in use */
ps->iparm[17] = -1; /* Output: Number of nonzeros in the factor LU */
ps->iparm[18] = -1; /* Output: Mflops for LU factorization */
ps->iparm[19] = 0; /* Output: Numbers of CG Iterations */
ps->iparm[27] = 1; /* float */

ps->maxfct = 1; /* Maximum number of numerical factorizations. */
ps->mnum = 1; /* Which factorization to use. */
ps->msglvl = 1; /* Print statistical information in file */
ps->error = 0; /* Initialize error flag */

for (i = 0; i < varNN; i++) {
ps->fa.re = raRe;//(double)ra;
ps->fa.i = raI;//(double)ra;
ps->ria = ria;
}

for (i = 0; i < varN+1; i++) {
ps->rja = rja;
if(i ps->fx.re = 0;
ps->fx.i = 0;
ps->fb.re = 0;
ps->fb.i = 0;
}

}

ps->phase = 11;
PARDISO (ps->pt, &ps->maxfct, &ps->mnum, &ps->mtype, &ps->phase,
&varN, ps->fa, ps->rja, ps->ria, &ps->idum, &ps->nrhs,
ps->iparm, &ps->msglvl, &ps->ddum, &ps->ddum, &ps->error);

if (ps->error != 0) {
printf("\\nERROR during symbolic factorization: %d", ps->error);
getch();
exit(1);
}

And so far there are now errors during execution. The errors occurs when I try to solve:

ps->phase = 33;
ps->iparm[7] = 0; /* Max numbers of iterative refinement steps. */

PARDISO (ps->pt, &ps->maxfct, &ps->mnum, &ps->mtype, &ps->phase,
&ps->n, ps->fa, ps->rja, ps->ria, &ps->idum, &ps->nrhs,
ps->iparm, &ps->msglvl, ps->fb, ps->fx, &ps->error);

What surprise me is the fact that if I execute above procedures with ps->iparm[27] = 0; /* double precision */ everything is fine... . should I change any other iparm ?

Sergey_Solovev__Inte · ‎05-13-2010

If you use single precision (ps->iparm[27] = 1), then values ps->fa.re, ps->fa.i,ps->fx.re, ps->fx.i, ps->fb.re = 0, ps->fb.i = 0 should be float.

You can create new structure
typedef struct{
float re;
float i;}
floatcomplex;
and replace type "doublecomplex" to "floatcomplex" in the example.

programmer85 · ‎05-14-2010

thx - it helped. I achieved about 10% speedup between execution in single and double precision, and hhonestly I expected much more...

Sergey - do you have any test that compares execution in single and double precision ?
Programmer

Sergey_Solovev__Inte · ‎05-16-2010

The speedup depends on many factors type of matrix, size of matrix, architecture ant etc. But we dont have tests which compare single precision with double one.