wrong answer from pdgemv

oguren · ‎02-10-2010

I want to find y =Ax with PDGEMV. Igenerate A and X global and distrubute it to 2x2 grid in row major.my code is below

Simply :A is like that

A=[147 10 13;3 69 12 15;5811 14 17;7 10 13 16 19;9 12 15 18 21] and x=[1;1;0;0;1]

I compile the code like this

>>mpiicc -w -c dot.c

>>mpiicc -o dot dot.o -I/RS/progs/intel/mkl/10.1.1.019/include -L/RS/progs/intel/mkl/10.1.1.019/lib/em64t -lmkl_scalapack_lp64 -lmkl_blacs_intelmpi_lp64 -lmkl_lapack -lmkl_intel_lp64 -lmkl_intel_thread -lmkl_lapack -lmkl_core -liomp5 -lpthread
>> bsub -q my.q -n 4 -a intelmpi -e error.txt -o out2.txt mpirun.lsf ./dot

Result is wrong, it gives

rank=0 13.00
rank=0 22.00
rank=0 49.00
rank=2 32.00
rank=2 34.00

I think I distribute A andX to the processors correctly. Can you help what is wrong. I really can't figure out what is wrong.I try different combination for desc funcitons and "N" and "T" options for pdgemv. No true result.

My code is below.

#include
#include
#include
#include "mpi.h"

#define AA(i,j) AA[(i)*M+(j)]

int main(int argc, char **argv) {
int i, j, k;
/************ MPI ***************************/
int myrank_mpi, nprocs_mpi;
MPI_Init( &argc, &argv);
MPI_Comm_rank(MPI_COMM_WORLD, &myrank_mpi);
MPI_Comm_size(MPI_COMM_WORLD, &nprocs_mpi);
/************ BLACS ***************************/
int ictxt, nprow, npcol, myrow, mycol,nb;
int info,itemp;
int ZERO=0,ONE=1;
nprow = 2; npcol = 2; nb =2;
Cblacs_pinfo( &myrank_mpi, &nprocs_mpi ) ;
Cblacs_get( -1, 0, &ictxt );
Cblacs_gridinit( &ictxt, "Row", nprow, npcol );
Cblacs_gridinfo( ictxt, &nprow, &npcol, &myrow, &mycol );
int M=5;
double *AA = (double*) malloc(M*M*sizeof(double));
for(i=0;i for(j=0;j AA[i*M+j]=(2*i+3*j+1);

double *X = (double*) malloc(M*sizeof(double));
X[0]=1;X[1]=1;X[2]=0;X[3]=0;X[4]=1;

int descA[9],descx[9],descy[9];
int mA = numroc_( &M, &nb, &myrow, &ZERO, &nprow );
int nA = numroc_( &M, &nb, &mycol, &ZERO, &npcol );
int nx = numroc_( &M, &nb, &mycol, &ZERO, &npcol );
int my = numroc_( &M, &nb, &myrow, &ZERO, &nprow );
descinit_(descA, &M, &M, &nb, &nb, &ZERO, &ZERO, &ictxt, &mA, &info);
descinit_(descx, &ONE, &M, &ONE, &nb, &ZERO, &ZERO, &ictxt, &ONE, &info);

descinit_(descy, &M, &ONE, &nb, &ONE, &ZERO, &ZERO, &ictxt, &my, &info);
double *x = (double*) malloc(nx*sizeof(double));
double *y = (double*) calloc(my,sizeof(double));
double *A = (double*) malloc(mA*nA*sizeof(double));
int sat,sut;
for(i=0;i

for(j=0;j sat= (myrow*nb)+i+(i/nb)*nb;
sut= (mycol*nb)+j+(j/nb)*nb;
A[i*nA+j]=AA(sat,sut);
}

for(i=0;i sut= (mycol*nb)+i+(i/nb)*nb;
x=X[sut];
}

double alpha = 1.0; double beta = 0.0;
pdgemv_("N",&M,&M,α,A,&ONE,&ONE,descA,x,&ONE,&ONE,descx,&ONE,β,y,&ONE,&ONE,descy,&ONE);

Cblacs_barrier(ictxt,"A");
for(i=0;i printf("rank=%d %.2f \\n", myrank_mpi,y);
Cblacs_gridexit( 0 );
MPI_Finalize();
return 0;
}

Alex_Kosenkov · ‎02-11-2010

Hello, Oguren

Try to use col-major - not row-major. I hope it will resolve your problem.

Best regards,

Alexander

Ivan_Latkin__Intel_ · ‎02-11-2010

Hello, Oguren!

Vectors x and y both should becolumn-vectors. Try to usefor x:

int nx = numroc_( &M, &nb, &myrow, &ZERO, &nprow );

descinit_(descx, &M, &ONE, &nb, &ONE, &ZERO, &ZERO, &nx, &ictxt);

and

for(i=0;i sat= (myrow*nb)+i+(i/nb)*nb;
x=X[sat];
}

Best regards,

Ivan Latkin

oguren · ‎02-11-2010

Hello,

thank for you help, but it still gives wrong result.I'm sending all my code below .I reallyfind no way out. I can't understand what is wrong. please help me...

#include
#include
#include
#include "mpi.h"
#define AA(i,j) AA[(i)*M+(j)]
#define max(a,b) ((a) > (b) ? (a) : (b))
#define abs(a) max((a),-(a))

int main(int argc, char **argv) {
int i, j, k;
double MPIt1, MPIt2, MPIelapsed;
/************ MPI ***************************/
int myrank_mpi, nprocs_mpi;
MPI_Init( &argc, &argv);
MPI_Comm_rank(MPI_COMM_WORLD, &myrank_mpi);
MPI_Comm_size(MPI_COMM_WORLD, &nprocs_mpi);
/************ BLACS ***************************/
int ictxt, nprow, npcol, myrow, mycol,nb;
int info,itemp;
int ZERO=0,ONE=1;
nprow = 2; npcol = 2; nb =2;
Cblacs_pinfo( &myrank_mpi, &nprocs_mpi ) ;
Cblacs_get( -1, 0, &ictxt );
Cblacs_gridinit( &ictxt, "R", nprow, npcol );
Cblacs_gridinfo( ictxt, &nprow, &npcol, &myrow, &mycol );
int M=5;
double *AA = (double*) malloc(M*M*sizeof(double));
for(i=0;i for(j=0;j AA[i*M+j]=i*M+j+1;

double *X = (double*) malloc(M*sizeof(double));
X[0]=1;X[1]=0;X[2]=1;X[3]=0;X[4]=1;

int descA[9],descx[9],descy[9];
int mA = numroc_( &M, &nb, &myrow, &ZERO, &nprow );
int nA = numroc_( &M, &nb, &mycol, &ZERO, &npcol );
int nx = numroc_( &M, &nb, &myrow, &ZERO, &nprow );
int my = numroc_( &M, &nb, &myrow, &ZERO, &nprow );
descinit_(descA, &M, &M, &nb, &nb, &ZERO, &ZERO, &ictxt, &mA, &info);
descinit_(descx, &M, &ONE, &nb, &ONE, &ZERO, &ZERO, &ictxt, &nx, &info);
descinit_(descy, &M, &ONE, &nb, &ONE, &ZERO, &ZERO, &ictxt, &my, &info);

double *x = (double*) calloc(nx,sizeof(double));
double *y = (double*) calloc(my,sizeof(double));
double *A = (double*) calloc(mA*nA,sizeof(double));

Cblacs_barrier(ictxt,"A");
printf("\n");
int sat,sut;
for(i=0;i for(j=0;j sat= (myrow*nb)+i+(i/nb)*nb;
sut= (mycol*nb)+j+(j/nb)*nb;
A[i*nA+j]=AA(sat,sut);
}

for(i=0;i sat= (myrow*nb)+i+(i/nb)*nb;
x=X[sat];
}

Cblacs_barrier(ictxt,"A");
double alpha = 1.0; double beta = 0.0;
pdgemv_("N",&M,&M,α,A,&ONE,&ONE,descA,x,&ONE,&ONE,descx,&ONE,β,y,&ONE,&ONE,descy,&ONE);
Cblacs_barrier(ictxt,"A");
for(i=0;iprintf("rank=%d y[%d]=%.2f \n", myrank_mpi,i,y);
Cblacs_gridexit( 0 );
MPI_Finalize();
return 0;
}

true result from matlab like that:

9
24
39
54
69

But pdgemv gives:

rank=0 y[0]=25.00
rank=0 y[1]=28.00
rank=0 y[2]=38.00
rank=2 y[0]=41.00
rank=2 y[1]=46.00
rank=1 y[0]=0.00
rank=1 y[1]=0.00
rank=1 y[2]=0.00
rank=3 y[0]=0.00
rank=3 y[1]=0.00

Best regards,

guren

Ying_H_Intel · ‎02-11-2010

Hello Guren,

It should be done for all matrices which were fed inpdgemv_ this way the sub matrix will be stored in col-major order.

Regarding the code:

A[i*nA+j]=AA(sat,sut);

should be changed into

A[j*mA+i]=AA(sat,sut);

Then you will get right result.

Best Regards,

Ying

oguren · ‎02-15-2010

Hi,

Thank you very much, now it works. To understand what fortran does is very difficult for aC user

Best Regards