DGEQRF routine in MKL

gupta_1978 · ‎09-04-2008

I am doing some tests on a multicore processor using the MKL library. For that I am testing MKL's DGEQRF routine with a very small program. My problem is with the timings.

When I use one OpenMP thread I get:
real 3m36.934s
user 3m30.713s
sys 0m2.300s

When using 2 threads I get:

real 0m27.402s
user 0m42.727s
sys 0m1.032s

The one-thread run is about 8 times slower in wall-clock time (216.9 s vs. 27.4 s), while I would expect a factor of about 2. Could someone please explain this to me?
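To make the comparison concrete, here is a minimal sketch that converts the `time` fields quoted above into seconds and forms the ratios. The helper names `time_field_to_seconds` and `time_ratio` are made up for this illustration. With the numbers above, the wall-clock ratio comes out at about 7.9 and the user-CPU ratio at about 4.9. Since the total CPU time also drops, the difference is probably not explained by parallelism alone.

```c
#include <stdio.h>

/* Convert a field printed by `time`, e.g. "3m36.934s", to seconds.
   Returns -1.0 if the field does not match the expected pattern. */
double time_field_to_seconds(const char *field)
{
    int minutes = 0;
    double seconds = 0.0;
    if (sscanf(field, "%dm%lfs", &minutes, &seconds) != 2)
        return -1.0;
    return minutes * 60.0 + seconds;
}

/* Ratio of the one-thread time to the two-thread time. */
double time_ratio(const char *one_thread, const char *two_threads)
{
    return time_field_to_seconds(one_thread) /
           time_field_to_seconds(two_threads);
}
```

For example, `time_ratio("3m36.934s", "0m27.402s")` gives the wall-clock ratio and `time_ratio("3m30.713s", "0m42.727s")` the user-time ratio.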

Compile options:

gcc-4.2 -I/opt/intel/mkl/10.0.4.023/include/ -fPIC text.c -fopenmp -L/opt/intel/mkl/10.0.4.023/lib/32 -lmkl_gf -liomp5 /opt/intel/mkl/10.0.4.023/lib/32/libmkl_lapack.a /opt/intel/mkl/10.0.4.023/lib/32/libmkl_gnu_thread.a /opt/intel/mkl/10.0.4.023/lib/32/libmkl_cdft_core.a -ldl -lm

Program file:

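#include <stdlib.h>   /* malloc, rand, RAND_MAX */
#include <math.h>     /* floor */

/* Fortran-style LAPACK entry point provided by MKL */
extern void dgeqrf_(int *m, int *n, double *a, int *lda, double *tau,
                    double *work, int *lwork, int *info);

int main(int argc, char *argv[])
{
    int M, N, j, LWORK, info;
    double *val, *tau, *work;

    M = 200000;
    N = 500;

    val = malloc(M*N*sizeof(double));
    tau = malloc(M*sizeof(double));
    work = malloc(M*sizeof(double));

    /* fill the M-by-N matrix with random whole numbers between -100 and 100 */
    for (j = 0; j < M*N; j++) {
        val[j] = floor(100*( ( (double)rand()*2.0/RAND_MAX )-1.0 ) );
    }
    LWORK = N;   /* length of the work array passed to dgeqrf */
    dgeqrf_(&M, &N, val, &M, tau, work, &LWORK, &info);
    return 0;
}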

thanks a lot
Alok

TimP · ‎09-04-2008

If you are using a smaller than recommended work array, it may slow down the one thread case the most.
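As a rough sketch of how a larger work array could be obtained: LAPACK routines such as DGEQRF accept lwork = -1 as a workspace query and return their preferred size in work[0]. The helper names below (`geqrf_fn`, `query_lwork`, `qr_with_queried_work`) are made up for illustration. `fake_geqrf` is a hypothetical stand-in, with an assumed block size of 32, so that the sketch compiles without MKL. It is not what MKL actually returns.

```c
#include <stdlib.h>

/* Signature of the Fortran-style QR routine (dgeqrf_ in the program above). */
typedef void (*geqrf_fn)(int *m, int *n, double *a, int *lda, double *tau,
                         double *work, int *lwork, int *info);

/* HYPOTHETICAL stand-in for dgeqrf_: answers a workspace query with
   n * 32 (assumed block size) and otherwise does nothing. */
void fake_geqrf(int *m, int *n, double *a, int *lda, double *tau,
                double *work, int *lwork, int *info)
{
    (void)m; (void)a; (void)lda; (void)tau;
    if (*lwork == -1)
        work[0] = (double)(*n) * 32.0;
    *info = 0;
}

/* Workspace query: lwork = -1 asks the routine for its preferred size.
   Returns that size, or -1 if the routine reported an error. */
int query_lwork(geqrf_fn geqrf, int m, int n)
{
    double a_dummy = 0.0, tau_dummy = 0.0, wkopt = 0.0;
    int lwork = -1, info = 0;
    geqrf(&m, &n, &a_dummy, &m, &tau_dummy, &wkopt, &lwork, &info);
    return (info == 0) ? (int)wkopt : -1;
}

/* Query the size, allocate the work array, then factor a (m-by-n, lda = m).
   Returns info from the factorization, or -1 on query/allocation failure. */
int qr_with_queried_work(geqrf_fn geqrf, int m, int n, double *a, double *tau)
{
    int info = 0;
    double *work;
    int lwork = query_lwork(geqrf, m, n);
    if (lwork < 1)
        return -1;
    work = malloc((size_t)lwork * sizeof *work);
    if (work == NULL)
        return -1;
    geqrf(&m, &n, a, &m, tau, work, &lwork, &info);
    free(work);
    return info;
}
```

When linking against MKL, one would pass `dgeqrf_` in place of `fake_geqrf`, for example `qr_with_queried_work(dgeqrf_, M, N, val, tau)`.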

gupta_1978 · ‎09-04-2008

Thank you, Tim! But I do not understand this. I tried many combinations of array sizes, and with one thread I still get very bad performance. Otherwise, I think the one-thread case is calling some other routine.

The combinations I tried:

            time, 2 threads    time, 1 thread
M=N=2000    1.8 sec            10 sec
M=N=3000    5 sec              60 sec
M=N=4000    10 sec             136 sec
M=N=5000    24 sec             278 sec
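One way to read the timings above is to convert them into a floating-point rate. The sketch below assumes the usual operation count for Householder QR of an m-by-n matrix with m >= n, roughly 2*m*n^2 - (2/3)*n^3 flops. The function names `qr_flops` and `qr_gflops` are made up for this illustration. With that assumption, the one-thread runs come out at about 0.6 to 1.1 GFLOP/s and the two-thread runs at about 6 to 8.5 GFLOP/s.

```c
/* Approximate flop count of Householder QR for an m-by-n matrix, m >= n:
   2*m*n^2 - (2/3)*n^3. This is an estimate, not a measured count. */
double qr_flops(double m, double n)
{
    return 2.0 * m * n * n - (2.0 / 3.0) * n * n * n;
}

/* Achieved rate in GFLOP/s for a factorization that took `seconds`. */
double qr_gflops(double m, double n, double seconds)
{
    return qr_flops(m, n) / seconds / 1e9;
}
```

For example, `qr_gflops(5000, 5000, 278)` is the one-thread rate for M=N=5000 and `qr_gflops(5000, 5000, 24)` the two-thread rate.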

I am not able to understand the logic. I am using version 10.0.4.023, and with one thread I get the same time as I get from LAPACK.

thanks a lot
Alok

TimP · ‎09-04-2008

However, I can't see that you have followed the advice in the doc:
http://www.intel.com/software/products/mkl/docs/webhelp/lse/functn_geqrf.html
Many people actually want to exaggerate the gain for threading; I don't see that you have specified what you see as a problem.