topic MKL Threads- BLAS level 2 routines in Intel® oneAPI Math Kernel Library

MKL Threads- BLAS level 2 routines

kris_nagar — Fri, 07 Oct 2011 06:13:55 GMT

Multithreading does not seem to work in my program, where I use the mkl_dcsrmv routine to multiply large sparse matrices. I tried calling "mkl_set_num_threads(num_threads)" to set the number of threads. The program gives correct output, but the performance does not change as I change the number of threads.

According to the MKL manual, MKL versions later than 10.0 should use the maximum possible number of threads on the processor, but that does not seem to be the case.

Platform: Intel Xeon E5520 (4 cores/8 threads).

#include "omp.h"

...

mkl_dcsrmv("N", &M, &N, &alpha, "G**C", val, (int *)col, (int *)ptr, (int *)ptre, vec_aligned, &alpha, y_vec);

...

Compile:

icc -mkl -I /opt/intel/Compiler/11.1/069/mkl/include/ -L$(MKLROOT)/lib/intel64 -lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core -openmp -lpthread -o run_mkl

Is mkl_dcsrmv a threaded routine?

Gennady_F_Intel — Fri, 07 Oct 2011 12:36:37 GMT

1) Yes, this routine is threaded internally, but the main question is: what scalability numbers are you expecting to see?

In most cases, performance for sparse matrices is limited by cache and memory bandwidth.

2) Please see here how to link MKL properly.
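To make the bandwidth point concrete, here is a minimal reference sketch in C of what a general CSR matrix-vector product of this kind computes, y = alpha*A*x + beta*y, with zero-based indices. The function name csr_mv_ref and its parameter names are made up for illustration; this is not MKL's implementation, only a plain loop with the same four-array layout (values, column indices, row begin, row end).

```c
#include <assert.h>

/* Reference (unoptimized) y = alpha*A*x + beta*y for a general CSR matrix
 * with zero-based indices. Entries of row i live at positions
 * row_begin[i] .. row_end[i]-1 of val[] and col[].
 * Hypothetical helper for illustration, not MKL code. */
void csr_mv_ref(int m, double alpha,
                const double *val, const int *col,
                const int *row_begin, const int *row_end,
                const double *x, double beta, double *y)
{
    assert(m >= 0);
    for (int i = 0; i < m; ++i) {
        double sum = 0.0;
        for (int p = row_begin[i]; p < row_end[i]; ++p) {
            /* One multiply-add per stored entry, fed by a streamed load of
             * val[p] and col[p] plus an indirect load of x[col[p]]. */
            sum += val[p] * x[col[p]];
        }
        y[i] = alpha * sum + beta * y[i];
    }
}
```

Rows are independent, so the outer loop can be split across threads. Each stored entry, however, is loaded once and used for a single multiply-add, so extra threads tend to stop helping once memory bandwidth is saturated.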

TimP — Fri, 07 Oct 2011 13:35:33 GMT

In addition to what Gennady said, you might find it interesting (if using dynamic libiomp) to set
LD_PRELOAD=/libiompprof5.so
and look at the guide.gvs file generated.

kris_nagar — Sun, 09 Oct 2011 18:00:54 GMT

Thank you both for the replies.

I expect a speedup of 3-4x when going from serial to multithreaded code. I am using matrices of size 8M x 8M with 118 million entries.
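As a rough sanity check on that expectation, the sketch below estimates the memory traffic and arithmetic intensity of one CSR matrix-vector pass for a matrix of this size. The function names are hypothetical, and the byte counts rest on assumptions not stated in the thread: 8-byte double values, 4-byte int indices, and the x and y vectors each streamed from memory once (an optimistic lower bound, since the indirect x accesses may miss cache).

```c
#include <assert.h>

/* Approximate bytes moved by one CSR y = alpha*A*x + beta*y pass.
 * Assumptions (not from the thread): 8-byte values, 4-byte indices,
 * x read once, y read once and written once. */
double csr_mv_bytes(double rows, double nnz)
{
    assert(rows > 0.0 && nnz > 0.0);
    double matrix = nnz * (8.0 + 4.0);        /* val[] + col[] per entry */
    double ptrs   = rows * (4.0 + 4.0);       /* row begin + row end     */
    double vecs   = rows * (8.0 + 8.0 + 8.0); /* read x, read y, write y */
    return matrix + ptrs + vecs;
}

/* One multiply and one add per stored entry. */
double csr_mv_flops(double nnz)
{
    return 2.0 * nnz;
}

/* Floating-point operations per byte of memory traffic. */
double csr_mv_intensity(double rows, double nnz)
{
    return csr_mv_flops(nnz) / csr_mv_bytes(rows, nnz);
}

/* Lower bound on time per pass for a given sustained bandwidth (bytes/s). */
double csr_mv_min_seconds(double rows, double nnz, double bytes_per_second)
{
    assert(bytes_per_second > 0.0);
    return csr_mv_bytes(rows, nnz) / bytes_per_second;
}
```

With rows = 8e6 and nnz = 118e6, this model gives about 1.67 GB of traffic per pass and roughly 0.14 floating-point operations per byte. Under these assumptions the achievable speedup is bounded by how much more memory bandwidth several threads can draw than one thread, rather than by the core count.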

From the guide.gvs file, I found that my program is not using 8 threads even when I try to set the threads manually.

I have another program where I use the sgemm routine to multiply dense matrices, and that code does use multithreading. I am using the same settings and platform for both programs.

Thanks again!

kris_nagar — Tue, 11 Oct 2011 20:20:11 GMT

Finally got it working. I just updated the icc version, and now it's invoking all the available threads.