Intel® oneAPI Math Kernel Library
Ask questions and share information with other developers who use Intel® Math Kernel Library.

MKL Threads - BLAS level 2 routines

kris_nagar
Beginner
Multithreading does not seem to work in my program, where I use the mkl_dcsrmv subroutine to multiply large sparse matrices. I have tried calling "mkl_set_num_threads(num_threads)" to set the number of threads. The program gives correct output, but the performance doesn't change as I change the number of threads.
According to the MKL manual, MKL versions later than 10.0 should use the maximum possible number of threads on the processor, but that does not seem to be the case.
Platform: Intel Xeon E5520 (4 cores/8 threads).
#include "omp.h"
...
...
mkl_dcsrmv("N", &M, &N, &alpha, "G**C", val, (int *)col, (int *)ptr, (int *)ptre, vec_aligned, &alpha, y_vec);
...
Compile:
icc -mkl -I /opt/intel/Compiler/11.1/069/mkl/include/ -L$(MKLROOT)/lib/intel64 -lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core -openmp -lpthread -o run_mkl
Is mkl_dcsrmv a threaded routine?
4 Replies
Gennady_F_Intel
Moderator

1) Yes, this routine is threaded internally, but the main question is: what scalability numbers are you expecting to see?

In most cases, for sparse matrices, performance is limited by cache and memory bandwidth.

2) Please see here for how to link MKL properly.
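
To illustrate the bandwidth point in 1), here is a minimal plain-C sketch of a CSR matrix-vector product. It is not how MKL implements mkl_dcsrmv; the function names are hypothetical, and it assumes the 3-array CSR layout with zero-based indices (mkl_dcsrmv itself takes separate row-begin and row-end pointer arrays). The inner loop does 2 flops per nonzero while streaming about 12 bytes of matrix data, so a few threads may already saturate memory bandwidth.

```c
#include <assert.h>
#include <stddef.h>

/* y = alpha * A * x + beta * y for an m-row sparse matrix A stored in
 * 3-array CSR form with zero-based indices; row_ptr has m + 1 entries.
 * Illustration only: hypothetical helper, not MKL's implementation. */
void csr_matvec(int m, double alpha, const double *val, const int *col,
                const int *row_ptr, const double *x, double beta, double *y)
{
    assert(m >= 0);
    for (int i = 0; i < m; ++i) {
        double sum = 0.0;
        /* Per nonzero: one multiply and one add (2 flops), against an
         * 8-byte value, a 4-byte column index, and a scattered read of x. */
        for (int k = row_ptr[i]; k < row_ptr[i + 1]; ++k)
            sum += val[k] * x[col[k]];
        y[i] = alpha * sum + beta * y[i];
    }
}

/* Rough flops-per-byte estimate that counts only the val and col streams
 * and ignores traffic for x and y, so the real ratio is lower still. */
double csr_flops_per_byte(void)
{
    return 2.0 / (double)(sizeof(double) + sizeof(int));
}
```

With 8-byte doubles and 4-byte ints the estimate is about 0.17 flops per byte, far below what a dense routine such as sgemm achieves through cache reuse, which is one plausible reason the two kinds of routine scale so differently.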

TimP
Honored Contributor III
In addition to what Gennady said, you might find it interesting (if using dynamic libiomp) to set
LD_PRELOAD=/libiompprof5.so
and look at the guide.gvs file generated.
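
As a lighter-weight cross-check alongside the guide.gvs report, a small sketch like the one below can show how many threads the OpenMP runtime actually hands out. The helper name threads_actually_used is hypothetical, and this only tests the OpenMP runtime itself, not MKL's internal threading decision (on the MKL side, mkl_get_max_threads() reports the library's own limit). It assumes compilation with OpenMP enabled; without it, the function simply reports 1.

```c
#include <assert.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Returns the number of threads that actually executed a parallel region
 * after `requested` threads were asked for. Hypothetical helper for
 * illustration; returns 1 when the file is built without OpenMP. */
int threads_actually_used(int requested)
{
    assert(requested > 0);
    int used = 1;
#ifdef _OPENMP
    omp_set_num_threads(requested);
    #pragma omp parallel
    {
        /* One thread records the team size; the implicit barrier at the
         * end of the single construct makes the write visible afterwards. */
        #pragma omp single
        used = omp_get_num_threads();
    }
#else
    (void)requested;
#endif
    return used;
}
```

If threads_actually_used(8) comes back smaller than 8 on an 8-thread machine, the limit is probably coming from the runtime or environment rather than from the MKL routine being called.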
kris_nagar
Beginner
Thanks to both of you for the replies.
I expect a speedup of 3-4x when going from serial to multithreaded code. I am using matrices of size 8M x 8M with 118 million entries.
From the guide.gvs file, I found that my program is not using 8 threads, even when I try to set the number of threads manually.
I have another program where I use the sgemm routine to multiply dense matrices, and that code does use multithreading. I am using the same settings and platform for both programs.
Thanks again!
kris_nagar
Beginner
Finally got it working. I just updated the icc version, and now it's invoking all the available threads.