Ramakrishnan_K_

07-01-2016
01:06 PM

mkl_scscmm performance problem

Hello,

I am building the attached program on OLCF's Rhea supercomputer (https://www.olcf.ornl.gov/computing-resources/rhea/) with the Intel compiler, icc (ICC) 14.0.4 20140805. Armadillo has a naive implementation of cscmm, which I use for comparison. I run the program with MKL_NUM_THREADS=1 on Rhea to multiply a sparse matrix of size 83328x124992 by a dense matrix of size 124992x50. The following is the command and its output.

./a.out 83328 124992 50 0.00001

The output of the test code

::A::83328x124992

nnz::104153

::B::124992x50

mkl cscmm::162.13

::C::50x83328

arma ::0.06

I am seeing MKL's cscmm run far slower than Armadillo's naive implementation (162.13 versus 0.06 in the timings above). Kindly let me know what I am doing wrong here.

Ramki
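
For readers following the thread, here is a minimal sketch of what a naive CSC-times-dense product (the "cscmm" operation timed above) looks like. It is not Armadillo's code and not the attached program: the `CscMatrix` type and the `csc_times_dense` function are hypothetical names, and zero-based indices with row-major dense storage are assumptions made for the illustration. With nnz = 104153 and 50 dense columns, the inner loops perform 104153 x 50 = 5,207,650 multiply-adds in total, so the amount of arithmetic is small.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical container for a sparse matrix in Compressed Sparse Column
// form, using zero-based indices (an assumption for this sketch).
struct CscMatrix {
    std::size_t rows;
    std::size_t cols;
    std::vector<std::size_t> col_ptr;  // size cols + 1; column j spans [col_ptr[j], col_ptr[j+1])
    std::vector<std::size_t> row_idx;  // size nnz; row index of each stored value
    std::vector<float> values;         // size nnz; the stored nonzeros
};

// Computes C = A * B, where B is a.cols x n and C is a.rows x n,
// both dense and stored row-major.
std::vector<float> csc_times_dense(const CscMatrix& a,
                                   const std::vector<float>& b,
                                   std::size_t n) {
    assert(a.col_ptr.size() == a.cols + 1);
    assert(a.row_idx.size() == a.values.size());
    assert(b.size() == a.cols * n);

    std::vector<float> c(a.rows * n, 0.0f);
    for (std::size_t j = 0; j < a.cols; ++j) {
        // Each nonzero A(i, j) scatters a scaled copy of row j of B into row i of C.
        for (std::size_t p = a.col_ptr[j]; p < a.col_ptr[j + 1]; ++p) {
            const std::size_t i = a.row_idx[p];
            const float v = a.values[p];
            assert(i < a.rows);
            for (std::size_t k = 0; k < n; ++k) {
                c[i * n + k] += v * b[j * n + k];
            }
        }
    }
    return c;
}
```

Running the same inputs through a reference loop like this and through the library call, then comparing both the result and its dimensions, is one way to check that the two timings measure the same operation.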

Could you check whether this gap still exists with the latest MKL, 11.3 Update 3? You can use the evaluation package of Intel compiler v.16. Could you also check the CSR (mkl_dcsrmm) case?

Gennady_F_Intel

07-02-2016
10:14 PM
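
Trying the CSR case suggested above requires the same matrix in CSR form. The following is a minimal sketch of a CSC-to-CSR conversion in plain C++, assuming zero-based indices; the `Compressed` type and the `csc_to_csr` function are hypothetical names, not part of MKL or Armadillo. Before passing the result to an MKL routine, the index arrays would still need to be copied into `MKL_INT` arrays, and the indexing convention declared in the routine's matrix descriptor has to match the arrays.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical container used for both CSC and CSR data (zero-based).
// For CSC: ptr has cols + 1 entries and idx holds row indices.
// For CSR: ptr has rows + 1 entries and idx holds column indices.
template <typename T>
struct Compressed {
    std::vector<std::size_t> ptr;
    std::vector<std::size_t> idx;
    std::vector<T> val;
};

// Converts a rows x cols matrix from CSC to CSR with a counting pass,
// a prefix sum, and a scatter pass.
template <typename T>
Compressed<T> csc_to_csr(std::size_t rows, std::size_t cols,
                         const Compressed<T>& csc) {
    const std::size_t nnz = csc.val.size();
    assert(csc.ptr.size() == cols + 1);
    assert(csc.idx.size() == nnz);

    Compressed<T> csr;
    csr.ptr.assign(rows + 1, 0);
    csr.idx.resize(nnz);
    csr.val.resize(nnz);

    // Count the nonzeros in each row (stored one slot to the right).
    for (std::size_t p = 0; p < nnz; ++p) {
        assert(csc.idx[p] < rows);
        ++csr.ptr[csc.idx[p] + 1];
    }
    // Prefix sum turns the counts into row start offsets.
    for (std::size_t i = 0; i < rows; ++i) {
        csr.ptr[i + 1] += csr.ptr[i];
    }
    // Scatter each entry into the next free slot of its row. Walking the
    // columns in order leaves the column indices sorted within each row.
    std::vector<std::size_t> next(csr.ptr.begin(), csr.ptr.end() - 1);
    for (std::size_t j = 0; j < cols; ++j) {
        for (std::size_t p = csc.ptr[j]; p < csc.ptr[j + 1]; ++p) {
            const std::size_t dst = next[csc.idx[p]]++;
            csr.idx[dst] = j;
            csr.val[dst] = csc.val[p];
        }
    }
    return csr;
}
```

The conversion touches each nonzero a constant number of times, so for a matrix with about 10^5 nonzeros its cost should be negligible next to the timings reported in the thread.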

