
Ramakrishnan_K_

Novice


07-01-2016
01:06 PM


mkl_scscmm performance problem

Hello,

I am building the attached program on OLCF's Rhea supercomputer (https://www.olcf.ornl.gov/computing-resources/rhea/) with the Intel compiler, icc (ICC) 14.0.4 20140805. The program times MKL's mkl_scscmm against Armadillo, which has a naive implementation of cscmm. I run it with MKL_NUM_THREADS=1 on Rhea to multiply a sparse matrix of size 83328x124992 by a dense matrix of size 124992x50. The following is the output.

./a.out 83328 124992 50 0.00001

The output of the test code

::A::83328x124992

nnz::104153

::B::124992x50

mkl cscmm::162.13

::C::50x83328

arma ::0.06

I am seeing mkl_scscmm run far slower than Armadillo's naive implementation (162.13 versus 0.06 in the timings above). Kindly let me know what I am doing wrong here.
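For reference, the kind of naive CSC-times-dense product being compared against can be sketched as below. This is only an illustration under stated assumptions, not the attached program and not Armadillo's actual code: the `CscMatrix` struct and the `csc_times_dense` function are hypothetical names, indices are zero-based, and the dense matrices are stored column-major.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical container for a sparse matrix in compressed sparse column
// (CSC) form with zero-based indices. Not an MKL or Armadillo type.
struct CscMatrix {
    std::size_t rows;
    std::size_t cols;
    std::vector<std::size_t> col_ptr;  // size cols + 1
    std::vector<std::size_t> row_idx;  // size nnz
    std::vector<float> values;         // size nnz
};

// Naive C = A * B, where B is (A.cols x n) and C is (A.rows x n),
// both dense and stored column-major.
std::vector<float> csc_times_dense(const CscMatrix& A,
                                   const std::vector<float>& B,
                                   std::size_t n) {
    assert(A.col_ptr.size() == A.cols + 1);
    assert(B.size() == A.cols * n);
    std::vector<float> C(A.rows * n, 0.0f);
    for (std::size_t j = 0; j < n; ++j) {           // column of B and C
        for (std::size_t k = 0; k < A.cols; ++k) {  // column of A
            const float b = B[k + j * A.cols];
            if (b == 0.0f) continue;                // nothing to accumulate
            // Scatter column k of A, scaled by B(k, j), into column j of C.
            for (std::size_t p = A.col_ptr[k]; p < A.col_ptr[k + 1]; ++p) {
                C[A.row_idx[p] + j * A.rows] += A.values[p] * b;
            }
        }
    }
    return C;
}
```

The work here is proportional to nnz times the number of columns of B, plus one pass over B, so for nnz = 104153 and 50 columns a sub-second time is plausible for a single thread. A result computed this way can also serve as a correctness check on the MKL output.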

Ramki

1 Reply


Could you check whether this gap still exists with the latest MKL, 11.3 Update 3? You can use the evaluation package of Intel Compiler v16. Could you also check the CSR (mkl_dcsrmm) case?
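Trying the CSR case requires the matrix in CSR form. A minimal sketch of a CSC-to-CSR conversion is given below, assuming zero-based indices; the `CsrMatrix` struct and the `csc_to_csr` function are hypothetical helpers, not MKL routines. The index type and index base would still need to be adapted to whatever the MKL routine expects.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical container for a sparse matrix in compressed sparse row
// (CSR) form with zero-based indices. Not an MKL type.
struct CsrMatrix {
    std::size_t rows;
    std::size_t cols;
    std::vector<std::size_t> row_ptr;  // size rows + 1
    std::vector<std::size_t> col_idx;  // size nnz
    std::vector<float> values;         // size nnz
};

// Convert zero-based CSC arrays into CSR with a counting pass,
// a prefix sum, and a scatter pass.
CsrMatrix csc_to_csr(std::size_t rows, std::size_t cols,
                     const std::vector<std::size_t>& col_ptr,
                     const std::vector<std::size_t>& row_idx,
                     const std::vector<float>& values) {
    assert(col_ptr.size() == cols + 1);
    assert(row_idx.size() == values.size());
    const std::size_t nnz = values.size();
    CsrMatrix out{rows, cols,
                  std::vector<std::size_t>(rows + 1, 0),
                  std::vector<std::size_t>(nnz),
                  std::vector<float>(nnz)};
    // Count the entries in each row.
    for (std::size_t p = 0; p < nnz; ++p) {
        ++out.row_ptr[row_idx[p] + 1];
    }
    // Prefix sum turns the counts into row start offsets.
    for (std::size_t i = 0; i < rows; ++i) {
        out.row_ptr[i + 1] += out.row_ptr[i];
    }
    // Scatter each entry into the next free slot of its row.
    std::vector<std::size_t> next(out.row_ptr.begin(), out.row_ptr.end() - 1);
    for (std::size_t j = 0; j < cols; ++j) {
        for (std::size_t p = col_ptr[j]; p < col_ptr[j + 1]; ++p) {
            const std::size_t dst = next[row_idx[p]]++;
            out.col_idx[dst] = j;
            out.values[dst] = values[p];
        }
    }
    return out;
}
```

Because the columns are visited in increasing order, the column indices within each CSR row come out sorted. Note also that the CSC arrays of A are, unchanged, the CSR arrays of the transpose of A, which is an alternative when a transposed operation is available.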

Gennady_F_Intel

Moderator


07-02-2016
10:14 PM

