Community
cancel
Showing results for 
Search instead for 
Did you mean: 
Rulli__Cosimo
Beginner
146 Views

Sparse-Dense Matrix Multiplication

Hi,

I am working with the mkl_sparse_s_mm routine to perform:

C = A * B

where A is a sparse matrix and C,B are dense matrices. I would like to have some details about the algorithm for sparse-dense matrix multiplication implemented by mkl_sparse_s_mm, i.e., if it uses cache aware strategies, specific micro-kernel implementations to fully leverage CPU registers etc..

Just to provide an example, high performance dense matrix multiplication (cblas_gemm) is usually implemented following the block pack algorithm.

Thank you

Cosimo Rullli

 

 

0 Kudos
2 Replies
Gennady_F_Intel
Moderator
146 Views

it is the proprietary info, but we will ask sparse blas developers to look at this thread and they probably will share something, but I am not sure.

Kirill_V_Intel
Employee
146 Views

Hello,

As Gennady said, it is a closed information. A general answer is yes, we try to do different (incl. hardware-aware) optimizations on different levels (multi-tthreading, vectorization and assembly) in order to achieve better performance, as we do in general in MKL.

You can find articles about different algorithms which can be used for sparse-dense matrix multiplication.

The sparsity of one of the inputs implies that some optimization techniques used for dense gemm make much less or no sense at all and at the same time, some additional ideas are used. It also helps to think whether the functionality is compute or memory bound.

Best,
Kirill
 

Reply