Intel® oneAPI Math Kernel Library

Performance issues with Sparse Blas

abhimodak
New Contributor I

Hi

I am doing some tests with Sparse BLAS (mkl_dcoomm), and I find that although it is faster than the intrinsic MATMUL available in Fortran, the BLAS subroutine gemm (I am using the Fortran 95 interface, but the results remain unchanged if I use dgemm) is faster even when the matrix is 90% sparse.

I am generating the sparse matrices using a random number generator.

I must say that I am not 'surprised' that gemm is faster as such, but I am a bit startled by how sparse the matrix needs to be before Sparse BLAS wins the race.
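For reference, here is a minimal sketch of the kind of comparison described above. The matrix sizes, the 90% zero fraction, and the array names are illustrative, not the poster's actual code; timing uses MKL's dsecnd routine.

program coo_vs_gemm
  implicit none
  integer, parameter :: m = 1000, k = 1000, n = 1000
  double precision, allocatable :: a(:,:), b(:,:), c(:,:), val(:)
  integer, allocatable :: rowind(:), colind(:)
  character(len=6) :: matdescra
  integer :: nnz, i, j
  double precision :: t1, t2, dsecnd
  external dsecnd

  allocate(a(m,k), b(k,n), c(m,n))
  call random_number(a)
  call random_number(b)
  where (a < 0.9d0) a = 0.0d0            ! zero out roughly 90% of the entries

  ! pack the nonzeros of A into coordinate (COO) format, one-based indices
  nnz = count(a /= 0.0d0)
  allocate(val(nnz), rowind(nnz), colind(nnz))
  nnz = 0
  do j = 1, k
    do i = 1, m
      if (a(i,j) /= 0.0d0) then
        nnz = nnz + 1
        val(nnz) = a(i,j); rowind(nnz) = i; colind(nnz) = j
      end if
    end do
  end do

  matdescra = 'GXXF'                     ! general matrix, one-based (Fortran) indexing

  t1 = dsecnd()
  call mkl_dcoomm('N', m, n, k, 1.0d0, matdescra, val, rowind, colind, nnz, &
                  b, k, 0.0d0, c, m)
  t2 = dsecnd()
  print *, 'mkl_dcoomm: ', t2 - t1, ' s'

  t1 = dsecnd()
  call dgemm('N', 'N', m, n, k, 1.0d0, a, m, b, k, 0.0d0, c, m)
  t2 = dsecnd()
  print *, 'dgemm:      ', t2 - t1, ' s'
end program coo_vs_gemm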

Are there any guidelines on this topic (i.e., when would it be better to use the dense BLAS call)?

Abhi

Sergey_K_Intel1
Employee

Dear Abhi,

The performance of sparse matrix operations tends to be much lower than that of dense BLAS because the memory access patterns are irregular and the ratio of floating-point operations to memory accesses is lower than in dense operations. As with dense matrices, the performance depends on the machine architecture, but unlike dense problems, it also depends on the structure of the matrix. So it is better to test MKL Sparse BLAS with matrices having about the same structure as those in your applications; the difference between performance results for random matrices and your real matrices can be very large.

It should also be noted that the coordinate (COO) format is the slowest among all sparse formats supported by MKL, because it relies on indirect addressing more often than the other MKL Sparse BLAS formats.
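As a rough illustration, the same product can be written with the CSR routine mkl_dcsrmm, assuming CSR arrays val, ja, and ia have already been built for the matrix (MKL also provides the mkl_dcsrcoo converter between COO and CSR); the names below are placeholders, not code from this thread.

! val(nnz) holds the nonzeros, ja(nnz) their column indices, and
! ia(m+1) the one-based row pointers of the same matrix A.
! pntrb = ia(1:m) and pntre = ia(2:m+1) are passed as ia and ia(2).
call mkl_dcsrmm('N', m, n, k, 1.0d0, matdescra, val, ja, ia, ia(2), &
                b, k, 0.0d0, c, m)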

Sparse formats are mostly used for large matrices, where a sparse representation is needed to keep the whole matrix in RAM. So Sparse BLAS and dense BLAS have different areas of application.

All the best

Sergey
