Intel® oneAPI Math Kernel Library

Sparse blas Matrix-vector vs simple implementation

tae
Beginner
Hello,
I'm testing the performance of MKL 9.1 on an AMD Athlon 2200 processor.
I'm comparing sparse matrix-vector multiplication.
When I compare

mkl_dcsrgemv

against
my simple function shown below

...
for (int i = 0; i < n; i++)
{
    double sum = 0.0;

    /* row i's nonzeros occupy positions ia[i]..ia[i+1]-1 (one-based CSR) */
    for (int j = ia[i]; j < ia[i+1]; j++)
    {
        int col = ja[j-1] - 1;    /* convert one-based column index to zero-based */
        sum += v[col] * a[j-1];
    }
    x[i] = sum;
}
...
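For reference, the MKL call I'm timing against this loop is roughly the following (a minimal sketch, not my exact code; it assumes the same one-based CSR arrays a, ia, ja, the input vector v, the output vector x, and the row count n):

#include "mkl.h"

/* Sketch: x := A * v for a one-based CSR matrix, matching the loop above. */
void csr_matvec_mkl(int n, double *a, int *ia, int *ja, double *v, double *x)
{
    char transa = 'n';                          /* 'n' = no transpose */
    mkl_dcsrgemv(&transa, &n, a, ia, ja, v, x);
}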

I'm not getting any significant speedup once my code is compiled with the optimization flag in gcc (-O2); mkl_dcsrgemv is only about 2-3% faster.

Is this normal behavior, or should I expect a lot more speedup from the Sparse BLAS routines?

Thanks


Sergey_K_Intel1
Employee

Hello

The performance of the routine you mention depends on the structure of the input sparse matrix, since the distribution of the nonzero elements determines the memory access pattern. So the performance depends greatly on the input sparse matrix as well as on its dimension.

The numbers you report are probably normal; I would need to look at the input data to say more.

By the way, the routine is parallelized with OpenMP. Have you tested it in parallel mode by setting the OMP_NUM_THREADS environment variable?
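For example (a minimal sketch, assuming your compiler's OpenMP runtime; calling omp_set_num_threads should have the same effect as exporting OMP_NUM_THREADS before running):

#include <omp.h>
#include "mkl.h"

int main(void)
{
    /* Request 2 OpenMP threads; MKL's threaded routines should pick this
       up. Equivalent to setting OMP_NUM_THREADS=2 in the environment.   */
    omp_set_num_threads(2);

    /* ... build a, ia, ja, v, x and call mkl_dcsrgemv as above ... */
    return 0;
}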

All the best

Sergey
