
Robert_P_2 · ‎03-30-2014

I am performing a sparse matrix-vector multiplication using mkl_dcsrmv on a system with ~80,000 degrees of freedom. My matrix is symmetric, so as a first attempt I used the option "SLNCxx" for matdescra and passed in the lower triangular part only. This works fine and gives the correct answer, but on an E5-4650 machine with 32 cores the code maxes out at 8 threads. If I instead call mkl_dcsrmv with "GxxCxx" and pass in the full sparse matrix, the code scales up to 32 threads and completes in roughly half the time of the symmetric version. This code is running with MKL 11.1 packaged with Composer XE 2013 SP1 on Linux. Should I expect the symmetric version of mkl_dcsrmv to execute with fewer threads than the general version? Thank you for your advice.

Chao_Y_Intel · ‎03-30-2014

Hello,

Thanks for your report. By default, Intel MKL may choose the number of threads dynamically based on factors such as the matrix type, the data size, and the CPU type. You can also control the total number of threads with the following environment settings:

MKL_DYNAMIC=FALSE
MKL_NUM_THREADS=<number of threads>

You can set MKL_DYNAMIC=FALSE to check whether the symmetric version can run with more threads.

Thanks,
Chao
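For readers who launch their solver from a script, the two settings above can be passed to the child process explicitly. The following is only a sketch: the helper name `mkl_thread_env` is hypothetical, and the child command is a stand-in that merely echoes the variables instead of running a real MKL-linked program. The variables need to be in place before the MKL-linked process starts, which is why they are set on the child's environment.

```python
import os
import subprocess
import sys


def mkl_thread_env(num_threads, base_env=None):
    """Return a copy of the environment with the two MKL settings applied.

    Hypothetical helper for illustration; only the variable names
    MKL_DYNAMIC and MKL_NUM_THREADS come from the thread above.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["MKL_DYNAMIC"] = "FALSE"               # disable dynamic thread adjustment
    env["MKL_NUM_THREADS"] = str(num_threads)  # requested thread count
    return env


if __name__ == "__main__":
    # Stand-in for the real solver binary: it only prints the two variables.
    child = [
        sys.executable,
        "-c",
        "import os; print(os.environ['MKL_DYNAMIC'], os.environ['MKL_NUM_THREADS'])",
    ]
    result = subprocess.run(
        child, env=mkl_thread_env(32), capture_output=True, text=True, check=True
    )
    print(result.stdout.strip())
```

In a shell, exporting the same two variables before starting the program has the same effect.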

Robert_P_2 · ‎03-31-2014

Thank you, that allows me to run the symmetric version with more threads. I am still seeing a roughly 2x performance hit with the symmetric version compared with the full one. Could this be due to the internal algorithm employed?

Chao_Y_Intel · ‎04-13-2014

Hi,

It is possibly related to the internal implementation. I will check with the engineering owner for a few details.

Regards,
Chao

Chao_Y_Intel · ‎04-13-2014

Hi,

We talked with the function owner. The current implementation of the symmetric matrix-vector multiplication has some overhead for small matrices on machines with many cores, so we suggest using the non-symmetric interface for now.

Thanks for checking this.

Regards,
Chao
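To make the two storage schemes in this thread concrete, here is a minimal pure-Python sketch of a CSR matrix-vector product for a general matrix ("G", full storage) and for a symmetric matrix stored as its lower triangle ("S", "L"), both with zero-based indexing ("C"). This is not MKL's implementation, and the function names are hypothetical. One difference it does show is that the symmetric variant also writes to `y[j]` for each off-diagonal entry, so different rows update the same output element. Whether this is the source of the overhead mentioned above is an assumption; the thread does not say.

```python
def csr_matvec_general(n, row_ptr, col_idx, vals, x):
    """y = A @ x for a full CSR matrix; each row writes only to y[i]."""
    y = [0.0] * n
    for i in range(n):
        for k in range(row_ptr[i], row_ptr[i + 1]):
            y[i] += vals[k] * x[col_idx[k]]
    return y


def csr_matvec_sym_lower(n, row_ptr, col_idx, vals, x):
    """y = A @ x where only the lower triangle of symmetric A is stored."""
    y = [0.0] * n
    for i in range(n):
        for k in range(row_ptr[i], row_ptr[i + 1]):
            j = col_idx[k]
            y[i] += vals[k] * x[j]
            if j != i:
                # Mirror entry A[j][i]: a scattered write into another row's output.
                y[j] += vals[k] * x[i]
    return y


# Example matrix A = [[4, 1, 0], [1, 3, 2], [0, 2, 5]] in both storage schemes.
FULL = dict(row_ptr=[0, 2, 5, 7], col_idx=[0, 1, 0, 1, 2, 1, 2],
            vals=[4.0, 1.0, 1.0, 3.0, 2.0, 2.0, 5.0])
LOWER = dict(row_ptr=[0, 1, 3, 5], col_idx=[0, 0, 1, 1, 2],
             vals=[4.0, 1.0, 3.0, 2.0, 5.0])
X = [1.0, 2.0, 3.0]

if __name__ == "__main__":
    print(csr_matvec_general(3, x=X, **FULL))    # [6.0, 13.0, 19.0]
    print(csr_matvec_sym_lower(3, x=X, **LOWER)) # [6.0, 13.0, 19.0]
```

Both calls return the same vector. The symmetric form reads about half as many stored values, but its writes are no longer confined to one row per loop iteration.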

Robert_P_2 · ‎04-14-2014

This is very helpful, thank you. I will stick to the general version for now.

More Threads with G than S in mkl_dcsrmv?