Fat and Narrow matrix multiplication with "cblas_cgemm"

sivasankar__mamilla · 04-25-2020

Hi,

I am multiplying two matrices, A (40 x 40) and B (40 x 10k), with the MKL function "cblas_cgemm". The call takes 30 milliseconds, which I believe is too long, even though I have also enabled MKL multithreading.
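
To put the 30 ms in perspective, here is a rough back-of-the-envelope sketch in C. It assumes the usual convention of 8 real floating-point operations per complex multiply-add (4 multiplies and 4 additions). The helper names cgemm_flops, gflops_per_second and print_estimate are made up for illustration and are not part of MKL.

```c
#include <stdio.h>

/* Real floating-point operations in a complex GEMM C = A*B,
   where A is m x k and B is k x n. Assumes 8 real flops per
   complex multiply-add (4 multiplies + 4 additions). */
double cgemm_flops(double m, double n, double k)
{
    return 8.0 * m * n * k;
}

/* Achieved throughput in GFLOP/s for a call that took `seconds`. */
double gflops_per_second(double flops, double seconds)
{
    return flops / seconds / 1e9;
}

/* Prints the estimate for the sizes and timing quoted above:
   A is 40 x 40, B is 40 x 10000, and the call takes 30 ms. */
void print_estimate(void)
{
    double flops = cgemm_flops(40, 10000, 40);      /* 1.28e8 flops */
    double rate  = gflops_per_second(flops, 0.030); /* about 4.27 */
    printf("flops = %.3g, rate = %.2f GFLOP/s\n", flops, rate);
}
```

Calling print_estimate() from main gives about 1.28e8 flops and roughly 4.27 GFLOP/s. Whether that is slow depends on the CPU, the MKL version and how the timing was taken, none of which are stated here.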

I have read on the internet that "MKL functions are optimized for generic matrix multiplications".

Does anybody agree or disagree with me?

Thanks in advance.

mecej4 · 04-28-2020

What is it that you want us to disagree with? It is quite plausible that there exists/existed some computer and some version of MKL with which the task that you stated took 30 ms to execute. Why is that worth discussing?