I am multiplying two matrices, where A is 40 x 40 and B is 40 x 10k, using the MKL function "cblas_cgemm". It takes 30 milliseconds,
which I believe is too long, even though I have enabled MKL multithreading.
I have read on the internet that "MKL functions are optimized for generic matrix multiplications".
Does anybody agree or disagree with me?
Thanks in advance.
What is it that you want us to disagree with? It is quite plausible that there exists/existed some computer and some version of MKL with which the task that you stated took 30 ms to execute. Why is that worth discussing?
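To put the 30 ms figure in context, here is a rough sketch in C that converts the stated sizes and time into an achieved floating-point rate. It assumes the common convention of 8 real flops per complex multiply-add, and the helper names cgemm_flops and cgemm_gflops are made up for illustration; they are not part of MKL.

```c
#include <stdio.h>

/* Hypothetical helper (not an MKL function): real floating-point operations
   for a complex (m x k) times (k x n) product, assuming 8 real flops per
   complex multiply-add (4 multiplies + 4 adds). */
double cgemm_flops(double m, double n, double k)
{
    return 8.0 * m * n * k;
}

/* Hypothetical helper: achieved rate in GFLOP/s for a measured
   wall-clock time given in seconds. */
double cgemm_gflops(double m, double n, double k, double seconds)
{
    return cgemm_flops(m, n, k) / seconds / 1e9;
}

/* Print the estimate for one problem size and timing. */
void report_cgemm_rate(double m, double n, double k, double seconds)
{
    printf("flops = %.0f, rate = %.2f GFLOP/s\n",
           cgemm_flops(m, n, k), cgemm_gflops(m, n, k, seconds));
}
```

For the sizes in the question, cgemm_flops(40, 10000, 40) gives 128,000,000 real flops, and cgemm_gflops(40, 10000, 40, 0.030) comes out at about 4.27 GFLOP/s. Whether that is slow depends on the CPU, the MKL version, and how the time was measured, none of which the post states.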