Starting with MKL 2020 Update 2, we are seeing a significant dgemm performance regression in a multi-threaded application on a 32-core Intel Xeon processor. MKL is running in sequential mode (-mkl=sequential) and the Intel libraries are statically linked (-intel-static). The regression is more pronounced when the processor is heavily loaded.
MKL 2020 Update 1 doesn't have this issue.
Attached is sample code that reproduces the issue.
We have a dedicated forum for MKL (https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/bd-p/oneapi-math-kernel-library). We are redirecting this query to that forum.
Did you check the latest version, 2020 Update 4, which was released last Friday?
What do you mean by a significant performance regression?
Are there any specific CPU types where you see this regression?
I can reproduce the performance regression in MKL 2020 Update 4. The last working version was MKL 2020 Update 1.
The attached code runs 10 threads, each making dgemm calls in a loop. The results below are based on the time spent in the dgemm calls, which the program prints as output.
1. Intel Xeon 32-core: MKL 2020 Update 4 is about 4 times slower than MKL 2020 Update 1
2. Intel Xeon 18-core: MKL 2020 Update 4 is about 1.8 times slower than MKL 2020 Update 1
3. Intel Xeon 4-core: MKL 2020 Update 4 is about 1.2 times slower than MKL 2020 Update 1
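For readers without the attachment, here is a minimal sketch of the general shape of such a test. It is not the attached code. The names `naive_dgemm`, `time_dgemm_threads`, and `slowdown` are hypothetical, and `naive_dgemm` is a plain triple loop standing in for the MKL call so the sketch builds without MKL. The matrix size and iteration count are also assumptions.

```cpp
#include <chrono>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical stand-in for the MKL dgemm call: C = A * B for n x n
// row-major matrices. A real reproducer would call cblas_dgemm here.
void naive_dgemm(int n, const std::vector<double>& a,
                 const std::vector<double>& b, std::vector<double>& c) {
    for (int i = 0; i < n; ++i) {
        for (int j = 0; j < n; ++j) c[i * n + j] = 0.0;
        for (int k = 0; k < n; ++k) {
            const double aik = a[i * n + k];
            for (int j = 0; j < n; ++j) c[i * n + j] += aik * b[k * n + j];
        }
    }
}

// Starts num_threads threads. Each thread owns its matrices, calls the
// multiply `iterations` times, and accumulates only the time spent inside
// the call. Returns the total seconds per thread.
std::vector<double> time_dgemm_threads(int num_threads, int n, int iterations) {
    std::vector<double> seconds(num_threads, 0.0);
    std::vector<std::thread> workers;
    for (int t = 0; t < num_threads; ++t) {
        workers.emplace_back([&seconds, t, n, iterations] {
            const std::size_t size = static_cast<std::size_t>(n) * n;
            std::vector<double> a(size, 1.0), b(size, 2.0), c(size, 0.0);
            double total = 0.0;  // accumulated locally, written once at the end
            for (int it = 0; it < iterations; ++it) {
                const auto start = std::chrono::steady_clock::now();
                naive_dgemm(n, a, b, c);
                const auto stop = std::chrono::steady_clock::now();
                total += std::chrono::duration<double>(stop - start).count();
            }
            seconds[t] = total;
        });
    }
    for (auto& w : workers) w.join();
    return seconds;
}

// Ratio of the newer build's time to the baseline build's time.
// A value of 4.0 means the newer build is 4 times slower.
double slowdown(double new_seconds, double baseline_seconds) {
    return new_seconds / baseline_seconds;
}
```

Building the same harness once against each MKL release, with `naive_dgemm` swapped for the real dgemm call, and dividing the per-thread times with `slowdown` would give ratios of the kind listed above.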
Let me know if you have any other questions. Thanks!
Following up on this issue. Were you able to reproduce the performance regression, and when can we expect a fix?
Do let me know if you need any assistance from me.