Intel® oneAPI Math Kernel Library

MKL Performance issue in threaded application

Xiaohui_Z_Intel
Employee

Hi

We are working on RNN kernel optimization and are trying to run two SGEMM calls in parallel on a two-socket SKX6148 server (20 cores per socket).

The SGEMM size is M = 20, N = 2400, K = 800.

Our goal is to map the first SGEMM to socket 0 and the second SGEMM to socket 1.

We measured GFLOPS with this benchmark (https://github.com/xhzhao/GemmEfficiency/tree/tbb) and got the following performance data:

I found that the performance of OMP+MKL and TBB+MKL is not as good as we expected, and I'm not sure whether I am missing something about using MKL in a threaded application.

BTW, the pthread+MKL solution is not suitable for our real use case, as it would double the number of threads and make the performance even worse.

Thanks in advance.
