Interestingly, I've been unable to find an answer to this simple question: what algorithm does MKL use for matrix-matrix multiplication (e.g., DGEMM)? Is it classical (O(N^3)), Strassen (about O(N^2.81)), or something else? Thanks.
Hi Raul,
I am afraid the standard BLAS gemm uses the classical O(N^3) algorithm; for the algorithm design, you can follow the Netlib gemm source code. Intel MKL optimizes its BLAS routines with SIMD instruction sets and does extra work to fit data into the caches, enabling contiguous, aligned accesses.
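To make the "classical O(N^3) plus cache-friendly access" point concrete, here is a minimal pure-Python sketch. It is not MKL's or Netlib's code; the function names `gemm_naive` and `gemm_blocked` and the `block` parameter are made up for illustration. Both versions do the same N^3 multiply-adds; the blocked one only reorders the loops so that small tiles of A, B and C are reused while they would still be in cache.

```python
def gemm_naive(A, B):
    """Classical triple-loop product of an m x k and a k x n matrix (lists of lists)."""
    m, k, n = len(A), len(B), len(B[0])
    C = [[0.0] * n for _ in range(m)]
    for i in range(m):
        for p in range(k):
            a_ip = A[i][p]
            # Innermost loop walks a row of B and a row of C contiguously.
            for j in range(n):
                C[i][j] += a_ip * B[p][j]
    return C


def gemm_blocked(A, B, block=2):
    """Same arithmetic as gemm_naive, but iterated tile by tile.

    `block` is a hypothetical tile size; a tuned library would pick it
    from the cache sizes of the target CPU.
    """
    m, k, n = len(A), len(B), len(B[0])
    C = [[0.0] * n for _ in range(m)]
    for i0 in range(0, m, block):
        for p0 in range(0, k, block):
            for j0 in range(0, n, block):
                # Multiply one tile of A by one tile of B into a tile of C.
                for i in range(i0, min(i0 + block, m)):
                    for p in range(p0, min(p0 + block, k)):
                        a_ip = A[i][p]
                        for j in range(j0, min(j0 + block, n)):
                            C[i][j] += a_ip * B[p][j]
    return C
```

In pure Python the blocking brings no speedup, of course; the sketch only shows the loop structure that an optimized library would combine with SIMD kernels and data packing.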
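There is another algorithm for complex matrix-matrix multiplication, called 3M. It splits each complex matrix into its real and imaginary parts and performs 3 real GEMMs and 5 matrix additions, instead of the conventional 4 real GEMMs and 2 additions. Other algorithms exist as well, such as Winograd, which is implemented for neural-network convolution kernels in MKL-DNN.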
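For reference, the 3M idea can be sketched in a few lines of pure Python. This is only an illustration of the arithmetic, not how MKL implements it; the helper names `matmul`, `add`, `sub` and `gemm3m` are hypothetical. With A = Ar + i·Ai and B = Br + i·Bi, it forms three real products and recovers the real and imaginary parts of C from them.

```python
def matmul(A, B):
    """Plain triple-loop matrix product on lists of lists."""
    return [[sum(A[i][p] * B[p][j] for p in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]


def add(X, Y):
    """Elementwise sum of two equally sized matrices."""
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]


def sub(X, Y):
    """Elementwise difference of two equally sized matrices."""
    return [[x - y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]


def gemm3m(A, B):
    """Complex matrix product using three real products (3M scheme)."""
    # Split each complex matrix into real and imaginary parts.
    Ar = [[z.real for z in row] for row in A]
    Ai = [[z.imag for z in row] for row in A]
    Br = [[z.real for z in row] for row in B]
    Bi = [[z.imag for z in row] for row in B]

    # Three real matrix products instead of four.
    T1 = matmul(Ar, Br)
    T2 = matmul(Ai, Bi)
    T3 = matmul(add(Ar, Ai), add(Br, Bi))

    # Re(C) = Ar*Br - Ai*Bi ; Im(C) = (Ar+Ai)*(Br+Bi) - Ar*Br - Ai*Bi
    Cr = sub(T1, T2)
    Ci = sub(sub(T3, T1), T2)
    return [[complex(r, i) for r, i in zip(rr, ri)] for rr, ri in zip(Cr, Ci)]
```

The saving is one real matrix product out of four, paid for with extra additions; the rounding behaviour can also differ slightly from the conventional four-product formulation, so results may not match bit for bit on general floating-point data.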
