It seems that MKL_DIRECT_CALL does not apply to zgemm3m and cgemm3m, only to zgemm and cgemm.
This is slightly annoying: zgemm3m is significantly faster than zgemm for large matrices, so I use it as a drop-in replacement for zgemm. It would have been helpful if MKL_DIRECT_CALL handled calls to zgemm3m for small matrices in the same way it handles calls to zgemm.
To make sure we understand your request correctly: you use zgemm instead of zgemm3m if the matrices are smaller than 128, right? And you would like the MKL_DIRECT_CALL feature to do a similar transformation if the matrices are small enough?
In my existing code, I only call zgemm3m, never zgemm.
I want to use MKL_DIRECT_CALL, but since it does not support zgemm3m, I have added wrapper code that calls zgemm3m for N>=128 and zgemm for N<128.
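For anyone wanting to do the same, here is a rough sketch of that kind of wrapper, not my exact code. The names zgemm_auto, zgemm_prefers_3m, ZGEMM_3M_THRESHOLD and the USE_MKL guard are made up for illustration. The thread only talks about "N", so treating a product as large when any of m, n or k reaches 128 is an assumption. It also assumes the Fortran-style zgemm and zgemm3m prototypes from mkl.h and a build with -DMKL_DIRECT_CALL.

```c
/* Sketch only: dispatch between zgemm and zgemm3m by problem size.
 * zgemm_auto, zgemm_prefers_3m, ZGEMM_3M_THRESHOLD and USE_MKL are
 * hypothetical names, not part of MKL. */
#ifdef USE_MKL
#include <mkl.h>  /* build with -DMKL_DIRECT_CALL so small zgemm calls can use the direct path */
#endif

#define ZGEMM_3M_THRESHOLD 128  /* crossover size quoted in this thread */

/* Returns 1 if the product should go to zgemm3m, 0 if it should go to zgemm.
 * Assumption: "N" is read as "any of m, n, k"; adjust for your shapes. */
static int zgemm_prefers_3m(long m, long n, long k)
{
    return m >= ZGEMM_3M_THRESHOLD ||
           n >= ZGEMM_3M_THRESHOLD ||
           k >= ZGEMM_3M_THRESHOLD;
}

#ifdef USE_MKL
/* Same argument list as the Fortran-style zgemm interface in mkl.h. */
static void zgemm_auto(const char *transa, const char *transb,
                       const MKL_INT *m, const MKL_INT *n, const MKL_INT *k,
                       const MKL_Complex16 *alpha,
                       const MKL_Complex16 *a, const MKL_INT *lda,
                       const MKL_Complex16 *b, const MKL_INT *ldb,
                       const MKL_Complex16 *beta,
                       MKL_Complex16 *c, const MKL_INT *ldc)
{
    if (zgemm_prefers_3m((long)*m, (long)*n, (long)*k)) {
        /* Large problem: use the 3M algorithm. */
        zgemm3m(transa, transb, m, n, k, alpha, a, lda, b, ldb, beta, c, ldc);
    } else {
        /* Small problem: plain zgemm, which MKL_DIRECT_CALL is able to handle. */
        zgemm(transa, transb, m, n, k, alpha, a, lda, b, ldb, beta, c, ldc);
    }
}
#endif
```

Without USE_MKL defined, only the size check is compiled, so the crossover can be tuned and tested separately from the library. Existing zgemm3m call sites would then call zgemm_auto with the same arguments.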