Intel® oneAPI Math Kernel Library
Ask questions and share information with other developers who use Intel® Math Kernel Library.

MKL_DIRECT_CALL and ICX

AndrewC
New Contributor III
2,258 Views

Are there any limitations or caveats when using ICX (Intel C++ Compiler 2022.2) with MKL_DIRECT_CALL? Looking through some of the headers, I notice several __INTEL_COMPILER blocks (which are ICL specific). It seems that ICX disables MKL_DIRECT_CALL. The snippet below is from mkl_direct.h in MKL 2022.1.0. When MKL_DC_USE_C is 0, many of the direct calls are skipped. I am not sure why this is restricted to __INTEL_COMPILER. Simply forcing MKL_DC_USE_C to 1 for ICX seems to work just fine. It is also not clear to me whether MKL_DIRECT_CALL is abandonware now.

 

#ifdef __INTEL_COMPILER
#define MKL_DC_USE_C 1
#if (__INTEL_COMPILER <= 1500)
#define MKL_DC_POTRF_DISABLE 1
#else
#define MKL_DC_POTRF_DISABLE 0
#endif
#elif defined(__GNUC__)
#if defined(__STRICT_ANSI__) && !defined(__STDC_VERSION__)
#define MKL_DC_USE_C 0
#else
#define MKL_DC_USE_C 1
#endif
#define MKL_DC_POTRF_DISABLE 1
#else
#define MKL_DC_USE_C 0
#endif
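To see which branch of this block a given compiler would take, a small stand-alone probe can help. The sketch below mirrors the branch structure of the snippet without including any MKL header. The names DEMO_DC_USE_C, DEMO_DC_BRANCH, demo_dc_use_c, demo_dc_branch and demo_report are hypothetical and exist only for illustration. It assumes that ICX defines __INTEL_LLVM_COMPILER rather than __INTEL_COMPILER, and that whether __GNUC__ is also defined depends on the platform and compatibility mode, so the result may differ between Linux and Windows builds.

```c
#include <stdio.h>

/* Hypothetical stand-in for MKL_DC_USE_C: same branch structure as the
   mkl_direct.h snippet, but no MKL header is involved. */
#if defined(__INTEL_COMPILER)
#  define DEMO_DC_USE_C 1
#  define DEMO_DC_BRANCH "__INTEL_COMPILER"
#elif defined(__GNUC__)
#  if defined(__STRICT_ANSI__) && !defined(__STDC_VERSION__)
#    define DEMO_DC_USE_C 0
#  else
#    define DEMO_DC_USE_C 1
#  endif
#  define DEMO_DC_BRANCH "__GNUC__"
#else
#  define DEMO_DC_USE_C 0
#  define DEMO_DC_BRANCH "fallback"
#endif

/* Value the stand-in macro received on this compiler. */
int demo_dc_use_c(void) { return DEMO_DC_USE_C; }

/* Name of the preprocessor branch that was selected. */
const char *demo_dc_branch(void) { return DEMO_DC_BRANCH; }

/* Print the selected branch and, if present, the ICX version macro. */
void demo_report(void) {
#ifdef __INTEL_LLVM_COMPILER
    printf("__INTEL_LLVM_COMPILER = %d\n", __INTEL_LLVM_COMPILER);
#endif
    printf("branch taken: %s, DEMO_DC_USE_C = %d\n",
           demo_dc_branch(), demo_dc_use_c());
}
```

Calling demo_report() from main and building the file once per compiler shows whether that compiler lands in the final #else branch, which is the case where the stand-in ends up as 0.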


5 Replies
Gennady_F_Intel
Moderator
2,213 Views

Andrew,

MKL_DIRECT_CALL is not abandonware. We need to check this version.

Checking the small gemm calls with and without direct call, I see the following performance results (MKL 2022.1.0):

 icx --version

Intel(R) oneAPI DPC++/C++ Compiler 2022.1.0 (2022.1.0.20220316)

[2 x 2], SGEMM Execution Time == 6.798655e-08 sec

[2 x 2], JIT_SGEMM Execution Time == 4.284084e-08 sec 

....

[8 x 8], SGEMM Execution Time == 7.217750e-08 sec

[8 x 8], JIT_SGEMM Execution Time == 4.936010e-08 sec


That means direct call mode works with the JIT version of gemm as well.


-Gennady

 



AndrewC
New Contributor III
2,204 Views

Code compiled using ICX with MKL_DIRECT_CALL defined seems to skip the calls to the direct version (because __INTEL_COMPILER is NOT defined).

As I mentioned in my original post, the macro MKL_DC_USE_C is defined as 0 unless __INTEL_COMPILER is defined.

#define MKL_DC_GEMM3M_CHECKSIZE(m,n,k) (((*(m) <= 4 && *(n) <= 4 && *(k) <= 4)) && MKL_DC_USE_C)

With MKL_DC_USE_C set to 0, this always evaluates to false, so the direct call is never made. Example below:

 

#define zgemm(transa,transb,m,n,k,alpha,a,lda,b,ldb,beta,c,ldc)  MKL_DC_ZGEMM_CONVERT(transa, transb, m, n, k, alpha, a, lda, b, ldb, beta, c, ldc)
#define zgemm_(transa,transb,m,n,k,alpha,a,lda,b,ldb,beta,c,ldc) MKL_DC_ZGEMM_CONVERT(transa, transb, m, n, k, alpha, a, lda, b, ldb, beta, c, ldc)
#define ZGEMM(transa,transb,m,n,k,alpha,a,lda,b,ldb,beta,c,ldc)  MKL_DC_ZGEMM_CONVERT(transa, transb, m, n, k, alpha, a, lda, b, ldb, beta, c, ldc)

/* ZGEMM3M */
#define MKL_DC_ZGEMM3M_CONVERT(transa, transb, m, n, k, alpha, a, lda, b, ldb, beta, c, ldc)  do { \
    if (MKL_DC_GEMM3M_CHECKSIZE(m,n,k)) { \
        mkl_dc_zgemm((transa), (transb), (m), (n), (k), (alpha), (a), (lda), (b), (ldb), (beta), (c), (ldc));\
    } else {  \
        MKL_DIRECT_CALL_INIT_FLAG; \
        zgemm3m_direct((transa), (transb), (m), (n), (k), (alpha), (a), (lda), (b), (ldb), (beta), (c), (ldc), &mkl_direct_call_flag); \
    }\
} while (0)
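The short-circuit described here can be reproduced in isolation. The sketch below is a minimal illustration, not MKL code: DEMO_CHECKSIZE copies the shape of MKL_DC_GEMM3M_CHECKSIZE but takes the flag as a hypothetical extra parameter use_c (standing in for MKL_DC_USE_C), and demo_takes_inline_path is a hypothetical helper that reports which branch of the if/else would run. Note that in the macro above the else branch still calls zgemm3m_direct, so what is skipped when the flag is 0 is the inlined mkl_dc_zgemm branch.

```c
#include <stdio.h>

/* Hypothetical stand-in for MKL_DC_GEMM3M_CHECKSIZE; use_c plays the role
   of MKL_DC_USE_C so both settings can be tried in one program. */
#define DEMO_CHECKSIZE(m, n, k, use_c) \
    (((*(m) <= 4 && *(n) <= 4 && *(k) <= 4)) && (use_c))

/* Returns 1 if the inlined small-size branch would be taken, 0 if the
   else branch would run instead. */
int demo_takes_inline_path(const int *m, const int *n, const int *k, int use_c) {
    if (DEMO_CHECKSIZE(m, n, k, use_c)) {
        return 1;
    }
    return 0;
}

/* Print the branch decision for a few square sizes under both flag values. */
void demo_print_paths(void) {
    const int sizes[] = {2, 4, 8};
    for (int use_c = 0; use_c <= 1; ++use_c) {
        for (int i = 0; i < 3; ++i) {
            const int s = sizes[i];
            printf("use_c=%d size=%d inline=%d\n",
                   use_c, s, demo_takes_inline_path(&s, &s, &s, use_c));
        }
    }
}
```

With use_c set to 0 the helper returns 0 for every size, which matches the behaviour reported for ICX with the 2022.1.0 header; with use_c set to 1 only sizes up to 4 take the inlined branch.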

 


Am I missing something here?

Gennady_F_Intel
Moderator
2,029 Views

Andrew,

Please check the latest version of MKL 2023 and let us know if the problem is still there.

Thanks,

Gennady


AndrewC
New Contributor III
2,017 Views

Will do! Thanks for following up.

AndrewC
New Contributor III
1,974 Views

Hi Gennady,

I just checked 2023.0, and it's clear that this has been taken care of.

Thanks

Andrew
