Parallel two medium size GEMM?

Mingfei_M_Intel — Sun, 15 Jul 2018 03:58:34 GMT

Hi,

I have a special use case that needs to compute two independent GEMMs.

Each one has M, N, K in the range of [20~4000]. On a Xeon Skylake 8180, I am only reaching 600~700 GFLOPS.

At the algorithm level, the two GEMMs have no dependency, so they can be launched in parallel.

How can I parallelize these two GEMMs? Say, one socket for each one, perhaps. I suppose I can't use batch GEMM for this.

Gennady_F_Intel — Mon, 16 Jul 2018 02:29:34 GMT

Hi, in that case you may try explicitly setting MKL_NUM_THREADS=(number of physical threads)/2 and calling gemm for both problems at the same time. You also need to set the affinity mask properly to avoid thread migration: export KMP_AFFINITY=compact,1,0,granularity=fine
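
For illustration, the environment setup suggested above might look roughly like the following sketch. It assumes a Linux shell and reads "physical threads" as physical cores, which is an interpretation and not something stated in the reply. The variable names PHYS_CORES and HALF are made up for this example, and the lscpu-based core count is just one possible way to get the number.

```shell
# Count physical cores as unique (core, socket) pairs.
# Assumption: "physical threads" in the reply means physical cores.
if command -v lscpu >/dev/null 2>&1; then
  PHYS_CORES=$(lscpu -p=Core,Socket 2>/dev/null | grep -v '^#' | sort -u | wc -l)
else
  # Fallback: logical CPU count (may include hyper-threads).
  PHYS_CORES=$(getconf _NPROCESSORS_ONLN)
fi

# Guard against an empty or zero result.
[ "${PHYS_CORES:-0}" -ge 1 ] || PHYS_CORES=1

# Give each of the two GEMM calls half of the cores (at least one).
HALF=$(( PHYS_CORES / 2 ))
[ "$HALF" -ge 1 ] || HALF=1

export MKL_NUM_THREADS=$HALF
# Affinity setting copied verbatim from the reply, to avoid thread migration.
export KMP_AFFINITY=compact,1,0,granularity=fine

echo "MKL_NUM_THREADS=$MKL_NUM_THREADS KMP_AFFINITY=$KMP_AFFINITY"
```

This only prepares the environment. How the two gemm calls are actually issued at the same time, and whether each one ends up on its own socket, depends on the application and is not covered here.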

topic Parallel two medium size GEMM? in Intel® oneAPI Math Kernel Library
