guillaumekln
Beginner

Execution gets stuck in batch GEMM when using AVX and libgomp

Hi,

We are facing an issue with the function cblas_sgemm_batch_strided from oneMKL 2021.1. Execution gets stuck in this function when the code is run on an Intel CPU with AVX and compiled with GNU OpenMP (tested on Ubuntu 18.04 and CentOS 7). The same code used to work with Intel MKL 2020.4.

The issue can be reproduced by setting MKL_CBWR=AVX (see below).

 

Code to reproduce:

 

 

#include <mkl.h>

int main() {
  const MKL_INT batch_size = 256;

  const CBLAS_TRANSPOSE transa = CblasNoTrans;
  const CBLAS_TRANSPOSE transb = CblasTrans;

  const MKL_INT m = 1;
  const MKL_INT n = 1;
  const MKL_INT k = 64;

  const MKL_INT lda = k;
  const MKL_INT ldb = k;
  const MKL_INT ldc = n;

  const MKL_INT stridea = m * k;
  const MKL_INT strideb = k * n;
  const MKL_INT stridec = m * n;

  const float alpha = 1;
  const float beta = 0;

  const float* a = new float[batch_size * m * k];
  const float* b = new float[batch_size * n * k];
  float* c = new float[batch_size * m * n];

  cblas_sgemm_batch_strided(CblasRowMajor,
                            transa, transb,
                            m, n, k,
                            alpha,
                            a, lda, stridea,
                            b, ldb, strideb,
                            beta,
                            c, ldc, stridec,
                            batch_size);

  delete [] a;
  delete [] b;
  delete [] c;
}
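For readers unfamiliar with the call, here is a minimal plain-loop sketch of what the strided batch GEMM above is expected to compute for this configuration (row-major, A not transposed, B transposed). The helper name sgemm_batch_strided_ref is hypothetical and not part of oneMKL; it only illustrates the arithmetic and may be useful as a reference for checking results once the call returns. It says nothing about why the library call hangs.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical reference helper (NOT a oneMKL function).
// For each matrix i in the batch it computes
//   C[i] = alpha * A[i] * B[i]^T + beta * C[i]
// assuming row-major storage, A not transposed and B transposed,
// which matches the arguments used in the reproducer.
void sgemm_batch_strided_ref(std::size_t m, std::size_t n, std::size_t k,
                             float alpha,
                             const float* a, std::size_t lda, std::size_t stridea,
                             const float* b, std::size_t ldb, std::size_t strideb,
                             float beta,
                             float* c, std::size_t ldc, std::size_t stridec,
                             std::size_t batch_size) {
  // Leading dimensions must cover one full row of each matrix.
  assert(lda >= k && ldb >= k && ldc >= n);

  for (std::size_t i = 0; i < batch_size; ++i) {
    const float* ai = a + i * stridea;  // m x k, row-major
    const float* bi = b + i * strideb;  // n x k, row-major (used transposed)
    float* ci = c + i * stridec;        // m x n, row-major

    for (std::size_t row = 0; row < m; ++row) {
      for (std::size_t col = 0; col < n; ++col) {
        float acc = 0.0f;
        for (std::size_t p = 0; p < k; ++p) {
          acc += ai[row * lda + p] * bi[col * ldb + p];
        }
        float* out = &ci[row * ldc + col];
        // As in BLAS, C is not read when beta is zero.
        *out = (beta == 0.0f) ? alpha * acc : alpha * acc + beta * (*out);
      }
    }
  }
}
```

With the sizes from the reproducer (m = n = 1, k = 64, batch_size = 256), each batch entry reduces to a single 64-element dot product. If a is filled with 1 and b with 2, every element of c would be 64 * 1 * 2 = 128.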

 

 

 

Compilation (Ubuntu 18.04):

 

 

MKLROOT=/opt/intel/oneapi/mkl/2021.1.1
g++ -o gemm_batch gemm_batch.cc -L${MKLROOT}/lib/intel64 -Wl,--no-as-needed -lmkl_intel_ilp64 -lmkl_gnu_thread -lmkl_core -lgomp -lpthread -lm -ldl -DMKL_ILP64 -m64 -I"${MKLROOT}/include"

 

 

 

Execution:

 

 

MKL_CBWR=AVX LD_LIBRARY_PATH=${MKLROOT}/lib/intel64 ./gemm_batch

 

 

 

Thanks for looking into this issue,

Guillaume

7 Replies
Gennady_F_Intel
Moderator

Guillaume,

Is it possible to check whether the problem still exists with Intel OpenMP threading (libmkl_intel_thread)?



Gennady_F_Intel
Moderator

I checked and managed to reproduce the problem on my end. The problem happens with any threading runtime library. The issue will be investigated, and this thread will be updated as soon as possible.

 

guillaumekln
Beginner

Hi Gennady,

Thanks for looking into this issue.

According to my tests the issue happens only with GNU OpenMP and not with Intel OpenMP.

Gennady_F_Intel
Moderator

It seems that the behavior depends on the specific CPU type, since the problem appears when we change the code path (by using MKL_CBWR).


guillaumekln
Beginner

Hi Gennady,

I'm just wondering if we can expect a fix to be included in oneMKL 2021.2?

 

Also, to complete the first post: we always set MKL_CBWR=AUTO,STRICT when running our application, so I guess it turns into MKL_CBWR=AVX,STRICT on AVX systems.

As a workaround, we changed the OpenMP runtime to Intel and it seems to work for us.

Gennady_F_Intel
Moderator

Hi Guillaume,

I don't expect this issue to be fixed in the upcoming 2021 Update 2; the fix is very likely targeted for Update 3.


Gennady_F_Intel
Moderator

Update: the fix for this issue is targeted to be available in the next update of oneMKL. We will keep this thread updated with the status of this release.

