Sesto__Dominic · ‎06-28-2018

When attempting to use the cblas_dgemm() function, I am experiencing a segmentation fault. The segfault occurs only when running the code on a Linux machine (works on a Windows machine).

The code is set up as follows:

#define ML_ROWS_A 5
#define ML_COLS_B 7
#define ML_COLS_A 6

double a_d[ML_ROWS_A*ML_COLS_A], b_d[ML_COLS_A*ML_COLS_B], c_d[ML_ROWS_A*ML_COLS_B];

const double alpha = 1.0;
const double beta = 1.0;

int m = ML_ROWS_A;
int n = ML_COLS_B;
int k = ML_COLS_A;

cblas_dgemm( CblasRowMajor, CblasNoTrans, CblasNoTrans, m, n, k, alpha, a_d, k, b_d, n, beta, c_d, n );

If I add the calls:

int n_alloc;
mkl_mem_stat(&n_alloc);

before calling cblas_dgemm(), the function works correctly (though no memory has been allocated through MKL calls). Similarly, if I instead allocate memory through an MKL call (such as mkl_calloc) before calling cblas_dgemm(), the function works correctly (even though I do not actually pass the allocated memory to the function call). Can anyone explain why the segfault is occurring and why calling certain MKL functions before cblas_dgemm() remedies the issue?

Ying_H_Intel · ‎06-29-2018

Hi Sesto,

Do you have a small, complete code sample that reproduces the problem? From the code you posted,

int m = ML_ROWS_A;
int n = ML_COLS_B;
int n = ML_COLS_A;

it seems that lda = k is not defined, and "n" is declared twice?

Best Regards,

Ying
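For reference, the leading-dimension arguments discussed here can be illustrated with a plain C sketch of what a row-major, no-transpose GEMM computes. This is not MKL's implementation: ref_dgemm_row_major is a hypothetical helper name, and the loop is a naive reference that assumes A is m x k with lda >= k, B is k x n with ldb >= n, and C is m x n with ldc >= n. Because beta scales the existing contents of C, the output array should be initialized before the call whenever beta is nonzero.

```c
#include <assert.h>
#include <stddef.h>

/* Naive reference for a row-major, no-transpose GEMM:
 *     C = alpha * A * B + beta * C
 * Hypothetical helper for checking arguments; not MKL code.
 * A is m x k (row stride lda), B is k x n (row stride ldb),
 * C is m x n (row stride ldc). */
void ref_dgemm_row_major(int m, int n, int k, double alpha,
                         const double *a, int lda,
                         const double *b, int ldb,
                         double beta, double *c, int ldc)
{
    /* In row-major storage each leading dimension must cover one full row. */
    assert(lda >= k && ldb >= n && ldc >= n);

    for (int i = 0; i < m; ++i) {
        for (int j = 0; j < n; ++j) {
            double acc = 0.0;
            for (int p = 0; p < k; ++p) {
                acc += a[(size_t)i * lda + p] * b[(size_t)p * ldb + j];
            }
            /* The old value of C is read here, so C must hold defined data. */
            c[(size_t)i * ldc + j] = alpha * acc + beta * c[(size_t)i * ldc + j];
        }
    }
}
```

With the dimensions from the original post (m = 5, n = 7, k = 6), such a call would pass lda = 6, ldb = 7 and ldc = 7, which matches the k, n, n arguments in the posted cblas_dgemm() call. Comparing the output of a reference loop like this against cblas_dgemm() on the same inputs is one way to check the arguments, though it does not by itself explain a segfault.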

Sesto__Dominic · ‎07-02-2018

Ying H,

That was just a typo when posting the question. I edited the original post to reflect this.

Thank You,

Dominic

Khang_N_Intel · ‎07-02-2018

Hi Dominic,

We are investigating this issue. It would be helpful if you could send us a complete sample program, not just a code snippet, so that we can reproduce the issue.

Thank you,

Khang

Khang_N_Intel · ‎07-03-2018

Hi Dominic,

The code snippet you provided didn't give any error.

As I have mentioned in the previous post, can you send us your entire code where you saw the error?

Thanks,

Khang

Khang_N_Intel · ‎07-03-2018

Hi Dominic,

I forgot to mention that the dimensions you provided are too small to cause any seg fault.

Khang

Sesto__Dominic · ‎07-18-2018

Attached is an Eclipse project that demonstrates the segfault occurring when run on Linux 6.6. The call to cblas_dgemm functions correctly if it is the only test called. The call to cblas_sgemm also works correctly if it is the only test called. If the test that calls cblas_sgemm is called before cblas_dgemm, a segfault occurs.

Eclipse Details:

Eclipse IDE for C/C++ Developers

Version: Oxygen.3 Release (4.7.3)

Gennady_F_Intel · ‎07-18-2018

I extracted the source code and checked how it works on my local RH7 system with the latest available version of MKL, 2018.3.

AVX based system:

icc -mkl -DINTEL_MKL_MATH MatrixMulTest.cpp

./a.out

MKL_VERBOSE Intel(R) MKL 2018.0 Update 3 Product build 20180406 for Intel(R) 64 architecture Intel(R) Advanced Vector Extensions (Intel(R) AVX) enabled processors, Lnx 2.80GHz lp64 intel_thread

MKL_VERBOSE SGEMM(N,N,7,5,6,0x7ffd0e2ffc28,0x6412c0,7,0x641240,6,0x7ffd0e2ffc30,0x641380,7) 144.96ms CNR:OFF Dyn:1 FastMM:1 TID:0 NThr:20

testFloatMatrixMul passed

MKL_VERBOSE DGEMM(N,N,7,5,6,0x7ffd0e2ffc10,0x6415c0,7,0x6414c0,6,0x7ffd0e2ffc18,0x641720,7) 33.26us CNR:OFF Dyn:1 FastMM:1 TID:0 NThr:20

testDoubleMatrixMul passed

uname -a:

Linux iris 3.10.0-693.17.1.el7.x86_64 #1 SMP Sun Jan 14 10:36:03 EST 2018 x86_64 x86_64 x86_64 GNU/Linux

AVX-512 based system (code name Skylake, Intel(R) Xeon(R) Platinum 8168 CPU @ 2.70GHz):

$ ./a.out

MKL_VERBOSE Intel(R) MKL 2018.0 Update 3 Product build 20180406 for Intel(R) 64 architecture Intel(R) Advanced Vector Extensions 512 (Intel(R) AVX-512) enabled processors, Lnx 2.70GHz lp64 intel_thread

MKL_VERBOSE SGEMM(N,N,7,5,6,0x7ffc843cea28,0x6412c0,7,0x641240,6,0x7ffc843cea30,0x641380,7) 151.81ms CNR:OFF Dyn:1 FastMM:1 TID:0 NThr:48

testFloatMatrixMul passed

MKL_VERBOSE DGEMM(N,N,7,5,6,0x7ffc843cea10,0x6415c0,7,0x6414c0,6,0x7ffc843cea18,0x641720,7) 19.52ms CNR:OFF Dyn:1 FastMM:1 TID:0 NThr:48

testDoubleMatrixMul passed
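The SGEMM and DGEMM lines in the MKL_VERBOSE output above list the dimensions as 7,5,6 with leading dimensions 7,6,7. Assuming the attached test uses the dimensions from the original post (m = 5, n = 7, k = 6), this is consistent with the usual way a row-major CBLAS call is mapped onto a column-major BLAS routine: the operands are swapped so that the routine effectively computes C^T = B^T * A^T. The sketch below illustrates that mapping for the no-transpose case. The names gemm_dims and row_major_to_col_major are hypothetical, and the idea that MKL performs exactly this mapping is an inference from the log, not a statement taken from the MKL documentation.

```c
#include <assert.h>

/* Hypothetical container for the dimension arguments as a
 * column-major (Fortran-style) GEMM would receive them. */
typedef struct {
    int m, n, k;        /* problem sizes seen by the column-major routine */
    int lda, ldb, ldc;  /* leading dimensions seen by that routine */
} gemm_dims;

/* Map a row-major, no-transpose call C = A * B (A is m x k, B is k x n)
 * onto the equivalent column-major call, which treats the same memory
 * as C^T = B^T * A^T. The first operand becomes B and the second A. */
gemm_dims row_major_to_col_major(int m, int n, int k,
                                 int lda, int ldb, int ldc)
{
    /* Row-major leading dimensions must span one full row. */
    assert(lda >= k && ldb >= n && ldc >= n);

    gemm_dims d;
    d.m = n;      /* rows of C^T */
    d.n = m;      /* columns of C^T */
    d.k = k;      /* shared inner dimension is unchanged */
    d.lda = ldb;  /* first operand is now B */
    d.ldb = lda;  /* second operand is now A */
    d.ldc = ldc;  /* same buffer, same stride */
    return d;
}
```

Calling row_major_to_col_major(5, 7, 6, 6, 7, 7) yields sizes 7, 5, 6 and leading dimensions 7, 6, 7, the same pattern as in the verbose log. Under this reading, the logged arguments are what a correctly formed call would produce and do not point to a dimension error.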

Sesto__Dominic · ‎07-19-2018

We built with GNU, not ICC. Additionally we are using the 2017 version of MKL, not the 2018 version.

Thank You

Gennady_F_Intel · ‎07-19-2018

I don't expect that building with icc versus gcc will make a difference, but I will check this later and let you know.

Sesto__Dominic · ‎07-25-2018

Some additional environment information:

Compiler: gcc 4.8.5

OS: CentOS Linux v7.4

We used the 2017 version of the MKL Library

Gennady_F_Intel · ‎07-25-2018

We recommend that you evaluate the latest MKL release, 2018.3 (the binaries are available from https://software.intel.com/en-us/performance-libraries), and check whether the problem still exists with this update.

Segfault when using cblas_dgemm()