Sesto__Dominic
Beginner
126 Views

Segfault when using cblas_dgemm()

When attempting to use the cblas_dgemm() function, I am experiencing a segmentation fault. The segfault occurs only when running the code on a Linux machine (works on a Windows machine).

The code is set up as follows:

#define ML_ROWS_A 5
#define ML_COLS_B 7
#define ML_COLS_A 6

double a_d[ML_ROWS_A*ML_COLS_A], b_d[ML_COLS_A*ML_COLS_B], c_d[ML_ROWS_A*ML_COLS_B];

const double  alpha = 1.0;
const double  beta  = 1.0;

int m = ML_ROWS_A;
int n = ML_COLS_B;
int k = ML_COLS_A;

cblas_dgemm( CblasRowMajor, CblasNoTrans, CblasNoTrans, m, n, k, alpha, a_d, k, b_d, n, beta, c_d, n );

If I add the calls:

int n_alloc;

mkl_mem_stat(&n_alloc);


before calling cblas_dgemm(), the function works correctly (though no memory has been allocated through MKL calls). Similarly, if I instead allocate memory through an MKL call (such as mkl_calloc) before calling cblas_dgemm(), the function works correctly (even though I do not actually pass the allocated memory to the function call). Can anyone explain why the segfault is occurring and why calling certain MKL functions before cblas_dgemm() remedies the issue?
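For anyone trying to reproduce this, a plain-C reference of what the call above is expected to compute may help to separate argument errors from library problems. This is only a sketch: ref_dgemm_rowmajor is a hypothetical helper, not an MKL function, and it covers only the row-major, no-transpose case used in the post. Note that with beta = 1.0 the contents of c_d are read, so c_d has to be initialized before the call; the posted snippet does not show how the arrays are filled.

```c
#include <assert.h>

#define ML_ROWS_A 5
#define ML_COLS_B 7
#define ML_COLS_A 6

/* Hypothetical reference helper (NOT part of MKL): computes
   C = alpha*A*B + beta*C for row-major storage without transposes.
   A is m x k with leading dimension lda, B is k x n with ldb,
   C is m x n with ldc. */
void ref_dgemm_rowmajor(int m, int n, int k, double alpha,
                        const double *a, int lda,
                        const double *b, int ldb,
                        double beta, double *c, int ldc)
{
    /* For row-major storage the leading dimension is the row length,
       so it must be at least the number of columns. */
    assert(lda >= k && ldb >= n && ldc >= n);

    for (int i = 0; i < m; ++i) {
        for (int j = 0; j < n; ++j) {
            double sum = 0.0;
            for (int p = 0; p < k; ++p)
                sum += a[i * lda + p] * b[p * ldb + j];
            /* beta != 0 means the old value of C is read here. */
            c[i * ldc + j] = alpha * sum + beta * c[i * ldc + j];
        }
    }
}
```

Calling it with the same arguments as in the post (m, n, k, alpha, a_d, k, b_d, n, beta, c_d, n) stays inside the three arrays, which suggests the dimensions and leading dimensions passed to cblas_dgemm() are consistent with the array sizes.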

11 Replies
Ying_H_Intel
Employee

Hi Sesto,

Do you have a small test case that reproduces the problem? In the code you posted,

int m = ML_ROWS_A;
int n = ML_COLS_B;
int n = ML_COLS_A;

it seems you haven't defined lda = k, and "n" is declared twice?

Best Regards,

Ying

Sesto__Dominic
Beginner

Ying H,

That was just a typo when posting the question. I edited the original post to reflect this.

Thank You,

Dominic

 

Khang_N_Intel
Employee

Hi Dominic,

We are investigating this issue. It would be helpful if you could send us a complete sample program, not just a code snippet, so that we can reproduce the issue.

Thank you,

Khang

Khang_N_Intel
Employee

Hi Dominic,

The code snippet you provided ran without any error on our side.

As I mentioned in my previous post, can you send us the entire code in which you saw the error?

Thanks,

Khang

Khang_N_Intel
Employee

Hi Dominic,

I forgot to mention that the dimensions you provided are too small to cause a segfault on their own.

Khang

Sesto__Dominic
Beginner

Attached is an Eclipse project that demonstrates the segfault occurring when run on Linux 6.6. The call to cblas_dgemm functions correctly if it is the only test called. The call to cblas_sgemm also works correctly if it is the only test called. If the test that calls cblas_sgemm is called before cblas_dgemm, a segfault occurs.

Eclipse Details:

Eclipse IDE for C/C++ Developers

Version: Oxygen.3 Release (4.7.3)

 

 

 

Gennady_F_Intel
Moderator

I extracted the .cpp code from the project and checked how it works on my local RH7 system with the latest available version of MKL, 2018.3.

AVX-based system:
 
icc -mkl -DINTEL_MKL_MATH MatrixMulTest.cpp
./a.out
MKL_VERBOSE Intel(R) MKL 2018.0 Update 3 Product build 20180406 for Intel(R) 64 architecture Intel(R) Advanced Vector Extensions (Intel(R) AVX) enabled processors, Lnx 2.80GHz lp64 intel_thread
MKL_VERBOSE SGEMM(N,N,7,5,6,0x7ffd0e2ffc28,0x6412c0,7,0x641240,6,0x7ffd0e2ffc30,0x641380,7) 144.96ms CNR:OFF Dyn:1 FastMM:1 TID:0  NThr:20
testFloatMatrixMul passed
MKL_VERBOSE DGEMM(N,N,7,5,6,0x7ffd0e2ffc10,0x6415c0,7,0x6414c0,6,0x7ffd0e2ffc18,0x641720,7) 33.26us CNR:OFF Dyn:1 FastMM:1 TID:0  NThr:20
testDoubleMatrixMul passed
 
uname -a:
Linux iris 3.10.0-693.17.1.el7.x86_64 #1 SMP Sun Jan 14 10:36:03 EST 2018 x86_64 x86_64 x86_64 GNU/Linux
 
AVX-512 system (code name Skylake, Intel(R) Xeon(R) Platinum 8168 CPU @ 2.70GHz):
$ ./a.out
MKL_VERBOSE Intel(R) MKL 2018.0 Update 3 Product build 20180406 for Intel(R) 64 architecture Intel(R) Advanced Vector Extensions 512 (Intel(R) AVX-512) enabled processors, Lnx 2.70GHz lp64 intel_thread
MKL_VERBOSE SGEMM(N,N,7,5,6,0x7ffc843cea28,0x6412c0,7,0x641240,6,0x7ffc843cea30,0x641380,7) 151.81ms CNR:OFF Dyn:1 FastMM:1 TID:0  NThr:48
testFloatMatrixMul passed
MKL_VERBOSE DGEMM(N,N,7,5,6,0x7ffc843cea10,0x6415c0,7,0x6414c0,6,0x7ffc843cea18,0x641720,7) 19.52ms CNR:OFF Dyn:1 FastMM:1 TID:0  NThr:48
testDoubleMatrixMul passed
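A note on reading the verbose lines: they report SGEMM/DGEMM(N,N,7,5,6,...) with leading dimensions 7, 6, 7, although the test passes m=5, n=7, k=6 with CblasRowMajor. This is consistent with the usual way a row-major product is mapped onto a column-major GEMM, namely by swapping the operands, since C^T = B^T * A^T. The sketch below illustrates that mapping in plain C. Both functions are hypothetical helpers written for this illustration, not MKL routines, and the log only suggests (it does not prove) that MKL performs the same swap internally.

```c
/* Hypothetical reference helper (NOT part of MKL): column-major
   C = alpha*A*B + beta*C without transposes.
   A is m x k (lda), B is k x n (ldb), C is m x n (ldc). */
void ref_dgemm_colmajor(int m, int n, int k, double alpha,
                        const double *a, int lda,
                        const double *b, int ldb,
                        double beta, double *c, int ldc)
{
    for (int j = 0; j < n; ++j) {
        for (int i = 0; i < m; ++i) {
            double sum = 0.0;
            for (int p = 0; p < k; ++p)
                sum += a[i + p * lda] * b[p + j * ldb];
            c[i + j * ldc] = alpha * sum + beta * c[i + j * ldc];
        }
    }
}

/* Row-major C = alpha*A*B + beta*C expressed through the column-major
   routine: a row-major matrix read as column-major is its transpose,
   so swap A and B and swap m and n. For m=5, n=7, k=6 this becomes
   the call (7, 5, 6, alpha, b, 7, a, 6, beta, c, 7), which matches the
   argument pattern in the verbose output. */
void rowmajor_via_colmajor(int m, int n, int k, double alpha,
                           const double *a, int lda,
                           const double *b, int ldb,
                           double beta, double *c, int ldc)
{
    ref_dgemm_colmajor(n, m, k, alpha, b, ldb, a, lda, beta, c, ldc);
}
```

The swapped argument order in the log is therefore expected and does not by itself indicate a problem with the dimensions passed by the test.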
 
 
Sesto__Dominic
Beginner

We built with the GNU compiler (gcc), not icc. Additionally, we are using the 2017 version of MKL, not the 2018 version.

Thank You

 

 

Gennady_F_Intel
Moderator

I don't expect the choice between icc and gcc to make a difference, but I will check this later and let you know.

Sesto__Dominic
Beginner

Some additional environment information:

Compiler: gcc 4.8.5

OS: CentOS Linux v7.4

We used the 2017 version of the MKL Library

 

 

 

 

Gennady_F_Intel
Moderator

We recommend that you evaluate the latest MKL 2018.3 (the binaries are available from https://software.intel.com/en-us/performance-libraries) and check whether the problem still exists with this update.