Intel® oneAPI Math Kernel Library

## Multi-thread MKL cblas_sgemm with g++ problem

Here is an example sgemm program.

```cpp
#include <mkl.h>
#include <iostream>
#include <cstdlib>
#define ITERATION 1

int main()
{
    // left is ra x lda, right is lda x ldb, ans is ra x ldb (all row-major).
    int ra = 128;
    int lda = 75;
    int ldb = 55;
    float* left = (float*)calloc(ra * lda, sizeof(float));
    float* right = (float*)calloc(ldb * lda, sizeof(float));
    float* ans = (float*)calloc(ra * ldb, sizeof(float));

    // Fill left with random values in [0, 1] and print it.
    std::cout << "left " << std::endl;
    for (int i = 0; i < ra; ++i) {
        for (int j = 0; j < lda; ++j) {
            left[i * lda + j] = static_cast<float>(rand()) / static_cast<float>(RAND_MAX);
            std::cout << left[i * lda + j] << " ";
        }
        std::cout << std::endl;
    }

    // Fill right with random values in [0, 1] and print it.
    std::cout << "right " << std::endl;
    for (int i = 0; i < lda; ++i) {
        for (int j = 0; j < ldb; ++j) {
            right[i * ldb + j] = static_cast<float>(rand()) / static_cast<float>(RAND_MAX);
            std::cout << right[i * ldb + j] << " ";
        }
        std::cout << std::endl;
    }

    // ans = 1.0 * left * right + 0.0 * ans, with M = ra, N = ldb, K = lda.
    for (int i = 0; i < ITERATION; ++i) {
        cblas_sgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans, ra, ldb, lda, 1.0f, left, lda,
                    right, ldb, 0.0f, ans, ldb);
    }

    std::cout << "ans " << std::endl;
    for (int i = 0; i < ra; ++i) {
        for (int j = 0; j < ldb; ++j) {
            std::cout << ans[i * ldb + j] << " ";
        }
        std::cout << std::endl;
    }

    free(left);
    free(right);
    free(ans);
    return 0;
}
```

I compile this program with g++ using the options `-fopenmp -lmkl_rt`, with `OMP_NUM_THREADS` set to 16.

After running the program, I found that the result is completely wrong compared with the MATLAB result. I would not call it wrong if there were only small accuracy errors. I also observed that the program works correctly under any of these conditions:

1. Use icc instead of g++,
2. Remove the `-fopenmp` flag,
3. Use g++ with ATLAS instead of icc with MKL

Therefore, I guess the problem lies with the `-fopenmp` flag. Can you help me figure out the problem? Thank you!
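To tell a genuinely wrong product apart from ordinary rounding differences without relying on MATLAB, one option is a plain-loop reference product checked against the `cblas_sgemm` output. The following is a minimal sketch, not part of the original program: `naive_sgemm` and `max_abs_diff` are hypothetical helper names, and the code uses `std::vector` rather than the raw `calloc` buffers from the post.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Reference row-major product C = A * B, with A (m x k), B (k x n), C (m x n).
// Plain loops and no BLAS call, so the result does not depend on any
// threading layer. Hypothetical helper, not an MKL function.
std::vector<float> naive_sgemm(const std::vector<float>& a,
                               const std::vector<float>& b,
                               int m, int n, int k) {
    const std::size_t M = m, N = n, K = k;
    assert(a.size() == M * K);
    assert(b.size() == K * N);
    std::vector<float> c(M * N, 0.0f);
    for (std::size_t i = 0; i < M; ++i) {
        for (std::size_t p = 0; p < K; ++p) {
            const float aip = a[i * K + p];
            for (std::size_t j = 0; j < N; ++j) {
                c[i * N + j] += aip * b[p * N + j];
            }
        }
    }
    return c;
}

// Largest absolute element-wise difference between two equally sized results.
float max_abs_diff(const std::vector<float>& x, const std::vector<float>& y) {
    assert(x.size() == y.size());
    float worst = 0.0f;
    for (std::size_t i = 0; i < x.size(); ++i) {
        worst = std::max(worst, std::fabs(x[i] - y[i]));
    }
    return worst;
}
```

With the sizes from the post, the call would be `naive_sgemm(left, right, ra, ldb, lda)`, that is m = `ra`, n = `ldb`, k = `lda`. A `max_abs_diff` that is tiny relative to the entries (each is a sum of 75 products of values in [0, 1]) points to rounding, while a difference comparable to the entries themselves points to a real error such as mismatched runtime libraries.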

g++ (GCC) 4.4.7 20120313 (Red Hat 4.4.7-16)

icc (ICC) 16.0.3 20160415

Linux core 2.6.32-279.el6.x86_64

3 Replies

Hi Keren,

Thanks for raising the question here. The problem appears to be the following.

mkl_rt is the Intel MKL Single Dynamic Library (SDL). As the MKL user guide explains, the SDL enables you to select the interface and threading library for Intel MKL at run time. By default, linking with SDL provides:
• Intel LP64 interface on systems based on the Intel® 64 architecture
• Intel interface on systems based on the IA-32 architecture

To use other interfaces or change threading preferences, including use of the sequential version of Intel MKL, you need to specify your choices using functions or environment variables as explained in section Dynamically Selecting the Interface and Threading Layer.

So if you compile it with GNU g++ and GNU OpenMP threading, could you please try exporting:

export MKL_INTERFACE_LAYER=GNU

then run your executable and see whether it gives the expected result.

Best Regards,
Ying

I tried `export MKL_INTERFACE_LAYER=GNU`, but it did not help.

Alternatively, I chose to link with `-lmkl_intel_lp64 -lmkl_core -lmkl_gnu_thread -lpthread -ldl` instead of the single dynamic library, and that solved the problem.
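The explicit link line pins the GNU OpenMP threading layer (`-lmkl_gnu_thread`) at link time. The user guide passage quoted earlier says that with `mkl_rt` the threading choice is made at run time instead, and the MKL documentation lists a `MKL_THREADING_LAYER` environment variable (with `GNU` among its values) that is separate from `MKL_INTERFACE_LAYER`. The sketch below shows one way to request that layer from inside the program. It is untested on the setup in this thread and assumes the library reads the variable on the first MKL call; `request_gnu_threading_layer` is a hypothetical helper, not an MKL function.

```cpp
#include <stdlib.h>  // setenv (POSIX), getenv
#include <string>

// Request the GNU OpenMP threading layer from the MKL single dynamic library
// by setting MKL_THREADING_LAYER, unless the user already exported a value.
// Assumption: this runs before the first MKL call in the process.
// Returns the value that is in effect after the call.
std::string request_gnu_threading_layer() {
    // The third argument 0 means "do not overwrite an existing value".
    setenv("MKL_THREADING_LAYER", "GNU", 0);
    const char* value = getenv("MKL_THREADING_LAYER");
    return value ? std::string(value) : std::string();
}
```

Calling `request_gnu_threading_layer()` at the top of `main`, before `cblas_sgemm`, would be the intended use; exporting the variable in the shell before running has the same aim. MKL also documents a function `mkl_set_threading_layer` for making this choice through the API.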

Hi Keren, 