Here's an example sgemm program.
#include <mkl.h>
#include <iostream>
#include <cstdlib>

#define ITERATION 1

int main() {
  int ra = 128;
  int lda = 75;
  int ldb = 55;
  float* left = (float*)calloc(ra * lda, sizeof(float));
  float* right = (float*)calloc(ldb * lda, sizeof(float));
  float* ans = (float*)calloc(ra * ldb, sizeof(float));

  std::cout << "left " << std::endl;
  for (int i = 0; i < ra; ++i) {
    for (int j = 0; j < lda; ++j) {
      left[i * lda + j] = static_cast<float>(rand()) / static_cast<float>(RAND_MAX);
      std::cout << left[i * lda + j] << " ";
    }
    std::cout << std::endl;
  }

  std::cout << "right " << std::endl;
  for (int i = 0; i < lda; ++i) {
    for (int j = 0; j < ldb; ++j) {
      right[i * ldb + j] = static_cast<float>(rand()) / static_cast<float>(RAND_MAX);
      std::cout << right[i * ldb + j] << " ";
    }
    std::cout << std::endl;
  }

  for (int i = 0; i < ITERATION; ++i) {
    cblas_sgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans, ra, ldb, lda, 1.0f,
                left, lda, right, ldb, 0.0f, ans, ldb);
  }

  std::cout << "ans " << std::endl;
  for (int i = 0; i < ra; ++i) {
    for (int j = 0; j < ldb; ++j) {
      std::cout << ans[i * ldb + j] << " ";
    }
    std::cout << std::endl;
  }
  return 0;
}
I compile this program with g++ using the options `-fopenmp -lmkl_rt`, with `OMP_NUM_THREADS` set to 16.
After running the program, I found that the answer is completely wrong compared with the MATLAB result. I wouldn't call it wrong if there were only small rounding differences. Furthermore, I observe that the program produces correct results under any of these conditions:
- Use icc instead of g++,
- Remove -fopenmp flag,
- Use g++&atlas instead of icc&mkl
- Set OMP_NUM_THREADS=1
Therefore, I guess the problem may lie with the `-fopenmp` flag. Can you help me figure out the problem? Thank you!
g++ (GCC) 4.4.7 20120313 (Red Hat 4.4.7-16)
icc (ICC) 16.0.3 20160415
Linux core 2.6.32-279.el6.x86_64
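For anyone who wants to check the `ans` buffer without going through MATLAB, a plain triple-loop product can serve as the reference. This is only a minimal sketch: `reference_sgemm` and `max_abs_diff` are hypothetical helper names, not MKL functions, and the code assumes densely packed row-major matrices as in the program above.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: naive row-major C = A * B.
// A is m x k, B is k x n, the returned C is m x n (all densely packed).
std::vector<float> reference_sgemm(const std::vector<float>& a,
                                   const std::vector<float>& b,
                                   int m, int n, int k) {
  assert(a.size() == static_cast<std::size_t>(m) * k);
  assert(b.size() == static_cast<std::size_t>(k) * n);
  std::vector<float> c(static_cast<std::size_t>(m) * n, 0.0f);
  for (int i = 0; i < m; ++i) {
    for (int p = 0; p < k; ++p) {
      const float aip = a[i * k + p];
      for (int j = 0; j < n; ++j) {
        c[i * n + j] += aip * b[p * n + j];
      }
    }
  }
  return c;
}

// Hypothetical helper: largest absolute element-wise difference
// between two buffers of the same length.
float max_abs_diff(const float* x, const float* y, std::size_t count) {
  float worst = 0.0f;
  for (std::size_t i = 0; i < count; ++i) {
    worst = std::max(worst, std::fabs(x[i] - y[i]));
  }
  return worst;
}
```

With the dimensions from the program above, the call would be `reference_sgemm(left_vec, right_vec, ra, ldb, lda)`, where `left_vec` and `right_vec` are copies of the input buffers. A small `max_abs_diff` against `ans` points to ordinary single-precision rounding; a large one points to a genuinely wrong result like the one described here.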
Hi Keren,
Thanks for raising the question here. The problem appears to be the following.
mkl_rt is the Intel MKL Single Dynamic Library (SDL). As the MKL user guide explains, the SDL enables you to select the interface and threading library for Intel MKL at run time. By default, linking with SDL provides:
• Intel LP64 interface on systems based on the Intel® 64 architecture
• Intel interface on systems based on the IA-32 architecture
• Intel threading
To use other interfaces or change threading preferences, including use of the sequential version of Intel MKL, you need to specify your choices using functions or environment variables as explained in section Dynamically Selecting the Interface and Threading Layer.
So if you compile with GNU g++ and use GNU OpenMP threading, could you please try exporting:
export MKL_INTERFACE_LAYER=GNU
export MKL_THREADING_LAYER=GNU
and then run your executable to see whether it gives the expected result?
Best Regards,
Ying
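The same preference can also be set from inside the program rather than from the shell. The sketch below is an assumption-laden illustration, not a confirmed fix: it assumes `mkl_rt` reads `MKL_THREADING_LAYER` when the first MKL function is called, so the variable has to be set before that point. `request_gnu_threading_layer` and `current_threading_layer` are hypothetical helper names; `setenv` is the POSIX call, so this applies to Linux builds like the one in this thread.

```cpp
#include <cstdlib>   // std::getenv
#include <stdlib.h>  // setenv (POSIX)
#include <string>

// Hypothetical helper: ask for the GNU threading layer unless the user
// has already exported MKL_THREADING_LAYER (overwrite = 0 keeps their value).
// Assumption: this runs before the first MKL call in the process.
bool request_gnu_threading_layer() {
  return setenv("MKL_THREADING_LAYER", "GNU", 0) == 0;
}

// Hypothetical helper: report the value the process will see,
// or an empty string if the variable is unset.
std::string current_threading_layer() {
  const char* value = std::getenv("MKL_THREADING_LAYER");
  return value ? std::string(value) : std::string();
}
```

In the sgemm program above, `request_gnu_threading_layer()` would be the first statement in `main`, ahead of `cblas_sgemm`. The MKL documentation also describes a `mkl_set_threading_layer` service function for selecting the layer at run time, which may be the more direct route when `mkl.h` is already included.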
Other related On-line article about
I tried `export MKL_INTERFACE_LAYER=GNU`, but it did not help.
Instead, I linked with `-lmkl_intel_lp64 -lmkl_core -lmkl_gnu_thread -lpthread -ldl` rather than the single dynamic library, and that solved the problem.
Hi Keren,
Glad you tried them out.
Just adding a summary so others can follow. The two options are:
With -lmkl_rt: export MKL_INTERFACE_LAYER=GNU and export MKL_THREADING_LAYER=GNU
Or link explicitly: -lmkl_intel_lp64 -lmkl_gnu_thread -lmkl_core -lpthread -ldl
Best Regards,
Ying