I might have found a bug in the Intel MKL BLAS library (cblas_daxpy). I have a 100,000+ line multithreaded C++ project that also uses OpenMP and relies on BLAS functions for linear algebra. The version of Debian's MKL package is 2020.4.304-4.
When running multithreaded reinforcement learning code, Intel MKL gives wrong results in certain scalar*vector code (the neural network code seems to work fine). I updated to g++ 14, but that didn't fix the problem. When I switched to OpenBLAS (which is slower), the results were correct again.
It is of course possible that there is a race condition or a memory access error in the complex multithreaded code. I ran valgrind and GCC's sanitizers and fixed many small problems in the code, but the remaining error might still be in my own code and not in MKL.
I updated to Intel MKL 2024.1.0-691 (the oneapi-mkl Debian package) and now the code works correctly with MKL as well.
So there might have been a bug in the earlier MKL version, but updating solved the problem.
Tomas, thanks for letting us know, but we have not seen any such daxpy-related problem in the last 5 years. Could you provide a reproducer so that we can check this case on our end?
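A reproducer for this kind of report can be quite small. The following is only a minimal sketch, not code from the project discussed here: it compares a daxpy implementation against a plain-loop reference and reports the first differing element. The names `DaxpyFn`, `reference_daxpy` and `first_mismatch`, the test data, and the tolerance are all hypothetical choices made for illustration. It assumes the LP64 interface (32-bit `int` arguments), so that `cblas_daxpy` can be passed as the function under test when the program is linked against MKL or OpenBLAS.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Function-pointer type with the same shape as the CBLAS daxpy prototype
// (y := alpha * x + y). Assumption: LP64 interface, i.e. plain int sizes.
using DaxpyFn = void (*)(int n, double alpha, const double* x, int incx,
                         double* y, int incy);

// Plain-loop reference implementation (handles positive strides only).
void reference_daxpy(int n, double alpha, const double* x, int incx,
                     double* y, int incy) {
    for (int i = 0; i < n; ++i) {
        y[i * incy] += alpha * x[i * incx];
    }
}

// Runs `fn` and the reference on identical inputs with unit strides.
// Returns the index of the first element that differs by more than a
// relative tolerance, or -1 if all elements agree.
long first_mismatch(DaxpyFn fn, int n, double alpha, double tol = 1e-12) {
    assert(n >= 0);
    std::vector<double> x(n), y_ref(n), y_test(n);
    for (int i = 0; i < n; ++i) {
        x[i] = 0.5 * i + 1.0;               // arbitrary deterministic data
        y_ref[i] = y_test[i] = 2.0 - 0.25 * i;
    }
    reference_daxpy(n, alpha, x.data(), 1, y_ref.data(), 1);
    fn(n, alpha, x.data(), 1, y_test.data(), 1);
    for (int i = 0; i < n; ++i) {
        const double diff = std::fabs(y_ref[i] - y_test[i]);
        if (diff > tol * (1.0 + std::fabs(y_ref[i]))) {
            return i;
        }
    }
    return -1;
}
```

With the CBLAS header included and the library linked, a call such as `first_mismatch(cblas_daxpy, 1000, 3.0)` would be expected to return -1. To approach the situation described in this thread, the same check would have to be called concurrently from several threads, which this sketch does not do.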
Then it is likely that there is some bug in my project.
I'm now implementing and testing new features (the reinforcement learning method I'm using doesn't seem to work well enough, at least not yet). Once the features have been tested and work, I will start seriously fixing bugs, and I will probably send you code to reproduce the error if it turns out to be in MKL.
OK, thanks for letting us know. This thread is now closed. If you manage to build a working reproducer, please open a new forum thread.
-- Gennady