I am facing a segmentation fault while running MKL LAPACK routines. To understand the origin of the problem, I ran the program under valgrind with the options
valgrind --leak-check=full --show-leak-kinds=all -s --track-origins=yes <executable>
and got the output attached to this post (the slurm output file; please see from
line 274 onwards). I am concentrating on the memory that is "definitely lost" (see around
line 420 onwards), which seems to come from the calls to mkl_lapack_dtrtri and
mkl_lapack_dgetri.
The only place in my code where I call LAPACKE_dgetrf and LAPACKE_dgetri is in
det.c (which I also attach). Please note that although the functions are called
atlasdet, atlasinv, etc., only MKL routines are used. (The names are left over from
when these routines used ATLAS functions, which I then ported to MKL.)
Any help in resolving this issue would be much appreciated.
Best,
Debasish
Hi Debasish,
Thank you for posting the issue.
I noticed you are using oneAPI 2023.2.1. Can you try a newer oneAPI version (oneAPI 2025.0, for example)? If the issue still exists with the latest oneAPI version, please share a simple reproducer, information about your OS, hardware, etc., and instructions for reproducing the issue with that reproducer, so that we can investigate further.
Regards,
Ruqiu
Hi,
I am sorry, I have not had time to investigate the issue. Since it only happens at very large problem sizes, I cannot test it easily. Moreover, from what my colleagues and I observe, if we allocate a huge amount of memory, we do not encounter the problem.
If I come up with a concrete reproducer, I will share it and raise the issue again.
Thanks and regards,
Debasish
