I am using VTune's Threading analysis to understand why some computationally intensive code that calls MKL is scaling poorly. I am using OpenMP tasks with the Intel 2023.2 compiler, running on a 28-core i7 laptop (Windows 11).
The Threading analysis shows my threads in a "Waiting" state, with the hover bubble showing
Sync Object: Manual Reset Event
If I then switch to "Bottom-up" grouped by Sync Object, the top entry is "Manual Reset Event" inside mkl_intel_thread.2, with a "Wait Time by Utilization" of "poor" (i.e. red).
I am not sure I understand this. I am calling MKL 2022.2 LAPACK functions (e.g. zggev), and I would expect that they can be called from multiple threads without blocking or causing this kind of issue.
Any suggestions?
Could you share the MKL build options and VTune threading data as well?
Hi,
After a lot of work digging into this, I have to declare a "mea culpa". We have a Conan/CMake/Artifactory build solution, and the CMake generation step for a Visual Studio project had a bug: the -Qiopenmp flag was not being set.
Andrew
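For anyone who hits the same symptom, one cheap guard against a silently missing OpenMP flag is to check the `_OPENMP` macro, which the OpenMP specification requires a compiler to define (as a yyyymm date) only when OpenMP compilation is enabled. The following is a minimal sketch, not the code from this thread; the helper names `openmp_enabled`, `openmp_version`, and `openmp_status` are hypothetical and chosen for illustration.

```cpp
#include <string>

// Hypothetical helper: true only if this translation unit was compiled
// with OpenMP enabled (the compiler defines _OPENMP in that case).
constexpr bool openmp_enabled() {
#ifdef _OPENMP
    return true;
#else
    return false;
#endif
}

// Hypothetical helper: the yyyymm value of _OPENMP, or 0 when the macro
// is not defined (i.e. the OpenMP flag was not passed to the compiler).
constexpr long openmp_version() {
#ifdef _OPENMP
    return _OPENMP;
#else
    return 0;
#endif
}

// Hypothetical helper: a one-line status string suitable for a startup log.
inline std::string openmp_status() {
    if (openmp_enabled()) {
        return "OpenMP enabled, _OPENMP=" + std::to_string(openmp_version());
    }
    return "OpenMP NOT enabled: pragmas in this file are being ignored";
}
```

Logging `openmp_status()` at startup, or adding `static_assert(openmp_enabled(), "OpenMP flag missing");` to a source file that relies on OpenMP pragmas, would turn a misconfigured build into an obvious message or a compile error instead of a puzzling profile.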