Intel® oneAPI Math Kernel Library
Ask questions and share information with other developers who use Intel® Math Kernel Library.

Program exits abruptly for a specific matrix-vector product in MKL 2023.1.0

Fadime
Beginner

We are having an issue where the matrix-vector product using sparse_dot_mkl cannot be carried out for a very specific matrix. Attached is the script to carry out the product and the matrix that gives the problem.
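
An abrupt exit like this may leave no Python traceback, so one way to capture it is to run the product in a child interpreter and inspect the exit code. The sketch below is a minimal illustration, not the attached script: `run_isolated` and `PRODUCT_SNIPPET` are hypothetical names, `matrix.npz` is a placeholder file name, and the snippet assumes the matrix was saved with `scipy.sparse.save_npz` and has a floating-point dtype.

```python
import subprocess
import sys

# Hypothetical child script: load a sparse matrix and multiply it by a vector.
# "matrix.npz" is a placeholder; the real reproducer is the attached script.
PRODUCT_SNIPPET = """
import numpy as np
import scipy.sparse as sp
from sparse_dot_mkl import dot_product_mkl

A = sp.load_npz("matrix.npz").tocsr()
x = np.ones(A.shape[1], dtype=A.dtype)
y = dot_product_mkl(A, x)
print(y.shape)
"""


def run_isolated(code, timeout=300):
    """Run `code` in a fresh interpreter; return (returncode, stdout, stderr)."""
    proc = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return proc.returncode, proc.stdout, proc.stderr
```

Calling `run_isolated(PRODUCT_SNIPPET)` keeps the parent process alive. A nonzero return code with no traceback in stderr would suggest that the child died in native code rather than from a Python exception.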

 

The problem occurs with MKL version 2023.1.0 on some of our Windows computers with Intel processors (strangely, not on all of them). It disappears with MKL versions starting from 2024.2.2.

Is this a known issue with MKL versions below 2024.2.2? Are there known circumstances under which it occurs (for example, when the matrix has certain properties)?

 

In case this is relevant, here are some further details of the computer and conda environment where I see the issue:

- scipy version: 1.15.1
- sparse_dot_mkl (https://pypi.org/project/sparse-dot-mkl/) version: 0.9.8
- Processor: 11th Gen Intel(R) Core(TM) i5-1135G7 @ 2.40GHz, 2401 MHz, 4 Core(s), 8 Logical Processor(s)

 

Any insights are appreciated!

Fengrui
Moderator

Hi,

 

It looks like the reproducer upload has been stuck at the virus scan for a while. Could you please try uploading it again?

 

Thanks,

Fengrui

Fengrui
Moderator

Hi,

 

The download issue finally got resolved. I tried the reproducer on my end (11th Gen Intel(R) Core(TM) i7-1185G7 @ 3.00GHz), and it worked well with oneMKL 2024.2.2 but not with the 2024.2.1 or 2023.1.0 release. We did have some changes in the sparse BLAS functions in the 2024.2.2 release. Those might have fixed this issue. 
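
Based only on the results reported here (2023.1.0 and 2024.2.1 fail, 2024.2.2 works), an application could warn when it runs against a release older than 2024.2.2. A minimal sketch follows; the helper names are hypothetical, and how the version string is obtained (for example from the conda package metadata) is left open.

```python
def parse_version(version):
    """Turn a string such as '2024.2.2' into a tuple of ints for comparison.

    Assumes purely numeric, dot-separated components; anything else raises
    ValueError.
    """
    return tuple(int(part) for part in version.strip().split("."))


# First release reported as working in this thread (an observation, not an
# official statement of where the fix landed).
FIRST_WORKING_RELEASE = (2024, 2, 2)


def predates_reported_fix(version):
    """Return True if `version` is older than the first release seen working."""
    return parse_version(version) < FIRST_WORKING_RELEASE
```

A caller might log a warning when `predates_reported_fix(...)` returns True instead of refusing to run, since the failure was not seen on every machine.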

 

I will try to create a C reproducer with the same data for further investigation.

 

Best,

Fengrui

Fengrui
Moderator

The C reproducer is attached. I tried a couple of machines running Linux and Windows, with different code paths. The issue seems to affect only Windows with the AVX512 code path.
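
One way to test the AVX512 observation on an affected Windows machine might be to cap the instruction set that oneMKL dispatches to. oneMKL documents the `MKL_ENABLE_INSTRUCTIONS` environment variable for this purpose; whether capping it at AVX2 actually avoids the crash has not been verified in this thread. The helper name below is hypothetical.

```python
import os


def env_with_isa_cap(isa="AVX2", base_env=None):
    """Return a copy of the environment with MKL_ENABLE_INSTRUCTIONS set.

    The variable should be in place before oneMKL is first used, so the
    returned mapping is meant to be handed to a child process rather than
    applied to an interpreter that has already loaded MKL.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["MKL_ENABLE_INSTRUCTIONS"] = isa
    return env
```

The result can be passed as `env=env_with_isa_cap()` to `subprocess.run` when launching the reproducer, which leaves the calling process's own environment untouched.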

 

BTW, is there any specific reason to use the 2023.1.0 version rather than the latest 2025.1.0 release? We keep adding optimizations and bug fixes in new releases.

 

Thanks,

Fengrui

 

 

Fadime
Beginner

Sorry for the delayed reply, and thank you for carrying out the detailed tests! My computer, which is one of those where the issue occurs, does indeed run Windows with AVX512.

 

As one of my colleagues has explained, we are using a specific version of MKL (2023.1.0) rather than the latest release because of issues we have encountered in our Python code. Specifically, we are seeing OpenMP-related errors (e.g., error #15 pasted below) that appear only on certain machines (so far, only those with AMD CPUs). These issues can result in incorrect outputs or crashes. Reverting to the older MKL version has helped mitigate these problems, at least partially.

Additionally, we're aiming to maintain compatibility across a range of operating systems, and to our knowledge, versions of MKL newer than 2023 are not currently available for macOS.

 

Error #15 mentioned above:

OMP: Error #15: Initializing libiomp5md.dll, but found libiomp5md.dll already initialized.
OMP: Hint This means that multiple copies of the OpenMP runtime have been linked into the program. That is dangerous, since it can degrade performance or cause incorrect results. The best thing to do is to ensure that only a single OpenMP runtime is linked into the process, e.g. by avoiding static linking of the OpenMP runtime in any library. As an unsafe, unsupported, undocumented workaround you can set the environment variable KMP_DUPLICATE_LIB_OK=TRUE to allow the program to continue to execute, but that may cause crashes or silently produce incorrect results. For more information, please see http://www.intel.com/software/products/support/.

Fengrui
Moderator

Thank you for the information! 

Yes, macOS is not supported for oneMKL versions >= 2024.0. For other operating systems, we always encourage users to move to new versions (the latest is best). One reason is that issues reported in old versions will likely be fixed only in newer releases.

Regarding the OMP error, did you check whether any dependency of your code is statically linked to an OpenMP runtime? When you revert to the old oneMKL version, its OMP runtime might happen to be compatible with the one used by that dependency. I have to say this is just my guess :). It would be helpful if you could share a reproducer for the OMP error.
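
A first check along these lines could be to look for more than one OpenMP runtime file inside the Python environment, since Error #15 reports that `libiomp5md.dll` was initialized twice. The sketch below only finds runtimes shipped as separate library files; it cannot detect a runtime that was statically linked into another binary. The function name and the list of file-name prefixes are assumptions made for illustration.

```python
from pathlib import Path

# Heuristic file-name prefixes of common OpenMP runtimes (Intel, LLVM, GNU, MSVC).
OPENMP_PREFIXES = ("libiomp5", "libomp", "libgomp", "vcomp")
LIBRARY_SUFFIXES = (".dll", ".so", ".dylib")


def find_openmp_copies(root):
    """Return sorted paths under `root` that look like OpenMP runtime libraries."""
    hits = []
    for path in Path(root).rglob("*"):
        if not path.is_file():
            continue
        name = path.name.lower()
        # Accept versioned shared objects such as libgomp.so.1 as well.
        is_library = name.endswith(LIBRARY_SUFFIXES) or ".so." in name
        if is_library and name.startswith(OPENMP_PREFIXES):
            hits.append(path)
    return sorted(hits)
```

Running `find_openmp_copies(sys.prefix)` inside the affected conda environment and seeing the same runtime under two different directories would be a hint worth following up, though not proof that both copies are loaded.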

 

Thanks,

Fengrui
