Intel® oneAPI Math Kernel Library

Bug in sparse libraries in (Fortran) MKL 2025.2 and 2025.3 versions

yxddyxzh
Novice

I have found several bugs, or at least behavior changes, in the (Fortran) MKL 2025.2 and 2025.3 versions:

1 MKL_CREATE_COO

The 2025.1 version sums entries that share the same index, while 2025.2 and 2025.3 do not.

2 MKL_CONVERT_CSR

It works with small arrays, but with large arrays it crashes without warning (sometimes with an access violation).

3 MKL_EXPORT_CSR

With large arrays, it exports the first several row indices correctly, but the results after that are wrong.

I feel that these newer versions are not suitable for large-scale scientific computations.

1 Solution
noffermans
Employee

Dear yxddyxzh,

 

Thank you for bringing this to our attention. It is indeed an issue with the mkl_sparse_convert_csr API, which fails to sort or reduce duplicate entries in the 2025.2 and 2025.3 releases. The issue will be fixed in the 2026.0 release. In the meantime, we suggest two possible solutions:

  •  Keep using the same API from the 2025.1 release or before.
  • You may want to write your own summation routine for duplicate entries following the process below:
    1. Convert, order and export the CSR matrix
      Use mkl_sparse_convert_csr to convert the COO matrix to CSR format as you do now. Use mkl_sparse_order to sort the CSR matrix. Then export the resulting arrays using mkl_sparse_?_export_csr. The exported arrays will be in sorted CSR format but may contain duplicate entries.
    2. Preprocess row pointers
      Allocate ia_new with length nrows + 1. For each row, count the unique column indices using the ia and ja arrays (this step can be parallelized). Store the count for row k in ia_new[k+1], set ia_new[0] = index (the indexing base returned by mkl_sparse_?_export_csr), and compute the inclusive prefix sum of ia_new. The total number of unique nonzeros is then nnz_new = ia_new[nrows] - index.
    3. Allocate output arrays
      Allocate ja_new and a_new with length nnz_new.
    4. Fill and accumulate values
      Using ia_new as row pointers, populate ja_new and a_new (potentially in parallel). For each row, accumulate values corresponding to duplicate column indices—since the entries are sorted within a row, duplicates can be efficiently detected and combined.
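
Steps 2 to 4 above can be sketched in C roughly as follows. This is only a minimal serial illustration, not an MKL routine: the name csr_sum_duplicates is hypothetical, plain int stands in for MKL_INT, and the inputs are assumed to be the sorted CSR arrays (ia, ja, a) obtained in step 1, with base equal to the exported index (0 or 1).

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper (not part of MKL): sums duplicate column entries of a
 * CSR matrix whose rows are already sorted by column index.
 * ia has nrows + 1 entries; base is the indexing base (0 or 1).
 * The output arrays are allocated here and must be freed by the caller.
 * Returns 0 on success, -1 if an allocation fails. */
int csr_sum_duplicates(int nrows, int base,
                       const int *ia, const int *ja, const double *a,
                       int **ia_out, int **ja_out, double **a_out)
{
    assert(nrows >= 0 && (base == 0 || base == 1));

    int *ia_new = malloc((size_t)(nrows + 1) * sizeof *ia_new);
    if (!ia_new) return -1;

    /* Step 2: store the number of unique columns of row k in ia_new[k+1]. */
    ia_new[0] = base;
    for (int k = 0; k < nrows; ++k) {
        int first = ia[k] - base, last = ia[k + 1] - base, count = 0;
        for (int p = first; p < last; ++p)
            if (p == first || ja[p] != ja[p - 1]) ++count;
        ia_new[k + 1] = count;
    }
    /* Inclusive prefix sum turns the counts into row pointers. */
    for (int k = 0; k < nrows; ++k) ia_new[k + 1] += ia_new[k];
    int nnz_new = ia_new[nrows] - base;

    /* Step 3: allocate the output arrays (at least one element each). */
    size_t len = (size_t)(nnz_new > 0 ? nnz_new : 1);
    int *ja_new = malloc(len * sizeof *ja_new);
    double *a_new = malloc(len * sizeof *a_new);
    if (!ja_new || !a_new) {
        free(ia_new); free(ja_new); free(a_new);
        return -1;
    }

    /* Step 4: copy the first entry of each column, accumulate the repeats. */
    for (int k = 0; k < nrows; ++k) {
        int first = ia[k] - base, last = ia[k + 1] - base;
        int q = ia_new[k] - base - 1;      /* last slot written in this row */
        for (int p = first; p < last; ++p) {
            if (p == first || ja[p] != ja[p - 1]) {
                ++q;
                ja_new[q] = ja[p];
                a_new[q] = a[p];
            } else {
                a_new[q] += a[p];          /* duplicate column: sum values */
            }
        }
    }

    *ia_out = ia_new; *ja_out = ja_new; *a_out = a_new;
    return 0;
}
```

Each row only reads its own slice of ia, ja and a and writes its own slice of the output, so the two row loops could in principle be parallelized as the steps suggest; the prefix sum is left serial here for simplicity.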

 

For reference, here is an earlier forum post with a Fortran implementation of a summation routine for duplicate COO entries (that version is serial but functional): https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/About-COO-with-duplicate-entries-and-MKL-sparse-export-csr/m-p/1160741

 

Best,
Nicolas

6 Replies
yxddyxzh
Novice

Anyone who wants to investigate this issue further can refer to the following code:

With MKL 2025.1, it works well.

With MKL 2025.3, it crashes at line 404 because of wrong output from

call mkl_to_csr(md%a_mkl,nsys,nnzsys,vsys,isys,jsys)

at line 400.

sc_Intel
Employee

Hello,

 

Thank you for sharing the source code. 

 

While reviewing the test code for the mkl_to_csr subroutine, I noticed two areas that may require further attention:

  • Uninitialized Variable:

In line 529 of DCFI_COR.f90, the line !index = SPARSE_INDEX_BASE_ONE is commented out. This leaves the index variable uninitialized, which could lead to undefined behavior during runtime.

  • Potential Overlapping Memory Writes:

Lines 538–539 appear to perform assignments that may result in overlapping writes:

i_csr (1:n_csr) = isys_1

i_csr (2:n_csr+1) = isys_2

isys_1 and isys_2 both point to arrays of size n_csr, and the second assignment writes to i_csr(2:n_csr+1). Indices 2 through n_csr overlap, so the values written there by the first assignment are overwritten. You could check whether the following works for you:

i_csr(1:n_csr) = isys_1(1:n_csr)

i_csr(n_csr+1) = isys_2(n_csr)
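
For readers following along in C, the same merge of the two exported row-pointer arrays into one array of length n + 1 might look like the sketch below. The helper name build_row_pointers is hypothetical, plain int stands in for MKL_INT, and the code assumes the exported rows are stored contiguously (rows_end[k] equals rows_start[k+1]), which the assert checks rather than takes for granted.

```c
#include <assert.h>

/* Hypothetical helper (not part of MKL): builds the 3-array CSR row pointer
 * ia (length n + 1) from the rows_start / rows_end arrays (length n each). */
void build_row_pointers(int n, const int *rows_start, const int *rows_end,
                        int *ia)
{
    assert(n > 0);
    for (int k = 0; k < n; ++k) {
        /* Assumption: row k ends exactly where row k + 1 starts. */
        if (k + 1 < n) assert(rows_end[k] == rows_start[k + 1]);
        ia[k] = rows_start[k];
    }
    ia[n] = rows_end[n - 1];   /* only the last end pointer is needed */
}
```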

 

Thanks!

 

yxddyxzh
Novice

Thanks for your quick reply.

 

The program should be run with the following command-line arguments: -f mod1

I tried your suggestions, but the problem persists.

The uninitialized variable suggestion: index is an output argument of mkl_sparse_?_export_csr; it is also left uninitialized in the official examples.

The potential overlapping memory writes suggestion: I tried this, yet the problem remains.

sc_Intel
Employee

Hello @yxddyxzh, Thank you for reporting this issue. I’ve been able to reproduce the crash on my end and confirmed that it occurs within the PARDISO routine. We've logged the issue internally and will keep you updated as we make progress.
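
Since the crash shows up inside PARDISO, one way to narrow it down is to check the CSR arrays before they reach the solver. The sketch below is a hypothetical diagnostic, not an MKL routine: it relies on the PARDISO requirement that column indices within each row be in increasing order, treats repeated columns as a violation as well, uses plain int in place of MKL_INT, and takes base as the indexing base (0 or 1).

```c
#include <assert.h>

/* Hypothetical diagnostic (not part of MKL): returns the 0-based index of the
 * first row whose column indices are not strictly increasing (unsorted or
 * duplicated), or whose row pointers decrease. Returns -1 if every row is
 * sorted and free of duplicates. */
int csr_first_bad_row(int nrows, int base, const int *ia, const int *ja)
{
    assert(nrows >= 0 && (base == 0 || base == 1));
    for (int k = 0; k < nrows; ++k) {
        if (ia[k + 1] < ia[k]) return k;          /* row pointers decrease */
        for (int p = ia[k] - base + 1; p < ia[k + 1] - base; ++p)
            if (ja[p] <= ja[p - 1]) return k;     /* unsorted or duplicate */
    }
    return -1;
}
```

A return value other than -1 on the exported arrays would point at the conversion step rather than at the solver call itself.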

yxddyxzh
Novice

Thanks a lot for carefully debugging my code! I have been using the MKL libraries for several years, and they perform really well for sparse system solutions. I appreciate your long-term dedication.
