Intel® oneAPI Math Kernel Library
Ask questions and share information with other developers who use Intel® Math Kernel Library.

2025.1-PARDISO: Sig11 in numerical factorization

rbunger
New Contributor I

Hi,

 

Unfortunately, I do not have an MWE showing the problem. In some cases I see a segmentation fault (Sig11) in the numerical factorization of PARDISO 2025.1. It does not always crash. I do not see the problem with PARDISO 2024.2.1. The matrix type is complex unsymmetric. I know the description is not specific enough...

 

Thanks,

Rainer

Fengrui
Moderator

Hi Rainer, 

 

It will also be helpful if you could share the matrix that was used in the segmentation fault case.

 

Thanks,

Fengrui

rbunger
New Contributor I

Hi Fengrui,

 

Thanks for your interest. So far I can say that it happens with very large matrices, so I don't think I can upload them. Also, the problem is not reproducible. It does not depend on the compiler I use for the build (GCC or icpx). I am keeping the problem under observation and will post an update when I can narrow it down a bit further...

 

Thanks

 

Rainer

c_sim
Employee

Hi Rainer, 

 

Thank you for posting.

Are you working with the LP64 or the ILP64 interface? If you are using LP64, could you try the ILP64 interface? In version 2025.1 there was a small change that might slightly increase the number of nonzeros in the LU factors. Since your matrices are large, there might be an integer overflow with LP64.

 

Kind Regards,

Chris
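
To illustrate the LP64 concern raised above: under LP64, MKL_INT is a 32-bit signed integer, so any count larger than 2,147,483,647 cannot be represented. The following is only a minimal sketch in plain C (no MKL calls); the helper name fits_in_lp64_int is hypothetical. It could be applied, for example, to the factor nonzero count that PARDISO reports in iparm[17] when that value is obtained from an ILP64 or PARDISO_64 run.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if a non-negative count (for example the
   number of nonzeros in the LU factors) fits into a 32-bit signed integer,
   which is the width of MKL_INT under LP64; returns 0 otherwise. */
int fits_in_lp64_int(long long count) {
    assert(count >= 0); /* counts are never negative */
    return count <= (long long)INT32_MAX;
}
```

If the check fails for a given matrix, an LP64 build cannot hold that count, and the ILP64 interface (or PARDISO_64) is needed.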

rbunger
New Contributor I

Hi Chris,

 

I'm using the ILP64 interface.

 

Rainer

c_sim
Employee

Thank you for the prompt response. In that case, please let us know when you can narrow down the problem a bit more. Ideally, it would be helpful to know the matrix and iparm settings.

rbunger
New Contributor I

Hi,

I have updated my original posting:

community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/2025-1-PARDISO-Sig11-in-numerical-factorization/m-p/1682011#M37056 

The error has just occurred with a complex symmetric matrix in phase=11, iparm[1] = 3, iparm[34] = 1.

 

Thanks

 

Rainer

rbunger
New Contributor I

Hi,

 

I don't know if I just mixed it up the first time when I reported the bug or if the Sig11 just occurs at different stages: I now see the Sig11 with a complex symmetric matrix in the "Analysis" step ("phase" = 11). I also have a complex unsymmetrical matrix to solve but the error just occurred while processing the symmetric one.  Also, I call PARDISO_64 instead of PARDISO.

 

Thanks

 

Rainer

 

"iparms" & "phase", apart from the default settings, are as follows (C-indexing):

iparm[1] = 3;
iparm[34] = 1;
MKL_INT64 phase = 11;
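
For context on the settings quoted above: in the C-indexed iparm array, iparm[1] = 3 selects the parallel (OpenMP) nested dissection ordering, and iparm[34] = 1 tells PARDISO that ia and ja use zero-based indexing. A mismatch between iparm[34] and the actual index base of the arrays is one possible source of invalid memory accesses; nothing in this thread establishes that this is the cause here. The sketch below, with the hypothetical name index_base_matches, only checks that the first row pointer agrees with the selected base.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if ia[0] agrees with the index base
   selected by iparm[34] (0 -> one-based arrays, 1 -> zero-based arrays),
   and 0 otherwise. Both arrays use 64-bit integers, as with PARDISO_64. */
int index_base_matches(const long long *iparm, const long long *ia) {
    assert(iparm != NULL && ia != NULL);
    long long expected_base = (iparm[34] == 1) ? 0 : 1;
    return ia[0] == expected_base;
}
```

Calling such a check right before phase 11 is cheap, since it inspects a single array element.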
Shiquan_Su
Moderator

Hi, Rainer:

We need a code example to further investigate the issue. I came up with the following test code template and compilation command. Would you please replace/fill in your case and show your test result? Thanks.

 

In file pardiso_example.c:

 

#include <stdio.h>
#include <stdlib.h>
#include "mkl.h"

int main(void) {
    // PARDISO_64 expects 64-bit integers for every integer argument, even when
    // linking against the LP64 interface, so MKL_INT64 is used instead of MKL_INT.

    // Define matrix A in CSR format (one-based indexing)
    MKL_INT64 n = 5; // Number of equations
    MKL_INT64 ia[6] = {1, 4, 6, 8, 9, 10}; // Row pointers
    MKL_INT64 ja[9] = {1, 2, 3, 1, 3, 2, 4, 3, 5}; // Column indices
    double a[9] = {10.0, 2.0, 3.0, 3.0, 4.0, 1.0, 5.0, 2.0, 1.0}; // Non-zero values

    // Right-hand side vector b
    double b[5] = {1.0, 2.0, 3.0, 4.0, 5.0};
    double x[5]; // Solution vector

    // PARDISO control parameters
    MKL_INT64 mtype = 11; // Real unsymmetric matrix
    MKL_INT64 nrhs = 1; // Number of right-hand sides
    void *pt[64] = {0}; // Internal solver memory pointer
    MKL_INT64 iparm[64] = {0}; // Solver control parameters
    MKL_INT64 maxfct = 1, mnum = 1, phase = 13, error = 0, msglvl = 0;

    // Initialize iparm
    iparm[0] = 1; // No solver default
    iparm[1] = 2; // Fill-in reordering from METIS
    iparm[7] = 2; // Max number of iterative refinement steps
    iparm[9] = 13; // Perturb the pivot elements with 1E-13
    iparm[10] = 1; // Use nonsymmetric permutation and scaling MPS
    iparm[17] = -1; // Output: Number of nonzeros in the factor LU
    iparm[18] = -1; // Output: Mflops for LU factorization
    iparm[19] = 0; // Output: Numbers of CG Iterations

    // Call PARDISO_64 (phase 13: analysis, factorization, and solve)
    PARDISO_64(pt, &maxfct, &mnum, &mtype, &phase, &n, a, ia, ja, NULL, &nrhs, iparm, &msglvl, b, x, &error);

    // Check for errors
    if (error != 0) {
        printf("ERROR during PARDISO_64 execution: %lld\n", (long long)error);
        return 1;
    }

    // Output solution
    printf("Solution vector x:\n");
    for (MKL_INT64 i = 0; i < n; i++) {
        printf("%f\n", x[i]);
    }

    // Release internal memory
    phase = -1; // Release memory
    PARDISO_64(pt, &maxfct, &mnum, &mtype, &phase, &n, a, ia, ja, NULL, &nrhs, iparm, &msglvl, b, x, &error);

    return 0;
}

 

compile and link command:

icx -o pardiso_example pardiso_example.c -L${MKLROOT}/lib/intel64 -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lpthread -lm -ldl

rbunger
New Contributor I

Hi Shiquan,

 

many thanks for your efforts. I know this template. In my code I have more or less followed the templates I found. I essentially have a PARDISO interface in my sparse matrix class. The sparse matrix is used in different stages of my code, for different matrices. In some cases, it just crashes inside PARDISO with oneAPI-2025.1.0 (and 2025.1.1). The code uses MPI parallelization and only the bigger cases crash. I can't build with debug options as it would take forever to reach the point where it crashes. All I could say is that it does not crash with 2024.2.1 and older. I don't know how to put that into that template code.

Thanks,

 

Rainer

 

---

My code to do the symbolic factorization (although it might not be very helpful):

PD_mtype = mtype;
PD_nrhs = nrhs;
MKL_Create_CSR_ColsAndRows(); // own code to convert my sparse format into the one required by oneAPI
PD_pt = std::make_unique<MKL_INT64[]>(64);
PD_iparm = std::make_unique<MKL_INT64[]>(64);
MKL_INT PD_mtype_int = PD_mtype;
std::unique_ptr<MKL_INT[]> PD_iparm_int = std::make_unique<MKL_INT[]>(64);
pardisoinit(PD_pt.get(), &PD_mtype_int, PD_iparm_int.get());
for (auto i = 0; i < 64; i++)
     PD_iparm[i] = PD_iparm_int[i];
PD_iparm[1] = 3;
PD_iparm[34] = 1;
T zdum;
MKL_INT64 idum;
MKL_INT64 error = 0;
MKL_INT64 n = I;
MKL_INT64 phase = 11;
PARDISO_64(PD_pt.get(), &PD_maxfct, &PD_mnum, &PD_mtype, &phase, &n, data.get(), first_element.get(), MKL_cols.get(), &idum, &PD_nrhs, PD_iparm.get(),
&PD_msglvl, &zdum, &zdum, &error);
if (error != 0)
    throw std::runtime_error("Error during symbolic factorization");

 

rbunger
New Contributor I

Hi,

for some unknown reason my answer from yesterday got lost...

I have a PARDISO interface in my sparse matrix class. The sparse matrix is used at different stages of the simulation, with different matrices. The code is MPI-parallelized (on Python level). So it is quite difficult for me to fill the template... All I can say is that it worked perfectly well until and including version 2024.2.1. It crashes with 2025.1.0 and 2025.1.1. The problem is not fully reproducible.

My code to perform the symbolic factorization looks as follows:

---

MKL_Create_CSR_ColsAndRows(deallocate_own_format); // converts my sparse format into oneAPI format
PD_pt = std::make_unique<MKL_INT64[]>(64);
PD_iparm = std::make_unique<MKL_INT64[]>(64);
MKL_INT PD_mtype_int = PD_mtype;
std::unique_ptr<MKL_INT[]> PD_iparm_int = std::make_unique<MKL_INT[]>(64);
pardisoinit(PD_pt.get(), &PD_mtype_int, PD_iparm_int.get());
for (auto i = 0; i < 64; i++)
    PD_iparm[i] = PD_iparm_int[i];
PD_iparm[1] = 3;
PD_iparm[34] = 1;
T zdum;
MKL_INT64 idum;
MKL_INT64 error = 0;
MKL_INT64 n = I;
MKL_INT64 phase = 11;
PARDISO_64(PD_pt.get(), &PD_maxfct, &PD_mnum, &PD_mtype, &phase, &n, data.get(), first_element.get(), MKL_cols.get(), &idum, &PD_nrhs, PD_iparm.get(),
&PD_msglvl, &zdum, &zdum, &error);
if (error != 0)
    throw std::runtime_error("Error during symbolic factorization");

---

Thanks for all your efforts and the interest in my problem,

Rainer
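
Since the arrays passed to PARDISO_64 above come from a custom conversion routine, one inexpensive step is to validate the CSR structure right before phase 11. The sketch below is only an illustration in plain C; the function name check_csr and its return codes are hypothetical, and nothing in this thread shows that the input is actually malformed. It assumes the usual PARDISO conventions: column indices sorted in increasing order within each row, and only the upper triangle stored for symmetric matrices. MKL also offers a built-in matrix checker that can be enabled with iparm[26] = 1.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical validator for 3-array CSR input with 64-bit indices.
   base is 0 or 1; upper_only != 0 additionally requires col >= row.
   Returns 0 if the structure looks valid, otherwise:
   1 = ia[0] differs from base, 2 = row pointers decrease,
   3 = column index out of range, 4 = columns not strictly increasing,
   5 = entry below the diagonal although upper_only was requested. */
int check_csr(long long n, const long long *ia, const long long *ja,
              long long base, int upper_only) {
    assert(n >= 0 && ia != NULL && ja != NULL);
    if (ia[0] != base) return 1;
    for (long long i = 0; i < n; i++) {
        if (ia[i + 1] < ia[i]) return 2;
        long long prev = base - 1; /* smaller than any valid column index */
        for (long long k = ia[i] - base; k < ia[i + 1] - base; k++) {
            long long col = ja[k];
            if (col < base || col >= n + base) return 3;
            if (col <= prev) return 4;
            if (upper_only && col < i + base) return 5;
            prev = col;
        }
    }
    return 0;
}
```

The check is linear in the number of nonzeros, so even for multi-gigabyte matrices it should cost far less than the analysis phase itself.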

c_sim
Employee

Hi,

 

Recently we caught a new bug in PARDISO (Re: Pardiso regression in 2025.1.0 - Intel Community); it may be related.

Could you check whether either of the two workarounds mentioned there fixes your problem?

 

Thank you,

Chris

rbunger
New Contributor I

Sorry, I have not been able to check it up to now (other tasks, cluster occupied)...

rbunger
New Contributor I

I haven't touched iparm[23] after calling "pardisoinit", so I assume it is set to use the default factorization. When I start the job with only one thread (OMP_NUM_THREADS=1, which is workaround 2 in the linked thread), the job still crashes with a Sig11. I haven't tried workaround 1, as I don't know exactly how to use the non-standard factorization.

c_sim
Employee

Yes, with one thread, this problem should have been avoided if the issue was the same. Therefore, the issue is different.

Without the matrix, it is not possible to determine the reason. Could you please send us the CSR matrix data structures that you pass as input to the first PARDISO_64 call? Specifically, dump the 'ia', 'ja', and 'a' arrays into separate files. If you have multiple MPI processes running PARDISO simultaneously, name the files according to the MPI process and provide us with the file that is dumped by the process which encounters the error. If you need further assistance in obtaining the files, please feel free to write here.

 

Kind Regards,

Chris
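
The dump requested above could be sketched as follows. This is an assumption-laden illustration, not an Intel-provided utility: the function names dump_array and dump_csr, the file naming scheme, and the rank argument (whatever identifier the application uses for its MPI process) are all hypothetical. The arrays are written as raw binary, which is usually much faster and smaller than text; whoever reads the files then needs to know the integer width (64-bit here) and the element type of 'a'.

```c
#include <assert.h>
#include <stdio.h>
#include <stddef.h>

/* Writes count elements of elem_size bytes to "<prefix>_rank<rank>_<name>.bin".
   Returns 0 on success and -1 on any failure. */
static int dump_array(const char *prefix, int rank, const char *name,
                      const void *data, size_t elem_size, size_t count) {
    char path[512];
    int len = snprintf(path, sizeof path, "%s_rank%d_%s.bin", prefix, rank, name);
    if (len < 0 || (size_t)len >= sizeof path) return -1;
    FILE *f = fopen(path, "wb"); /* overwrites the dump of an earlier call */
    if (f == NULL) return -1;
    size_t written = fwrite(data, elem_size, count, f);
    int close_rc = fclose(f);
    return (written == count && close_rc == 0) ? 0 : -1;
}

/* Dumps the three CSR arrays of an n-row matrix into separate files.
   nnz is derived from the row pointers, so zero- and one-based input both work.
   a_elem_size is the size of one value, e.g. 16 bytes for double complex. */
int dump_csr(const char *prefix, int rank, long long n,
             const long long *ia, const long long *ja,
             const void *a, size_t a_elem_size) {
    assert(prefix != NULL && n >= 0 && ia != NULL);
    size_t nnz = (size_t)(ia[n] - ia[0]);
    if (dump_array(prefix, rank, "ia", ia, sizeof *ia, (size_t)n + 1) != 0) return -1;
    if (dump_array(prefix, rank, "ja", ja, sizeof *ja, nnz) != 0) return -1;
    if (dump_array(prefix, rank, "a", a, a_elem_size, nnz) != 0) return -1;
    return 0;
}
```

Because each call overwrites the files of the previous call for the same rank, calling dump_csr immediately before every PARDISO_64 call leaves only the most recent matrix on disk, which after a crash is the one that was being processed.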

rbunger
New Contributor I

Yes I understand...

The problem is that the crash occurs sporadically. Yesterday I started these jobs in parallel with my regular work in order to reproduce the problem. At first I thought that restricting it to one thread really solved the problem, but after about ten starts I saw a crash. And this crash occurs after multiple calls to PARDISO. Additionally, these files will be very big... I will see what I can do...

 

Aleksandra_K
Moderator

Hi, we have not heard from you for a while. Could you let us know whether the issue is still relevant to you and whether you are going to send the matrix?


Regards,

Alex


rbunger
New Contributor I

Hi,

 

Many thanks for your interest in the topic, and sorry for not coming back to it earlier. Other topics had higher priority... I came back to this PARDISO topic just yesterday. After switching to oneAPI 2025.3, I immediately encountered a PARDISO crash. As mentioned earlier, 2024.2.1 is rock-solid in contrast. I already have the option in the code to write "a", "ia" and "ja" txt files. The "a" file is at least 13 GB for a case which sometimes crashes (the "ja" file is of similar size, naturally). It takes forever to write these files. The problem is not exactly reproducible, but I hope to have matrix data of a critical (crashing) case available this week at the latest. The problem is that I have multiple calls to PARDISO before seeing a crash, and I need to write the matrix before calling PARDISO. So I need to write tons of data before seeing a crash...

 

Can I upload files larger than 10 GB?

 

Thanks & kind regards

 

Rainer  

rbunger
New Contributor I

Hi,

 

I have restarted my test suite in order to collect matrix data (in case PARDISO crashes). The test suite is still running, but so far all crashes have happened with symmetric matrices (mtype=6, although I also have many big mtype=13 cases). Typically all these cases are big, but I just saw a Sig11 with a small symmetric "real" matrix (still mtype=6 but imag. part=0, special case).

 

As mentioned, the crashes are not reproducible. I'm also not absolutely sure if the crashes appear in symbolic or numerical factorization (output buffering).

 

I will upload the small "real" case. All other cases are complex and at least 50 MB in size, after compression. Cases which crash quite frequently have several GB.

 

From my observation, very big cases have a high probability to crash whereas small cases basically never crash. The uploaded case has never crashed before, so I think it is not a good candidate for inspection. But it is the only one I can upload here.

 

Thanks & kind regards

 

Rainer
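
On the uncertainty mentioned above about whether the crash happens in symbolic or numerical factorization: buffered output can be lost when a process dies with a Sig11, so the last visible line may be misleading. One way to narrow this down is to write a marker before and after each PARDISO_64 call and flush it immediately. The following is a minimal sketch; the function name log_phase and the line format are hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: writes one line marking a PARDISO phase boundary and
   flushes the stream right away, so the line has left the process's own
   buffer before a possible crash in the following call.
   Returns 0 on success and -1 on failure. */
int log_phase(FILE *out, int rank, long long phase, const char *stage) {
    assert(out != NULL && stage != NULL);
    if (fprintf(out, "[rank %d] phase %lld: %s\n", rank, phase, stage) < 0) return -1;
    return fflush(out) == 0 ? 0 : -1;
}
```

With an "enter" marker before and a "leave" marker after each call, a log that ends on an "enter" line identifies the phase in which the process died.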

 

 

rbunger
New Contributor I

... and another very small case where a crash is very unlikely (although it has just crashed)...

 

complex symmetric.

 

 
