pardiso_schur.c with MKL_PARDISO, time consumption

Qigeng · ‎12-08-2021

Hi,

I ran into an issue when using the pardiso_schur.c example.

I am using the oneAPI 2021.04 release, and the example is located at:

/opt/intel/oneapi/mkl/latest/examples/c/sparse_directsolvers/source/pardiso_schur.c

I run it on a large matrix A = I (the identity matrix) of size 80000 x 80000, with n_schur = 1400.

The time shown under "Message level information" differs from the time I measure with std::chrono. The code looks like this:

auto t0 = std::chrono::steady_clock::now();
pardiso (pt, &maxfct, &mnum, &mtype, &phase,
         &n, a, ia, ja, perm, &nrhs, iparm, &msglvl, &ddum, &ddum, &error);
if ( error != 0 )
{
     printf ("\nERROR during symbolic factorization: " IFORMAT, error);
     exit (1);
}
auto t1 = std::chrono::steady_clock::now();
std::cout << "t1-t0 = "
          << std::chrono::duration_cast<std::chrono::duration<double>>(t1 - t0).count() * 1e3
          << "ms   , PARDISO(phase11, symbolic factorization)" << std::endl;

and the result is:

Times:

======

Time spent in calculations of symmetric matrix portrait (fulladj): 0.001536 s

Time spent in reordering of the initial matrix (reorder) : 0.000077 s

Time spent in symbolic factorization (symbfct) : 0.006114 s

Time spent in data preparations for factorization (parlist) : 0.000451 s

Time spent in allocation of internal data structures (malloc) : 0.003068 s

Time spent in additional calculations : 0.002148 s

Total time spent : 0.013394 s

t1-t0 = 34.072ms , PARDISO(phase11, symbolic factorization)

So "t1-t0" (34.072 ms) is roughly 2.5x the reported "Total time spent" (13.394 ms), which leaves about 20.7 ms that the report does not account for.
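To make the gap explicit, here is a small arithmetic sketch. It only re-adds the itemized timings from the report above and subtracts them from a wall-clock value. The function names are hypothetical helpers for illustration, not part of MKL.

```cpp
#include <numeric>
#include <vector>

// Itemized phase 11 timings copied from the PARDISO report above, in seconds:
// fulladj, reorder, symbfct, parlist, malloc, additional calculations.
inline std::vector<double> reported_breakdown_s() {
    return {0.001536, 0.000077, 0.006114, 0.000451, 0.003068, 0.002148};
}

// Sum of the itemized timings, converted to milliseconds.
inline double reported_total_ms() {
    const std::vector<double> parts = reported_breakdown_s();
    return std::accumulate(parts.begin(), parts.end(), 0.0) * 1e3;
}

// Wall-clock time (in ms) that the report does not account for.
inline double unaccounted_ms(double wall_ms) {
    return wall_ms - reported_total_ms();
}
```

Calling reported_total_ms() reproduces the 13.394 ms "Total time spent", so the itemized entries are consistent with the total, and unaccounted_ms(34.072) gives the roughly 20.7 ms that is spent somewhere outside the reported steps.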

@Kirill_V_Intel

I am an Intel employee, so feel free to ping me directly if needed.

Thanks so much!

Qigeng · ‎12-08-2021

I tried calling phase 11 four times in a loop, and found the following:

1. For the matrix A = I (identity matrix) of size 80000 x 80000 with n_schur = 1400: on the first call to phase 11, "t1-t0" is about 2x "Total time spent". On the remaining calls, "t1-t0" and "Total time spent" are equal.

2. However, for other matrices that are more complex than the identity matrix (that is, with nnz greater than 80000), "t1-t0" is about 2x "Total time spent" on every iteration of the loop.
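The loop measurement can be sketched roughly as follows. This is not the code I actually ran: time_repeated_ms is a hypothetical helper, and stand_in_workload is a placeholder for the phase 11 pardiso() call so that the sketch compiles without MKL.

```cpp
#include <chrono>
#include <cstddef>
#include <vector>

// Time `runs` consecutive invocations of `call` and return each wall time in ms.
// Keeping the per-call values separate makes a slow first call easy to spot.
template <typename F>
std::vector<double> time_repeated_ms(F&& call, std::size_t runs) {
    std::vector<double> ms;
    ms.reserve(runs);
    for (std::size_t i = 0; i < runs; ++i) {
        const auto t0 = std::chrono::steady_clock::now();
        call();
        const auto t1 = std::chrono::steady_clock::now();
        ms.push_back(std::chrono::duration<double, std::milli>(t1 - t0).count());
    }
    return ms;
}

// Hypothetical stand-in for the phase 11 pardiso() call; replace it with a
// lambda that invokes pardiso() and checks `error` in the real experiment.
inline void stand_in_workload() {
    volatile double acc = 0.0;
    for (int i = 0; i < 100000; ++i) {
        acc = acc + i * 0.5;
    }
}
```

With time_repeated_ms(stand_in_workload, 4), the first element is the first-call time and the other three are the repeat calls, which can then be compared one by one with the "Total time spent" printed for each call.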

My guess is that the PARDISO function has some kind of caching logic: on the first phase 11 call the structure of the matrix is saved, so the subsequent calls take less time. For a large matrix, however, there may not be enough space to save it, so the structure of the matrix has to be rebuilt on every call.

This is only a guess. If it is not correct, please let me know.

Thanks

Gennady_F_Intel · ‎12-09-2021

Zhang,

It could be some kind of overhead problem. We have reproduced the problem and will investigate the cause of this behavior. This thread will be updated with the results of the investigation.

-Gennady

Qigeng · ‎12-09-2021

Got it!

Thanks.