PARDISO with parallel MKL and TBB on large systems

Goldis__John · 01-26-2018

Hi,

I'm solving a large sparse system with PARDISO. I'm using the parallel MKL library with TBB enabled on a 64-bit Windows system. My A matrix is 57k x 57k and my right-hand-side matrix B is 57k x 45k.
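For a sense of scale, here is a rough sketch of how much memory the dense right-hand side needs by itself. It assumes that "57k" and "45k" mean exactly 57000 and 45000, that the values are double precision, and that B is stored densely. The helper names `dense_bytes` and `bytes_to_gb` are made up for this illustration and are not part of MKL.

```c
#include <stdint.h>

/* Bytes needed for a dense rows x cols matrix of doubles.
 * Assumption: dense storage, one double per entry. */
static uint64_t dense_bytes(uint64_t rows, uint64_t cols)
{
    return rows * cols * (uint64_t)sizeof(double);
}

/* Convert a byte count to decimal gigabytes (1 GB = 1e9 bytes). */
static double bytes_to_gb(uint64_t bytes)
{
    return (double)bytes / 1e9;
}
```

With these assumptions, `bytes_to_gb(dense_bytes(57000, 45000))` comes to about 20.5 GB for B alone, and roughly twice that if the solution X is kept in a separate array of the same shape.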

When I run the code, the solve phase fails with error = -2. If I switch to the sequential version of MKL, peak memory use is about 80 GB but the solve completes. In parallel mode, I get the output with the statistics of the parallel direct factorization, and the error happens almost immediately afterwards, so it doesn't seem to be actually attempting the forward/backward substitution.
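For reference, the PARDISO documentation lists -2 as "not enough memory". Below is a small sketch of a lookup for the first few documented codes. The function name `pardiso_error_text` is hypothetical (MKL does not provide it), and only codes 0 through -4 are covered here; anything else falls through to a generic message.

```c
/* Map a PARDISO 'error' output value to a short description.
 * Hypothetical helper: covers only codes 0 to -4 from the
 * documentation; other codes get a generic message. */
static const char *pardiso_error_text(int error)
{
    switch (error) {
    case 0:  return "no error";
    case -1: return "input inconsistent";
    case -2: return "not enough memory";
    case -3: return "reordering problem";
    case -4: return "zero pivot, numerical factorization or "
                    "iterative refinement problem";
    default: return "other error code, see the PARDISO documentation";
    }
}
```

Printing `pardiso_error_text(error)` next to the raw value after each phase makes it easier to see which phase ran out of memory.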

It seems that PARDISO is just trying to use up all available memory in parallel mode. I have tried machines with 64 GB, 72 GB and 128 GB of memory, and each time, if I check the value of max(iparm[14], iparm[15] + iparm[16]) before the solve phase (I'm using C indexing), the result is always equal to the remaining free memory on the machine.
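The check described above can be sketched as a small helper. It assumes zero-based (C) indexing, a plain `int` iparm array (as with the LP64 interface), and that the three entries are reported in kilobytes, as the PARDISO documentation states. The name `pardiso_peak_kb` is made up for this illustration.

```c
/* Total peak memory reported by PARDISO, in kilobytes:
 *   max(iparm[14], iparm[15] + iparm[16])   (C indexing)
 * iparm[14]: peak memory during symbolic factorization
 * iparm[15]: permanent memory from symbolic factorization
 * iparm[16]: memory for numerical factorization and solve
 * Assumption: iparm is a plain int array with at least 17 entries. */
static long long pardiso_peak_kb(const int *iparm)
{
    long long symbolic_peak = iparm[14];
    long long factor_solve  = (long long)iparm[15] + (long long)iparm[16];
    return symbolic_peak > factor_solve ? symbolic_peak : factor_solve;
}
```

Dividing the result by 1e6 gives an approximate figure in GB that can be compared against the free memory on the machine.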

Gennady_F_Intel · 01-27-2018

Which version of the library are you using? Is this specific to the TBB-based threading mode, or does it also happen in OpenMP mode?

Goldis__John · 01-29-2018

I'm using the version with build date 08 Nov 2017. I tried with and without TBB and get the same result. The factorization takes about 20 GB of memory and then I immediately get the error. The machine has 140 GB of RAM (it's an AWS c5.18xlarge instance). Here are my solver settings. I've tried iparm[1] set to both 2 and 3, and iparm[24] set to both 0 and 1 (although I probably missed some combinations with TBB on/off).

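iparm[0] = 1;    /* do not use solver defaults; all values supplied */
iparm[1] = 2;    /* fill-in reordering: nested dissection (METIS) */
iparm[3] = 0;    /* no preconditioned CGS/CG iterations */
iparm[4] = 0;    /* no user-supplied permutation */
iparm[5] = 0;    /* write the solution into x */
iparm[7] = 2;    /* max number of iterative refinement steps */
iparm[9] = 8;    /* pivot perturbation of 1e-8 */
iparm[10] = 1;   /* scaling enabled */
iparm[12] = 0;   /* weighted matching disabled */
iparm[13] = 0;   /* output: number of perturbed pivots */
iparm[17] = -1;  /* report number of nonzeros in the factors */
iparm[18] = -1;  /* report Mflops of the factorization */
iparm[19] = 0;   /* output: CG/CGS diagnostics */
iparm[20] = 1;   /* Bunch-Kaufman pivoting (1x1 and 2x2) */
iparm[24] = 1;   /* sequential forward/backward solve */
iparm[34] = 1;   /* zero-based indexing */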

Thanks!