Matrix Inversion and matrix-vector multiplication or solve linear equation for simulation

Shandler__Darren · ‎04-29-2020

Hi,

I have seen multiple times that matrix inversion is not recommended when solving linear equations and everyone says to just use a solver but I may reduce execution time significantly by inverting instead (or will I?).

I have a simulation where either I will calculate the inverse of a sparse symmetric matrix at the beginning and for each time-step calculate the matrix-(new vector) multiplication to solve the system,

or

I could just use the original sparse matrix and solve the linear system at each time step even though the matrix doesn't change.

My matrix-vector has: n ~= 20,000 and simulation has approx 10^7 time steps. So what is the optimum method?

I found pardiso to solve the system unless someone has a better recommendation.

Using MKL 2016.4 on a cluster so I could request more CPUs but my code isn't parallelized.

Thanks for your time

Kirill_V_Intel · ‎04-30-2020

Hello,

Conventional knowledge about the linear solvers says that you don't need to compute the inverse matrixexplicitly but rather can compute the matrix factorization (in trinagular factors ~ LDU and similar, or as QR) and then use the factors for solving the system. And you don't need to do it everytime if your matrix does not change and you want to solve multiple systems (with different right hand sides).

In MKL, as you've found out, you can use sparse direct solver functionality: PARDISO (or Cluster Sparse Solver to use multiple computational nodes) or Sparse QR (if your system is badly conditioned and you want more stability in the solution).

For example, using PARDISO you can factor the matrix once (call phases 11 and 22, or phase 12) and the call phase 33 (solving) for each new rhs. Also, if you happen to know all rhs (I doubt this is your case in the time-dependent problembut just in case), you cancallsolve with multiple rhs (provided as a dense matrix) at once. PARDISO has a lot of advanced features which can improve performance in certain cases.
Based on what MKL threading library you use, PARDISO will work in parallel within one computational server using OpenMP or TBB if you set up the environment correctly.

As a side note, the size of your sistem is tiny. It can be borderline between dense/sparse system solversin terms ofr performance. Is it because you're solving 1D or 2D time-dependent PDE?

Also, we've done a lot of improvements to MKL since 2016, so you might want to consider using a newer version of it.

Best,
Kirill

Shandler__Darren · ‎05-01-2020

Hi Kirill,

Thank you for responding. I wouldn't mind using a newer versions of MKL however, this is the current version on the compute units which I have access to and I don't think IT support would appreciate me trying to upgrade it.

Yes, it is only 2D at this point. The entire simulation will be solving Poisson's equation, along with three coupled PDE's using FEM (they are grouped in one matrix though) at each time step. The PDE's matrix will be changing through the simulation based on changing boundary conditions but Poisson's matrix is the constant one. I am trying to limit the size due to the time steps but once I know how long one step takes, I will increase/decrease mesh spacing.

That is great! Do I need to specify anything else when using phase 11/22/12 in order to save the factorization to reuse later. From having a read of the doc's (at least for 2016.4), I see that I will need iparm[4] = 2, iparm[30] = 0 and iparm[35] = 0 to return the permutation matrix to perm and then when solving 33 for Poisson, input the original perm and set iparm[4] = 1 (or 2 for parallel). I'm not sure if iparm[30] should be 1/2/3 and I think iparm[35] = 0. As I've never used MKL before, I'm a bit confused by a few of the parameters of iparm.

I was also wondering if there is a standard function in MKL to convert dense matrix to CSR3 or if it must be done manually?

(Edit: I have just generated it manually)

Thanks,

Darren

Kirill_V_Intel · ‎05-02-2020

Hi Darren!

If it ever happens to be important, you can always try newer MKL locally, without upgrading the system-wide installation. For example, in the simplect case you can download MKL libraries and just build your application with statically linked MKL and then run it on the cluster.

A side note: if you're solving a Poisson equation on a rectangular grid and the coefficientcs are constant, you can do better than just solving the system with a direct solver, namely you can use FFT-based techniques and MKL provides such functionality for solving the Poisson equaton (see https://software.intel.com/en-us/mkl-developer-reference-c-fast-poisson-solver-routines).

As for the usage of PARDISO, from what I got so far, you don't need to do anything special. I suggest you take a look at the example solverc/pardiso_sym_c.
You will need to do phases 11 and 22 once, before the loop over time steps. Information about the computed factorization will be stored in the opaque handle pt. Then for each time step you can just call phase = 33 for the new righthand side vector and process the solution as you need.
Indeed, the API of PARDISO has a lot of parameters but most of them usually can use the default values (0). So, what you typically do is initialize iparm with 0 and set only those ~10 entries which matter in your case.

Unless you want to force PARDISO use your own permutation (an this would be a not recommended option since it limits capability of PARDISO to get better performance and numerical stability), you don't need to care about it. PARDISO will compute a permutation internally and account for it during the solving phase automatically. So, as in the mentioned example, you can ignore the perm parameter completely.

iparm[30] is for enabling a partial solve when you don't want to use full rhs or compute full solution. I didn't see such a need in your description. So, I believe you should simply put 0 there.

As for the conversion from dense to CSR, there is a function mkl_?dnscsr in the older Sparse BLAS APIs but it is deprecated now and I strongly recommend implementing such a routine on your side. It is pretty straightforward and you can control the tolerance of when an entry in the dense matrix should be treated as zero. There is not much to be gained by using this deprecated routine.
I have a question though: why do you need such a conversion? If you're using a discretization method for th Poisson equation with some locality (e.g., finite elements), typically you should use an assembly process which creates a sparse matrix from the start. Most existing packages for FEM or other methods do that.

Best,
Kirill

Shandler__Darren · ‎05-06-2020

Hi Kirill,

Thanks. I see, I may test the differences at a later date. Unfortunately, my grid is anything but rectangular, the outer geometry is not rectangular and the elements are triangles of varying area. But I will keep that in mind for other problems if I need.

The example did help a lot and I have got PARDISO running for an SPD matrix. I did define the conversion shortly after asking that. I am implementing everything myself in an algorithm. At first I made the matrix from Poisson's equation at its correct size and then I converted to sparse after but I will change it in the future to define the sparse matrix only as memory issues appear when they get bigger. I will just need to figure out how to allocate the correct amount of memory and calculate the correct number of values for the value and column sparse vectors as the row vector is just num_nodes+1.

I have noticed that when I run with iparm[0] = 0, it reverts back to the default even if change values after, eg. iparm[34] = 1 as I am using 0 based indexing. I know this because I get an error saying

*** Error in PARDISO ( reordering_phase) error_num= -180
*** error PARDISO: reordering, symbolic factorization

and then it says "1-based array indexing is on" and I get ERROR during symbolic factorization: -3

So I have to turn iparm[0]=1 and define what I think will be good, mostly defaults but a few non-default.

But at least I now have the algorithm running and not printing any errors.

Thanks,

Darren

Kirill_V_Intel · ‎05-07-2020

Hi Darren,

The behavior of iparm[0] = 0/1 is as documented, it switches on/off default settings (which might not work for your input).

First guess about not having a correct solution is that an error happened in PARDISO calls. Since the functionality is complex, it's a good practice to check error codes returned by PARDISO calls.
Also, you can set msglvl = 1 to get more verbose output from PARDISO which might contain a hint about what happened.

If the error is zero, I suggest you share with us your iparm settingsand a code snippet which show how exactly you call pardiso.

Best,
Kirill

Shandler__Darren1 · ‎05-07-2020

Edited: Removed

Shandler__Darren1 · ‎05-11-2020

Hi Kirill,

Thank you for all your help. I've managed to get everything working well.

Thanks,

Darren