I recently used FGMRES from MKL. Since MKL doesn't support running FGMRES in parallel across multiple processors (only multithreading!), I would like to try my hand at ScaLAPACK, which can solve a linear system of equations directly on multiple processors.
I was studying p?getrs and found the mention of a distributed matrix. Physically this makes sense to me, as I can imagine that the actual matrix will be split into various submatrices, each of which would then be solved in parallel on multiple processors. But some information is missing, or perhaps I missed it:
1. Should the user provide the submatrices to all processors, or will MKL ScaLAPACK do this automatically? If the user should provide them, what is the format, and on what basis must the partitioning be performed? Would it be okay to use a library like ParMETIS to do this? If not, does that mean the code runs sequentially to start with and then broadcasts the submatrices to the respective processors?
2. Also, I expect some communication between processors when the submatrices are solved in parallel. There is no mention of this either.
3. In the examples folder, I was unable to locate code showing the actual calling sequence of the ScaLAPACK subroutines, as there is for other solvers like FGMRES. Is this because it is rather trivial and needs just a factorization call followed by a call to the linear solver?
Hi Amar, thanks for your questions. Many of them would require quite a bit of writeup, so it occurred to me that this must have been written up somewhere. I have done some research for you on past questions/answers pertaining to this specific subject.
First things first, I noticed that there was a forum post a few years ago pertaining to the essentials of ScaLAPACK.
Second, on your question on the distributed matrices, I found some information here.
As for how Intel MKL specifically handles the distributed matrices, I will defer to Ivan, our resident ScaLAPACK expert. I will ask him to take a look at your question and he will respond here ASAP. I hope this is enough to sink your teeth into until then.
Examples of the calling sequence for the ScaLAPACK subroutines p?getrf and p?getrs can be found in the tests folder: __release_lnx/mkl/tests/scalapack/source/LIN. Please take a look at pdludriver.f as an example for the double precision ScaLAPACK LU routines.
All communication in ScaLAPACK is done through BLACS routines, which provide a linear-algebra-oriented message-passing interface.
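On the distributed matrix itself: ScaLAPACK expects each process to already hold its own local piece of the matrix in a 2D block-cyclic layout, so the distribution is the caller's job rather than something the solver does automatically. To make that layout a bit more concrete, here is a minimal pure-Python sketch of the index arithmetic along one dimension. It does not call MKL or MPI at all. The function names are my own for illustration; numroc is meant to mirror the logic of ScaLAPACK's NUMROC tool routine, and I am assuming 0-based indices here, whereas the Fortran routines are 1-based.

```python
def numroc(n, nb, iproc, isrcproc, nprocs):
    """How many of n rows (or columns) land on process iproc.

    n        : global number of rows/columns
    nb       : block size along this dimension
    iproc    : coordinate of the process in this grid dimension
    isrcproc : process coordinate that owns the first block
    nprocs   : number of processes in this grid dimension
    """
    mydist = (nprocs + iproc - isrcproc) % nprocs  # distance from the source process
    nblocks = n // nb                              # number of full blocks
    count = (nblocks // nprocs) * nb               # full rounds every process gets
    extra = nblocks % nprocs                       # leftover full blocks
    if mydist < extra:
        count += nb                                # one more full block
    elif mydist == extra:
        count += n % nb                            # the trailing partial block, if any
    return count


def owner(g, nb, isrcproc, nprocs):
    """Process coordinate that owns global index g (0-based)."""
    return (isrcproc + g // nb) % nprocs


def local_index(g, nb, nprocs):
    """Position of global index g (0-based) inside its owner's local storage."""
    return (g // (nb * nprocs)) * nb + g % nb


if __name__ == "__main__":
    # Hypothetical setup: 10 x 10 matrix, 2 x 2 blocks, 2 x 3 process grid.
    n, nb, nprow, npcol = 10, 2, 2, 3
    for pr in range(nprow):
        for pc in range(npcol):
            rows = numroc(n, nb, pr, 0, nprow)
            cols = numroc(n, nb, pc, 0, npcol)
            print(f"process ({pr},{pc}) holds a {rows} x {cols} local block")
```

Applying the same arithmetic independently to rows and columns gives the 2D layout. Each process would fill a local array of that size with its own entries before calling p?getrf and p?getrs; the test driver mentioned above is the place to see how the real descriptors are set up.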
All the best