Intel® oneAPI Math Kernel Library
Ask questions and share information with other developers who use Intel® Math Kernel Library.

## Fast Small Dense Matrix Solver
171 Views

I have a general square dense matrix A (not symmetric), formed as $A = P^T B P$, where B is held in a compressed storage scheme and P is a rectangular matrix. The size of A ranges from 10x10 to 500x500, while B is sparse and can be as large as 150,000x150,000.

What would be the best way to solve for x, given b, in the system of linear equations

$$Ax = b \;\Rightarrow\; x = A^{-1}b$$

Right now I am just using LAPACK DGESV linked against MKL (so assume I am using their solver). Is there any benefit to going to an iterative solver, or are there any recommendations on how to solve this system of equations as fast as possible?
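For reference, DGESV solves the system by LU factorization with partial pivoting followed by triangular solves, at a cost of roughly (2/3)n^3 floating-point operations. The sketch below shows the same row operations in plain Python, written as Gaussian elimination on the augmented matrix, purely to illustrate the technique. The function name `solve_partial_pivot` and the 3x3 test system are made up for this example, and this is not how MKL implements the routine.

```python
def solve_partial_pivot(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting.

    Illustrative only: a real application would call an optimized
    LAPACK routine such as DGESV instead.
    """
    n = len(A)
    # Work on an augmented copy [A | b] so the caller's data is untouched.
    M = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(A, b)]

    for k in range(n):
        # Partial pivoting: bring the largest |entry| of column k to row k.
        p = max(range(k, n), key=lambda i: abs(M[i][k]))
        if M[p][k] == 0.0:
            raise ValueError("matrix is singular")
        M[k], M[p] = M[p], M[k]
        # Eliminate column k from all rows below the pivot row.
        for i in range(k + 1, n):
            f = M[i][k] / M[k][k]
            for j in range(k, n + 1):
                M[i][j] -= f * M[k][j]

    # Back substitution on the resulting upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


# Hypothetical nonsymmetric test system with exact solution (2, 3, -1).
A = [[2.0, 1.0, -1.0], [-3.0, -1.0, 2.0], [-2.0, 1.0, 2.0]]
b = [8.0, -11.0, -3.0]
x = solve_partial_pivot(A, b)
```

For n = 500 the (2/3)n^3 estimate comes to about 8.3e7 operations, which gives a sense of how small the dense solve is compared with the size of B.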

Thanks for any comments

4 Replies

Scott, I have a generic question.

>>...The size of A ... 500x500, where B can be 150,000x150,000...

How long does it take to solve it on your computer? Thanks in advance.

Note: I see that there are two threads already, one in the MKL forum and another in the Intel Visual Fortran forum...

After I posted on the Intel Fortran forum, someone suggested that I post my question here, since I am using the MKL library for the LAPACK routines.

It only takes a few seconds, but each solution of A creates a new version of B, which is then multiplied by P to build a new version of A, which in turn needs a new solution. I would like to speed up solving the system of equations, even by a fraction of a second. There is also, of course, a slowdown due to forming $A = P^T B P$, but I am unsure whether anything is faster than using DGEMM.

It is a particular program where time is important, even a few extra milliseconds.
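To put the two costs mentioned above side by side, here is a rough operation count for one iteration. It is only a back-of-the-envelope sketch: the sizes n and k come from the post, but the assumption that P is dense and the figure of 7 nonzeros per row of B are guesses made for illustration, not details stated in the thread.

```python
n = 150_000        # order of B (from the post)
k = 500            # largest order of A, taken as the column count of P
nnz_per_row = 7    # ASSUMPTION: nonzeros per row of sparse B (hypothetical)

# Step 1: Y = B * P with B sparse: one multiply-add per nonzero per column.
flops_sparse_BP = 2 * nnz_per_row * n * k

# Step 2: A = P^T * Y, a dense (k x n) times (n x k) product.
flops_PtY = 2 * n * k * k

# Step 3: dense LU-based solve of the k x k system, about (2/3) k^3.
flops_solve = 2 * k**3 // 3

# For comparison: Y = B * P if B were treated as a full dense matrix.
flops_dense_BP = 2 * n * n * k

print(f"sparse B*P : {flops_sparse_BP:.3e}")
print(f"P^T*(B*P)  : {flops_PtY:.3e}")
print(f"LU solve   : {flops_solve:.3e}")
print(f"dense B*P  : {flops_dense_BP:.3e}")
```

Under these assumptions, the dense product P^T (BP) costs several hundred times more than the 500x500 solve, so building A may matter more for the total run time than the choice of solver.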

Thanks for the details!

>>...It only takes a few seconds...

Is that for B when it has dimensions 150000x150000?

Note 1: In single precision, about 84 GB of memory is needed to store B as a full dense matrix.

Note 2: In double precision, about 168 GB of memory is needed.

PS: Of course it is possible if a Cray-like supercomputer is used...
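The 84 GB and 168 GB figures in the notes above follow from storing every entry of a 150,000 x 150,000 matrix, counted in units of 2^30 bytes. The short calculation below reproduces them and, for contrast, estimates the footprint of a diagonal-based storage scheme. The choice of 7 stored diagonals is an assumption made for illustration only.

```python
n = 150_000
GIB = 2**30   # bytes per GiB
MIB = 2**20   # bytes per MiB

# Full dense storage: n * n entries at 4 bytes (single) or 8 bytes (double).
dense_single_gib = n * n * 4 / GIB
dense_double_gib = n * n * 8 / GIB

# ASSUMPTION: B has 7 nonzero diagonals, each stored as a length-n vector.
num_diagonals = 7
banded_double_mib = num_diagonals * n * 8 / MIB

print(f"dense, single precision : {dense_single_gib:.1f} GiB")
print(f"dense, double precision : {dense_double_gib:.1f} GiB")
print(f"banded, double precision: {banded_double_mib:.1f} MiB")
```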

B is formed as a result of finite differences, so it is stored in a band-like structure/vector to minimize storage, and it is then transformed by the pre- and post-multiplication with P. What I will actually post another time is how best to multiply out $P^T B P$.
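One possible way to form the triple product without ever expanding B is to compute Y = BP directly from the stored diagonals and then A = P^T Y as an ordinary dense product. The sketch below illustrates that ordering in plain Python. The storage layout (a dict mapping a diagonal offset d to a list with `vals[i] = B[i][i+d]`), the function names, and the tiny 4x4 example are all hypothetical; the poster's actual storage scheme is not described in the thread. In a real code the second step would be a DGEMM call, and the first would use a sparse or banded routine.

```python
def banded_times_dense(diags, P):
    """Return Y = B @ P, where B is given by its nonzero diagonals.

    diags maps offset d to a list vals with vals[i] = B[i][i + d]
    (entries whose column index falls outside the matrix are ignored).
    """
    n, k = len(P), len(P[0])
    Y = [[0.0] * k for _ in range(n)]
    for d, vals in diags.items():
        # Rows i for which column i + d lies inside the matrix.
        for i in range(max(0, -d), min(n, n - d)):
            b = vals[i]
            src, dst = P[i + d], Y[i]
            for c in range(k):
                dst[c] += b * src[c]
    return Y


def transpose_times(P, Y):
    """Return A = P^T @ Y for dense n x k inputs (the DGEMM step)."""
    n, k = len(P), len(P[0])
    A = [[0.0] * k for _ in range(k)]
    for i in range(n):
        for r in range(k):
            p = P[i][r]
            for c in range(k):
                A[r][c] += p * Y[i][c]
    return A


# Hypothetical nonsymmetric tridiagonal B (2 on the diagonal, -1 above, -3 below).
n = 4
diags = {0: [2.0] * n, 1: [-1.0] * n, -1: [-3.0] * n}
P = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]

Y = banded_times_dense(diags, P)   # n x k, cost proportional to nonzeros * k
A = transpose_times(P, Y)          # k x k
```

The first step costs work proportional to the number of stored entries times the number of columns of P, so B is never needed in dense form.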