
I have a general dense square matrix A (not symmetric) formed as A=P^{T}BP, where B is sparse and held in a compressed storage scheme and P is a rectangular matrix. The size of A ranges from 10x10 to 500x500, while B can be as large as 150,000x150,000.

What would be the best way to solve the following system of linear equations for x, given b?

Ax=b => x=A^{-1}b

Right now I am just using LAPACK DGESV linked against MKL (so assume I am using their solver). Is there any benefit to switching to an iterative solver, or are there any recommendations on how to solve this system of equations as fast as possible?

Thanks for any comments
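For context on what the current approach already does: DGESV does not form A^{-1} explicitly. It factors A by LU decomposition with partial pivoting (row interchanges) and then solves by substitution, which costs roughly 2n^3/3 floating-point operations for an n x n matrix. The following is only a minimal pure-Python sketch of that technique, written as Gaussian elimination on an augmented matrix. The function name `lu_solve` is made up for illustration, and this is not MKL's implementation.

```python
def lu_solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting.

    `a` is a square matrix given as a list of rows, `b` is a list.
    Neither input is modified. Illustrative only, not tuned for speed.
    """
    n = len(a)
    # Work on an augmented copy [a | b] so the inputs stay untouched.
    m = [list(map(float, row)) + [float(rhs)] for row, rhs in zip(a, b)]
    for k in range(n):
        # Partial pivoting: pick the row with the largest |entry| in column k.
        p = max(range(k, n), key=lambda i: abs(m[i][k]))
        if m[p][k] == 0.0:
            raise ValueError("matrix is singular")
        m[k], m[p] = m[p], m[k]
        # Eliminate column k below the pivot row.
        for i in range(k + 1, n):
            f = m[i][k] / m[k][k]
            for j in range(k, n + 1):
                m[i][j] -= f * m[k][j]
    # Back substitution on the resulting upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x
```

For example, `lu_solve([[2, 1], [1, 3]], [3, 5])` solves a small 2x2 system. With A at most 500x500, the factorization is on the order of 10^8 operations or fewer.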



...**A**... **500x500**, where **B** can be **150,000x150,000**...

How long does it take to solve it on your computer? Thanks in advance.

Note: I see that there are two threads already, one is in MKL forum and another is in Intel Visual Fortran forum...


After I posted on the Intel Fortran forum, someone suggested that I post my question here, since I am using the MKL library for the LAPACK routines.

It only takes a few seconds, but each solution of A produces a new version of B, which is then multiplied by P to build a new version of A, which in turn needs a new solution. I would like to speed up solving the system of equations, even by a fraction of a second. There is also, of course, a slowdown due to forming A=P^{T}BP, but I am unsure whether anything is faster than using DGEMM.

This is a program where run time matters, even down to a few extra milliseconds.


...**a few seconds**...

Is it for **B** when it has dimensions **150000x150000**?

Note 1: In the case of single precision, 84GB of memory is needed for **B**.

Note 2: In the case of double precision, 168GB of memory is needed for **B**.

PS: Of course it is possible if a Cray-like supercomputer is used...
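The 84GB and 168GB figures correspond to storing every one of the 150,000 x 150,000 entries of B densely (counting in units of 2^30 bytes). Since the original post says B is sparse and kept in a compressed scheme, the actual footprint depends on how many entries are stored. A small sketch of the arithmetic follows. The count of 7 stored diagonals is purely an assumption for illustration and does not come from the thread.

```python
GIB = 2**30  # bytes per GiB
MIB = 2**20  # bytes per MiB

def dense_gib(n, bytes_per_value):
    """Memory in GiB for a dense n-by-n matrix."""
    return n * n * bytes_per_value / GIB

def diagonals_mib(n, num_diagonals, bytes_per_value):
    """Memory in MiB when only `num_diagonals` diagonals of length n are stored."""
    return n * num_diagonals * bytes_per_value / MIB

n = 150_000
print(f"dense, single precision: {dense_gib(n, 4):.1f} GiB")
print(f"dense, double precision: {dense_gib(n, 8):.1f} GiB")
# Hypothetical: 7 stored diagonals in double precision.
print(f"7 diagonals, double precision: {diagonals_mib(n, 7, 8):.1f} MiB")
```

Under that assumed diagonal count, the stored part of B would be on the order of 8 MiB instead of well over 100 GiB.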


B is formed as a result of finite differences, so it is stored in a band-like structure (a vector) to minimize storage, and it is then transformed by pre- and post-multiplication with P. What I will actually post another time is a question on how best to multiply out P^{T}BP.
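One common way to approach a product like P^{T}BP when B is banded is to do it in two steps: first form BP using only the stored diagonals of B, then form P^{T}(BP) as an ordinary dense product (the step where DGEMM would apply). With n rows, k stored diagonals and m columns in P, the first step costs about 2nkm operations instead of 2n^2m for a dense B. The sketch below is pure Python and only shows the structure. The storage layout (a dict mapping a diagonal offset `d` to a row-indexed list with `diags[d][i] = B[i][i+d]`) and both function names are assumptions for illustration, not the layout used in the original program.

```python
def banded_times_dense(diags, p):
    """Return B @ p, where diags[d][i] holds B[i][i + d] (row-indexed)."""
    n, m = len(p), len(p[0])
    bp = [[0.0] * m for _ in range(n)]
    for d, vals in diags.items():
        # Only rows i for which column i + d lies inside the matrix.
        for i in range(max(0, -d), min(n, n - d)):
            bij = vals[i]
            row_out, row_in = bp[i], p[i + d]
            for c in range(m):
                row_out[c] += bij * row_in[c]
    return bp

def transpose_times(p, bp):
    """Return p^T @ bp as a dense m-by-m matrix (the dense GEMM step)."""
    n, m = len(p), len(p[0])
    a = [[0.0] * m for _ in range(m)]
    for i in range(n):
        for r in range(m):
            pir = p[i][r]
            if pir != 0.0:
                for c in range(m):
                    a[r][c] += pir * bp[i][c]
    return a

# Tiny example: tridiagonal 3x3 B and a 3x2 P.
diags = {
    0: [2.0, 2.0, 2.0],
    1: [-1.0, -1.0, 0.0],   # last entry unused
    -1: [0.0, -1.0, -1.0],  # first entry unused
}
p = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
a = transpose_times(p, banded_times_dense(diags, p))
```

In a compiled setting the second step would be a single dense matrix-matrix call, and the first step would be a loop over diagonals in whatever band format the code already uses.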
