Pardiso Optimization

TD5 · ‎12-11-2018

Hello,

i am currently solving a relatively small system of equations (3000-10000 equations), but since i have to solve it millions of times, each time with slightly different system matrices and right hand sides, it still takes a lot of time overall. So now i am trying to shave a few more ms off. My system matrix is real, not symmetric and contains about five nonzero entries per row. So now i am wondering how to optimize the problem further. I already did the following: Call phase=13 only for the first solve, then swith to phase=23 and reuse pt. This saved about 30% calculation time.

Iparm(24)=10 also saved some time, but unfortunately only in Debug mode.
Iparm(2)=0 shaved a few ms off as well.

I Also have a few questions: Is structurally symmetric == symmetric? I thought structurally symmetric meant that the pattern of nonzero entries is symmetric, which should be the case for my matrix, but using mtype=1 resulted in an access violation in pardiso.

My system matrix does not change very much from iteration to iteration. About half the equations stay the same, i can also predict quite well which equations stay the same. Is there any way to abuse that to speed up the solution?

Also, are there maybe other solvers that may be better suited for my problem?

The function is down below, an example matrix in csr format can be found here: http://s000.tinyupload.com/?file_id=00837828014833851584
Any help is very appreciated! Even the smallest gain in execution speed will help me a great deal.

SUBROUTINE QOED_SOLVE_PARDISO_S(A,ia,ja,pt,schleife, ZUST, GIT, SYS)
    USE MODULE_QOED_SUB

    IMPLICIT NONE
    TYPE(GITTER(GIT%NX,GIT%NY))                                     ,INTENT(IN)             :: GIT
    INTEGER             , DIMENSION(GIT%NX*GIT%NY+1)                ,INTENT(IN)             :: ia
    INTEGER             , DIMENSION(GIT%NX*(GIT%NY-2)*5+2*GIT%NX)   ,INTENT(IN)             :: ja
    REAL(KIND=8)        , DIMENSION(GIT%NX*(GIT%NY-2)*5+2*GIT%NX)   ,INTENT(IN)             :: A
    TYPE(ZUSTAND(GIT%NX,GIT%NY))                                    ,INTENT(INOUT)          :: ZUST
    TYPE(SYSTEMMATRIX(GIT%NX,GIT%NY))                               ,INTENT(IN)             :: SYS
    INTEGER                                                         ,INTENT(IN)             :: schleife
    INTEGER                                                                                 :: maxfct,mnum
    INTEGER                                                                                 :: mtype
    INTEGER                                                                                 :: phase
    INTEGER                                                                                 :: n
    INTEGER                                                                                 :: nrhs
    INTEGER                                                                                 :: error
    INTEGER(KIND=8) , DIMENSION(64)                            , INTENT(INOUT)             :: pt
    INTEGER(KIND=8) , DIMENSION(64)                                                        :: pt_temp!Internal pardiso data storage, initialized to zero, needs to be saved for following pardiso calls or memory leaks can occur
    INTEGER                                                                                 :: msglvl !PARDISO Output, 1-> generate statisitical output
    INTEGER          , DIMENSION(64)                                                        :: IPARM
    INTEGER          , DIMENSION(GIT%NX*GIT%NY)                                             :: PERM

    MSGLVL=2
    PERM=0
    phase=23
    mtype=11
    error=0
    nrhs=1
    mnum=1
    maxfct=1
    mnum=1

    n=GIT%NX*GIT%NY
    iparm=0_8

    CALL PARDISOINIT(pt_temp,mtype,iparm)
    if (schleife == 1) then
        pt=pt_temp
        phase=13
        endif

    CALL PARDISO(pt, maxfct, mnum, mtype, phase, n, a, ia, ja, perm, nrhs, iparm, msglvl, SYS%RHS, ZUST%P0, error)

END SUBROUTINE

The picture is for the first call, subsequent calls need significantly less time for malloc.

Gennady_F_Intel · ‎12-11-2018

Thomas, I am not quite sure understand if the sparsity structure of your input matrix are the same or not? in the case of identical structure - you may try to use maxfct option.

TD5 · ‎12-11-2018

Yes the structure stays the same, just the values of the nonzero entries change slightly. I am not sure how to use maxfct to accelerate the solving, in my understanding maxfct > 1 is only useful for differently structured/dimensioned matrices, or am I mistaken?

Kirill_V_Intel · ‎12-12-2018

Hello Thomas!

There is one trick which you might want to try. If the matrix is changing ever so slightly, you can try to do the factorization only once and then call pardiso with phase 33 only. The idea is that since the matrices are close, the system will be first solved with some nonzero residual which will be decreased efficiently by the iterative refinement. If matrices are close, then iterative refinement will make only a few iterations, each of them is cheap. However, if the matrix gets too far from the factored one, then the # of iterations can start growing. So you could also define some kind of a measure for "how large is the difference"between the current matrix and the factored one and re-do factorization from time to time while processing all your matrices.

I hope this can help.

Thanks,

Kirill