Parallel iterative solver (CG or FGMRES)

bryce155 · ‎11-07-2012

I have an incomplete Cholesky preconditioner and run CG through the RCI interface, and it performs very poorly in parallel. There is only a very small improvement from sequential to parallel mode: it took 80 seconds in parallel and 86 seconds sequentially. I am using an Intel Xeon X5650 at 2.67 GHz. Is this normal for an iterative solver? I used the latest MKL 11. The direct solver (PARDISO) scaled almost linearly.

Thanks,

Bryce

Alexander_K_Intel2 · ‎11-07-2012

Hi, the CG solver is an RCI interface, so it does not by itself affect the performance of the whole algorithm. Are your implementations of the stiffness-matrix multiplication and of the preconditioner parallel or not? With best regards, Alexander Kalinkin

bryce155 · ‎11-07-2012

Hi Alex, Thanks for the prompt response. I used two calls to mkl_dcsrtrsv for the preconditioner solve (RCI_request = 3) and mkl_dcsrsymv for the matrix-vector multiplication. Does that mean those functions don't perform well in parallel? Best regards, Bryce

Gennady_F_Intel · ‎11-08-2012

Bryce, yes, that may be the problem: the Level 2 sparse triangular solver (mkl_dcsrtrsv) is not threaded, whereas the matrix-vector product for a sparse symmetric matrix (mkl_dcsrsymv()) is threaded. --Gennady
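
To illustrate why the two kernels behave so differently, here is a minimal sketch in plain Python. It is not MKL code and says nothing about how MKL implements either routine; the function names `csr_matvec` and `csr_lower_solve` are made up for this example, the CSR arrays use zero-based indexing, and the lower-triangular matrix is assumed to store its diagonal entry in every row. In the matrix-vector product each row reads only the input vector, so rows can be processed in any order. In forward substitution, row i needs the already computed y[j] for j < i, which forces an ordering.

```python
def csr_matvec(n, row_ptr, col_idx, vals, x):
    """Return y = A*x for a CSR matrix (zero-based indices)."""
    y = [0.0] * n
    for i in range(n):  # rows are independent: any order, or in parallel
        s = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            s += vals[k] * x[col_idx[k]]
        y[i] = s
    return y


def csr_lower_solve(n, row_ptr, col_idx, vals, b):
    """Solve L*y = b by forward substitution for lower-triangular CSR L."""
    y = [0.0] * n
    for i in range(n):  # row i reads y[j] for j < i, so the order matters
        s = b[i]
        diag = None
        for k in range(row_ptr[i], row_ptr[i + 1]):
            j = col_idx[k]
            if j == i:
                diag = vals[k]
            else:
                s -= vals[k] * y[j]
        if diag is None or diag == 0.0:
            raise ValueError("missing or zero diagonal in row %d" % i)
        y[i] = s / diag
    return y


# Small example: L = [[2, 0, 0], [1, 3, 0], [0, 4, 5]]
row_ptr = [0, 1, 3, 5]
col_idx = [0, 0, 1, 1, 2]
vals = [2.0, 1.0, 3.0, 4.0, 5.0]
b = [2.0, 7.0, 23.0]

y = csr_lower_solve(3, row_ptr, col_idx, vals, b)
check = csr_matvec(3, row_ptr, col_idx, vals, y)  # should reproduce b
```

An incomplete Cholesky preconditioner applies two such triangular sweeps (one with the factor, one with its transpose) in every CG iteration, so a serial triangular solve can dominate the run time even when the matrix-vector product is threaded.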

bryce155 · ‎11-08-2012

Ok. Just one more question. Do you plan to include any parallel preconditioners for the iterative solvers, such as block Jacobi (or block incomplete Cholesky), multigrid, etc.?
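
For readers unfamiliar with block Jacobi, a minimal sketch in plain Python follows. It is an illustration only, not an MKL routine: `solve_dense` and `block_jacobi_apply` are hypothetical names, the matrix is stored dense for brevity, and each diagonal block is solved with simple Gaussian elimination rather than an incomplete factorization. The point is that the preconditioner ignores coupling between blocks, so every block solve is independent of the others and could be handed to a separate thread.

```python
def solve_dense(a, b):
    """Solve a small dense system a*x = b by Gaussian elimination
    with partial pivoting (inputs are not modified)."""
    n = len(b)
    m = [a[i][:] + [b[i]] for i in range(n)]  # augmented matrix
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = m[i][n] - sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = s / m[i][i]
    return x


def block_jacobi_apply(a, r, block_size):
    """Return z = M^{-1} r, where M keeps only the diagonal blocks of a."""
    n = len(r)
    z = [0.0] * n
    for start in range(0, n, block_size):  # blocks do not depend on each other
        end = min(start + block_size, n)
        block = [a[i][start:end] for i in range(start, end)]
        z[start:end] = solve_dense(block, r[start:end])
    return z


# Example: 4x4 tridiagonal matrix split into two 2x2 diagonal blocks.
A = [[4.0, 1.0, 0.0, 0.0],
     [1.0, 3.0, 1.0, 0.0],
     [0.0, 1.0, 4.0, 1.0],
     [0.0, 0.0, 1.0, 3.0]]
r = [5.0, 4.0, 5.0, 4.0]
z = block_jacobi_apply(A, r, 2)
```

The trade-off is that dropping the off-diagonal coupling usually makes the preconditioner weaker than a global incomplete Cholesky, so more CG iterations may be needed in exchange for the parallelism.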

Gennady_F_Intel · ‎11-08-2012

The only thing I can say is that there are no such plans for the next release of MKL.

bryce155 · ‎11-12-2012

Thank you Fedorov. I am still wondering, since the backward and forward substitutions in PARDISO are already parallel: can we expect the same for the triangular solver in the near future? Thanks,

Gennady_F_Intel · ‎11-12-2012

Hello, there are no such plans for the near future. Gennady

yanpu_z_ · ‎04-18-2013

I also noticed that when linking against the parallel MKL libraries, the forward and backward substitutions (A*x = L*U*x = b <==> L*y = b, U*x = y) take almost the same time as with the sequential libraries. Although the CPU usage is close to 100%, solving A*x = b is not accelerated at all.

I also hope the triangular solver can be parallelized in the near future.

Thanks very much!