Intel® oneAPI Math Kernel Library

Parallel iterative solver (CG or FGMRES)

bryce155
Beginner

I have an incomplete Cholesky preconditioner and run CG through the RCI (reverse communication) interface, and it performs very poorly in parallel. There is only a very small improvement from sequential to parallel mode: 80 seconds in parallel versus 86 seconds sequential. I am using an Intel Xeon X5650 at 2.67 GHz. Is this normal for an iterative solver? I am using the latest MKL 11. By contrast, the direct solver (PARDISO) scaled almost linearly.

Thanks,

Bryce

8 Replies
Alexander_K_Intel2
Hi, CG is provided through an RCI interface, which by itself doesn't affect the performance of the whole algorithm. Are your implementations of the multiplication by the stiffness matrix and of the preconditioner parallel or not? With best regards, Alexander Kalinkin
bryce155
Beginner
Hi Alex, Thanks for the prompt response. I use two calls to mkl_dcsrtrsv in the preconditioner solve (RCI request 3) and mkl_dcsrsymv for the matrix-vector multiplication. Does this mean that those functions don't perform well in parallel? Best regards, Bryce
Gennady_F_Intel
Moderator
Bryce, yes, that may be the problem: the Level 2 sparse triangular solver (mkl_dcsrtrsv) is not threaded, whereas the matrix-vector product for a sparse symmetric matrix (mkl_dcsrsymv) is threaded. --Gennady
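To illustrate why the two kernels behave so differently, here is a minimal plain-Python sketch. It is not the MKL implementation: the helper names `csr_matvec` and `csr_lower_solve` are hypothetical, the storage is 0-based CSR with every nonzero stored explicitly, and the matrix is a made-up 3x3 example. The point is only the dependency structure: in the product each output row is independent, while in the triangular solve row i needs the results of earlier rows.

```python
# Sketch only: plain-Python stand-ins, NOT the MKL routines.
# Assumed storage: 0-based CSR (row_ptr, col_idx, vals), all nonzeros stored.

def csr_matvec(row_ptr, col_idx, vals, x):
    """y = A*x. Each y[i] reads only x, so rows can be processed in any order."""
    n = len(row_ptr) - 1
    y = [0.0] * n
    for i in range(n):  # independent iterations: easy to split across threads
        s = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            s += vals[k] * x[col_idx[k]]
        y[i] = s
    return y

def csr_lower_solve(row_ptr, col_idx, vals, b):
    """Solve L*y = b for a lower-triangular L whose rows include the diagonal."""
    n = len(row_ptr) - 1
    y = [0.0] * n
    for i in range(n):  # row i reads y[j] for j < i: a loop-carried dependency
        s = b[i]
        diag = None
        for k in range(row_ptr[i], row_ptr[i + 1]):
            j = col_idx[k]
            if j == i:
                diag = vals[k]
            else:
                s -= vals[k] * y[j]
        y[i] = s / diag
    return y

# Hypothetical example: L = [[2, 0, 0], [1, 3, 0], [0, 4, 5]]
row_ptr = [0, 1, 3, 5]
col_idx = [0, 0, 1, 1, 2]
vals = [2.0, 1.0, 3.0, 4.0, 5.0]
b = [2.0, 7.0, 23.0]

y = csr_lower_solve(row_ptr, col_idx, vals, b)
print(y)                                       # solution of L*y = b
print(csr_matvec(row_ptr, col_idx, vals, y))   # multiplying back reproduces b
```

If the two triangular solves in the preconditioner dominate the time per CG iteration, threading only the matrix-vector product would give the kind of small gain reported above.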
bryce155
Beginner
OK. Just one more question: do you plan to include any parallel preconditioners for the iterative solvers, such as block Jacobi (or block incomplete Cholesky), multigrid, etc.?
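For readers unfamiliar with block Jacobi, the following is a rough sketch of how such a preconditioner is applied, under stated assumptions: the matrix is held as a small dense list of lists for readability, each diagonal block is symmetric positive definite, and the helper names `solve_dense_spd` and `block_jacobi_apply` are hypothetical. It is not an MKL feature. The preconditioner keeps only the diagonal blocks of A, so each block solve is independent of the others and could be handed to a separate thread.

```python
# Sketch only: dense storage and hypothetical helper names, for illustration.

def solve_dense_spd(A, r):
    """Solve A*z = r for a small dense SPD block via Cholesky (A = L*L^T)."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = s ** 0.5 if i == j else s / L[j][j]
    y = [0.0] * n
    for i in range(n):            # forward substitution: L*y = r
        y[i] = (r[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    z = [0.0] * n
    for i in reversed(range(n)):  # backward substitution: L^T*z = y
        z[i] = (y[i] - sum(L[k][i] * z[k] for k in range(i + 1, n))) / L[i][i]
    return z

def block_jacobi_apply(A, r, block_size):
    """z = M^{-1} r, where M keeps only the diagonal blocks of A."""
    n = len(A)
    z = [0.0] * n
    for start in range(0, n, block_size):  # blocks do not depend on each other
        end = min(start + block_size, n)
        block = [row[start:end] for row in A[start:end]]
        z[start:end] = solve_dense_spd(block, r[start:end])
    return z

# Hypothetical 4x4 SPD matrix, split into two 2x2 diagonal blocks.
A = [[4.0, 2.0, 1.0, 0.0],
     [2.0, 5.0, 0.0, 1.0],
     [1.0, 0.0, 9.0, 3.0],
     [0.0, 1.0, 3.0, 5.0]]
r = [6.0, 7.0, 12.0, 8.0]

z = block_jacobi_apply(A, r, block_size=2)
print(z)
```

The trade-off is that the coupling between blocks is ignored, so CG typically needs more iterations than with a global incomplete Cholesky factor, in exchange for a preconditioner step that parallelizes across blocks.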
Gennady_F_Intel
Moderator
The only thing I can say is that there are no such plans for the nearest release of MKL.
bryce155
Beginner
Thank you, Fedorov. I am still wondering, since the backward and forward substitutions in PARDISO are already parallel: can we expect the same for the triangular solver in the near future? Thanks,
Gennady_F_Intel
Moderator
Hello, there are no such plans for the near future. Gennady
yanpu_z_
Beginner

I also noticed that when linking the parallel MKL libraries, the forward and backward substitutions (A*x = L*U*x = b <==> L*y = b, U*x = y) run at almost the same speed as their sequential versions. Although the CPU usage is close to 100%, solving A*x = b is not accelerated at all.

I also hope the triangular solver can be parallelized in the near future.

Thanks very much!
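One way to estimate how much parallelism a sparse triangular solve could offer at all is to group its rows into dependency levels, a technique usually called level scheduling. The sketch below is an illustration under assumptions, not a description of what MKL or PARDISO does internally: `row_levels` is a hypothetical helper, and the two 3x3 sparsity patterns are made up. Rows in the same level do not depend on each other and could be solved concurrently, while the levels themselves must be processed in order.

```python
# Sketch only: hypothetical helper, 0-based CSR pattern of a lower-triangular L.

def row_levels(row_ptr, col_idx):
    """Return level[i] = length of the longest dependency chain ending at row i."""
    n = len(row_ptr) - 1
    level = [0] * n
    for i in range(n):
        for k in range(row_ptr[i], row_ptr[i + 1]):
            j = col_idx[k]
            if j < i:  # row i needs y[j], so it must come after row j's level
                level[i] = max(level[i], level[j] + 1)
    return level

# Chain pattern: row 1 needs row 0, row 2 needs row 1 (bidiagonal L).
chain = row_levels([0, 1, 3, 5], [0, 0, 1, 1, 2])

# Star pattern: rows 1 and 2 both need only row 0.
star = row_levels([0, 1, 3, 5], [0, 0, 1, 0, 2])

print(chain)  # one row per level: the solve is fully sequential
print(star)   # rows 1 and 2 share a level and could run in parallel
```

When the number of levels is close to the number of rows, as in the chain pattern, extra threads have almost nothing to work on concurrently, which is consistent with seeing high CPU usage but no speedup.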
