The Intel MKL 11 library offers an optimized set of threaded functions, but in the case of the iterative sparse solver (ISS), the preconditioned conjugate gradient method does not seem straightforward to thread.
To be more precise, preconditioning techniques such as incomplete Cholesky factorization or ILU require sparse triangular solves at some point, but the corresponding MKL function for the triangular solve, mkl_cspblas_?csrtrsv, is not threaded.
I'd like to know whether dcg is threaded, and whether there is any workaround to achieve better performance in iterative solvers on multiprocessor systems. Should I expect a threaded ISS in the near future?
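To illustrate why the triangular solve is the hard part to thread, here is a minimal pure-Python sketch (not MKL code; the function names csr_matvec and csr_lower_solve are made up for this illustration, and the 3x3 matrix is an arbitrary example). In the matrix-vector product every row can be computed independently, while in forward substitution row i needs the already computed entries y[j] with j < i.

```python
def csr_matvec(n, row_ptr, col_idx, vals, x):
    """y = A x for a CSR matrix. Each row is independent of the others."""
    y = [0.0] * n
    for i in range(n):  # iterations do not depend on each other
        s = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            s += vals[k] * x[col_idx[k]]
        y[i] = s
    return y


def csr_lower_solve(n, row_ptr, col_idx, vals, b):
    """Solve L y = b by forward substitution.

    L is lower triangular in CSR format with a nonzero diagonal.
    Row i reads y[j] for j < i, so the rows cannot simply be
    distributed across threads the way they can in csr_matvec.
    """
    y = [0.0] * n
    for i in range(n):
        s = b[i]
        diag = None
        for k in range(row_ptr[i], row_ptr[i + 1]):
            j = col_idx[k]
            if j < i:
                s -= vals[k] * y[j]  # depends on earlier rows
            elif j == i:
                diag = vals[k]
        if diag is None:
            raise ValueError("missing diagonal entry in row %d" % i)
        y[i] = s / diag
    return y


# Arbitrary example: L = [[2, 0, 0], [1, 3, 0], [0, 4, 5]] in CSR form.
row_ptr = [0, 1, 3, 5]
col_idx = [0, 0, 1, 1, 2]
vals = [2.0, 1.0, 3.0, 4.0, 5.0]

b = csr_matvec(3, row_ptr, col_idx, vals, [1.0, 2.0, 3.0])
y = csr_lower_solve(3, row_ptr, col_idx, vals, b)
```

Parallel triangular solves usually rely on extra analysis of the sparsity pattern (for example, grouping rows that do not depend on each other), which is why they are not a drop-in threading of this loop.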
dcg is not threaded, and we cannot share information regarding our future plans. To achieve better performance, you can use the sparse matrix-vector multiply (mv) and triangular solver routines from Sparse BLAS Level 2 and Level 3.
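As a rough sketch of how such building blocks fit together, the following pure-Python example writes out a preconditioned CG loop in which the matrix-vector product and the preconditioner application are passed in as callbacks. This is not MKL code: the names pcg, matvec, and sgs_precond are hypothetical, the 3x3 matrix is an arbitrary SPD example, and a symmetric Gauss-Seidel preconditioner (one forward and one backward triangular solve) stands in for incomplete Cholesky or ILU to keep the example short. In a real program the two callbacks would be the places to call the library's mat-vec and triangular solve routines.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def pcg(matvec, apply_precond, b, tol=1e-10, max_iter=100):
    """Preconditioned CG for a symmetric positive definite matrix A.

    matvec(p) returns A p; apply_precond(r) returns M^{-1} r.
    Returns the approximate solution and the number of iterations used.
    """
    x = [0.0] * len(b)
    r = list(b)  # residual b - A x for the initial guess x = 0
    z = apply_precond(r)
    p = list(z)
    rz = dot(r, z)
    b_norm = math.sqrt(dot(b, b)) or 1.0
    for it in range(max_iter):
        if math.sqrt(dot(r, r)) <= tol * b_norm:
            return x, it
        Ap = matvec(p)  # the part that threads well
        alpha = rz / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        z = apply_precond(r)  # the triangular solves happen here
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter


# Arbitrary SPD example matrix, stored densely for brevity.
A = [[4.0, -1.0, 0.0], [-1.0, 4.0, -1.0], [0.0, -1.0, 4.0]]
n = len(A)


def matvec(v):
    return [dot(row, v) for row in A]


def sgs_precond(r):
    """Apply M^{-1} with M = (D + L) D^{-1} (D + L)^T, L = strict lower part of A."""
    y = [0.0] * n
    for i in range(n):  # forward solve (D + L) y = r
        y[i] = (r[i] - sum(A[i][j] * y[j] for j in range(i))) / A[i][i]
    w = [A[i][i] * y[i] for i in range(n)]  # multiply by D
    z = [0.0] * n
    for i in reversed(range(n)):  # backward solve (D + L)^T z = w
        z[i] = (w[i] - sum(A[i][j] * z[j] for j in range(i + 1, n))) / A[i][i]
    return z


b = matvec([1.0, 2.0, 3.0])
x, iters = pcg(matvec, sgs_precond, b)
```

With this structure the mat-vec can be handed to a threaded routine right away, while the preconditioner step remains the sequential bottleneck unless a preconditioner with more parallelism is chosen.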