I have recently been solving large-scale sparse linear systems. To improve efficiency, I tried calling the PARDISO subroutines in parallel to solve different systems (different matrices and right-hand sides). But it seems that threads calling the PARDISO subroutines independently corrupt PARDISO's internal data structures, so the final answers are completely wrong. Can you think of any strategy for using PARDISO correctly in parallel?
I just noticed that your question is still open, so here are some thoughts.
As you may know, PARDISO has been threaded internally in the MKL library since early versions, and the threading is enabled when you link with the mkl_intel_thread.lib library. So for high efficiency on multi-core (I assume you are running the application on a multi-core machine; on a cluster, you may consider a single node with a multi-core CPU), we suggest using PARDISO's internal parallelism directly.
For example, link with the option /Qmkl:parallel if you use the Intel Compiler. You will then see PARDISO running in parallel (usage on all cores close to 100%, so adding external parallelism on top may only cause overhead).
If you prefer to implement your own parallelism on top of PARDISO, linking with /Qmkl:sequential is generally fine (no multi-threading issues). Linking with /Qmkl:parallel together with your own parallel method should be OK too, because:
Intel MKL is thread-safe, which means that all Intel MKL functions work correctly during simultaneous execution by multiple threads. In particular, any chunk of threaded Intel MKL code provides access for multiple threads to the same shared data, while permitting only one thread at any given time to access a shared piece of data. Therefore, you can call Intel MKL from multiple threads and not worry about the function instances interfering with each other.
But as you mentioned, the threads calling the PARDISO subroutines independently seem to interfere with its internal structures, so the final answers are wrong. You may need to take care of all variables (parameters) shared between your threads yourself. For example, give each of your threads its own local copies. If the problem persists, could you please describe how you implemented the threads that call PARDISO and get the wrong result?
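To illustrate the "local variables for each thread" advice, here is a minimal sketch in C with OpenMP. It does not call PARDISO: `solver_state`, `solve_one`, and `solve_all` are hypothetical stand-ins (a tiny dense Gaussian elimination) chosen so the example is self-contained. The assumption is that in real code the per-thread state would be the solver's own working arguments, such as the `pt` handle array and `iparm`, and that each thread would pass its own copies.

```c
#include <assert.h>
#include <string.h>

#define N 3   /* size of each toy system (assumption for this sketch) */

/* Hypothetical stand-in for the state a solver keeps between phases.
   The key point: one instance per thread, never shared. */
typedef struct {
    double lu[N * N];   /* private working copy of the matrix (row-major) */
    int    error;       /* private error flag, 0 on success */
} solver_state;

/* Stand-in solver: Gaussian elimination with partial pivoting.
   It touches only *st and the a, b, x blocks it is given. */
static void solve_one(solver_state *st, const double *a,
                      const double *b, double *x)
{
    double rhs[N];
    memcpy(st->lu, a, sizeof st->lu);
    memcpy(rhs, b, sizeof rhs);
    st->error = 0;

    for (int k = 0; k < N; ++k) {
        /* pick the row with the largest absolute pivot in column k */
        int p = k;
        for (int i = k + 1; i < N; ++i) {
            double vi = st->lu[i * N + k], vp = st->lu[p * N + k];
            if (vi < 0) vi = -vi;
            if (vp < 0) vp = -vp;
            if (vi > vp) p = i;
        }
        if (st->lu[p * N + k] == 0.0) { st->error = -1; return; }
        if (p != k) {
            for (int j = 0; j < N; ++j) {
                double t = st->lu[k * N + j];
                st->lu[k * N + j] = st->lu[p * N + j];
                st->lu[p * N + j] = t;
            }
            double t = rhs[k]; rhs[k] = rhs[p]; rhs[p] = t;
        }
        /* eliminate column k below the pivot */
        for (int i = k + 1; i < N; ++i) {
            double f = st->lu[i * N + k] / st->lu[k * N + k];
            for (int j = k; j < N; ++j)
                st->lu[i * N + j] -= f * st->lu[k * N + j];
            rhs[i] -= f * rhs[k];
        }
    }
    /* back substitution */
    for (int i = N - 1; i >= 0; --i) {
        double s = rhs[i];
        for (int j = i + 1; j < N; ++j)
            s -= st->lu[i * N + j] * x[j];
        x[i] = s / st->lu[i * N + i];
    }
}

/* Solve nsys independent systems stored back to back in a, b, x.
   Returns the number of systems that failed. */
int solve_all(int nsys, const double *a, const double *b, double *x)
{
    int failures = 0;
    assert(nsys >= 0);
    #pragma omp parallel for reduction(+:failures)
    for (int s = 0; s < nsys; ++s) {
        solver_state st;   /* declared inside the loop: private to this iteration */
        solve_one(&st, a + s * N * N, b + s * N, x + s * N);
        if (st.error != 0) failures += 1;
    }
    return failures;
}
```

Each iteration reads and writes only its own slice of `a`, `b`, and `x` and its own `st`, so nothing is shared between threads except the reduction counter. Build with OpenMP enabled (for example `-fopenmp` with GCC); without it the pragma is ignored and the loop simply runs serially. If the wrong answers come from threads sharing one handle or one parameter array, restructuring the calls along these lines may be worth trying.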