Intel® oneAPI Math Kernel Library

Running PARDISO in parallel

gandalf85p
Beginner
When I run PARDISO on a 4-processor node (that is part of a cluster), the solve stage (phase=33) takes the same amount of time as it would on 1 processor. In the code itself, I have these statements:

mkl_set_num_threads(4);

mkl_set_dynamic(false);

When I SSH into that node and run the "top" command, the CPU usage is close to 400%. If I use 1 processor, the CPU usage is close to 100%, but it takes the same amount of time. If I use 2 nodes, the solution is twice as fast. My matrix type is 6, and I'm compiling with these libraries:

-lmkl_intel_lp64 -lmkl_core -lmkl_intel_thread -lguide -lmkl_solver

Why isn't PARDISO any faster with 4 processors? I'm guessing there's some setting or something I've missed. Thanks.

TimP
Honored Contributor III
Particularly if your machine is HyperThreaded, it may be better to let MKL choose the number of threads. Depending on the type of machine, KMP_AFFINITY settings may be useful.
Gennady_F_Intel
Moderator
This is because this stage of the calculation (phase 33) is not threaded.
More precisely, this stage of the solution is threaded only for the case of many right-hand sides.
--Gennady
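
To illustrate the point about many right-hand sides: PARDISO takes `nrhs` right-hand sides in one call, with `b` and `x` each holding `n*nrhs` elements stored one vector after another. Below is a minimal sketch, assuming you currently solve for each vector in a separate phase-33 call and want to batch them instead. `pack_rhs` and `unpack_column` are hypothetical helpers, not MKL functions; `elem_size` would be `sizeof(MKL_Complex16)` for matrix type 6 (complex symmetric). Whether batching actually speeds up your case depends on the MKL version and the problem, so it is worth timing.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of MKL): copy nrhs separate right-hand-side
 * vectors, each with n elements of elem_size bytes, into one contiguous
 * buffer b of n*nrhs elements. Vector k ends up at element offset k*n,
 * which is the layout PARDISO expects when nrhs > 1. */
void pack_rhs(size_t n, size_t nrhs, size_t elem_size,
              const void *const *cols, void *b)
{
    assert(cols != NULL && b != NULL && elem_size > 0);
    for (size_t k = 0; k < nrhs; ++k) {
        memcpy((char *)b + k * n * elem_size, cols[k], n * elem_size);
    }
}

/* Hypothetical helper: copy solution vector k out of the packed array x. */
void unpack_column(size_t n, size_t k, size_t elem_size,
                   const void *x, void *out)
{
    assert(x != NULL && out != NULL && elem_size > 0);
    memcpy(out, (const char *)x + k * n * elem_size, n * elem_size);
}

/* Sketch of the batched solve (requires MKL, shown as a comment only;
 * pt, iparm, a, ia, ja, perm come from your existing phases 11 and 22):
 *
 *   MKL_INT phase = 33, nrhs = 8;   // 8 is an arbitrary example value
 *   pardiso(pt, &maxfct, &mnum, &mtype, &phase, &n, a, ia, ja, perm,
 *           &nrhs, iparm, &msglvl, b, x, &error);
 */
```

After the call, `unpack_column(n, k, elem_size, x, out)` retrieves the solution for the k-th right-hand side.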
Gennady_F_Intel
Moderator
KMP_AFFINITY doesn't significantly affect the performance of the solver. Although every time to consider each particular case.
--Gennady
gandalf85p
Beginner
OK, I see. So KMP_AFFINITY doesn't affect the solve stage? Not sure what you mean by "Although every time to consider each particular case."