gandalf85p
Beginner
76 Views

running pardiso in parallel

When I run PARDISO on a 4-processor node (that is part of a cluster), the solve stage (phase=33) takes the same amount of time as it would on 1 processor. In the code itself, I have these statements:

mkl_set_num_threads(4);

mkl_set_dynamic(false);

When I SSH into that node and run the "top" command, the CPU usage is close to 400%. If I use 1 processor, the CPU usage is close to 100%, but it takes the same amount of time. If I use 2 nodes, the solution is twice as fast. My matrix type is 6, and I'm compiling with these libraries:

-lmkl_intel_lp64 -lmkl_core -lmkl_intel_thread -lguide -lmkl_solver

Why isn't PARDISO any faster with 4 processors? I'm guessing there's some setting or something I've missed. Thanks.

4 Replies
TimP
Black Belt
76 Views

Particularly if your machine is HyperThreaded, it may be better to let MKL choose the number of threads. Depending on the type of machine, KMP_AFFINITY settings may be useful.
Gennady_F_Intel
Moderator
76 Views

This is because this stage of the calculation (phase 33) is not threaded.
More precisely, this stage of the solution is threaded only for the case of many right-hand sides.
--Gennady
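To illustrate why many right-hand sides help, here is a minimal sketch in plain C. It is not PARDISO's implementation (PARDISO works on sparse factors); it is a dense toy, and the function names `forward_subst` and `forward_subst_many` are hypothetical. A single triangular solve has a sequential dependency between unknowns, while separate right-hand sides can be processed independently of one another.

```c
#include <assert.h>
#include <stddef.h>

/* Solve L * x = b for ONE right-hand side by forward substitution.
 * L is a dense n-by-n lower-triangular matrix in row-major order.
 * x[i] depends on every x[j] with j < i, so the outer loop is ordered. */
void forward_subst(size_t n, const double *L, const double *b, double *x)
{
    for (size_t i = 0; i < n; ++i) {
        double sum = b[i];
        for (size_t j = 0; j < i; ++j)
            sum -= L[i * n + j] * x[j];
        x[i] = sum / L[i * n + i];
    }
}

/* Solve L * X = B for nrhs right-hand sides stored one after another:
 * right-hand side k occupies B[k*n] .. B[k*n + n - 1], likewise for X. */
void forward_subst_many(size_t n, size_t nrhs,
                        const double *L, const double *B, double *X)
{
    assert(n > 0 && nrhs > 0);
    /* Each iteration reads L and touches only its own slice of B and X,
     * so this loop is the natural place to distribute work over threads
     * (for example with an OpenMP parallel-for; left serial here). */
    for (size_t k = 0; k < nrhs; ++k)
        forward_subst(n, L, B + k * n, X + k * n);
}
```

With `nrhs = 1` the outer loop has a single iteration and nothing to distribute, which is consistent with a solve phase that takes the same time on 1 or 4 threads.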
Gennady_F_Intel
Moderator
76 Views

KMP_AFFINITY doesn't significantly affect the performance of the solver. Although every time to consider each particular case.
--Gennady
gandalf85p
Beginner
76 Views

OK, I see. So KMP_AFFINITY doesn't affect the solve stage? Not sure what you mean by "Although every time to consider each particular case."