Intel® MPI Library
Get help with building, analyzing, optimizing, and scaling high-performance computing (HPC) applications.

Wrong job dispatching on CPUs using Intel MPI 2.0

martialp
When I run a job with Intel MPI 2.0 using a host file that references two machines with 4 CPUs each, I see only two processes running on the first machine and six on the second, instead of four on each. Can you explain the reason for this behaviour?
Gergana_S_Intel

Hi martialp,

If I understand correctly, you're simply trying to run 2 jobs, 4 MPI processes each, on 2 different machines - is that true?

Could you tell us how you run your application (the mpdboot/mpiexec command line, or mpirun if you use that), along with any mpd.hosts files, machine files, or config files you use? At this point, we need a bit more information to make a suggestion.

Thanks,
~Gergana

martialp

What I am trying to do is run an application on 8 CPUs (4 on one machine and 4 on the other). The command line is the following:

$ mpirun -f host.list -np8 /easd/apps/vendor_appl/devl/Interwell5.3/bin/Linux/csh_presti_exe

The host.list file contains 2 lines:
lnx_137_1e051:4
lnx_137_1e033:4
Gergana_S_Intel

Hi martialp,

In this case, the -f option only reads the names of the hosts you want to use. What you can do here is use the -perhost option, which lets you specify how many processes the Intel MPI Library should put on each node. Your command line will look like this:

$ mpirun -f host.list -perhost 4 -np 8 /easd/apps/vendor_appl/devl/Interwell5.3/bin/Linux/csh_presti_exe
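
To make the intended layout concrete, here is a minimal sketch that models the placement the command above asks for. It assumes -perhost assigns consecutive blocks of ranks to each host in turn, in the order the hosts appear in the file. The place_ranks helper is hypothetical and is not part of the Intel MPI Library; it only simulates the expected result and does not launch anything.

```shell
#!/bin/sh
# Hypothetical helper (not an Intel MPI tool): print the host each rank is
# expected to land on, assuming block placement of <perhost> ranks per host.
# Usage: place_ranks <host-file or - for stdin> <perhost> <np>
place_ranks() {
    hosts_file=$1
    perhost=$2
    np=$3
    awk -v perhost="$perhost" -v np="$np" '
        # Keep only the host name: drop any ":count" suffix, skip blank lines.
        NF { sub(/:.*/, "", $1); hosts[n++] = $1 }
        END {
            if (n == 0 || perhost < 1) exit 1
            for (rank = 0; rank < np; rank++) {
                # Each block of <perhost> consecutive ranks goes to the next
                # host, wrapping around when the host list is exhausted.
                print rank, hosts[int(rank / perhost) % n]
            }
        }' "$hosts_file"
}

# Model the layout for the host.list contents quoted in this thread.
printf 'lnx_137_1e051:4\nlnx_137_1e033:4\n' | place_ranks - 4 8
```

Under that assumption, ranks 0 to 3 map to lnx_137_1e051 and ranks 4 to 7 map to lnx_137_1e033. If your launcher can start non-MPI programs, running hostname in place of the application and piping the output through sort | uniq -c should give the real per-host counts to compare against this model.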

Let us know how this goes.

Regards,
~Gergana

martialp

Hello Gergana

My client has used this new parameter with success. Thank you very much for the suggestion. We now just have to find out why the jobs are not evenly dispatched across the CPUs of the first node of the cluster (some people have suggested that it may be related to a cluster configuration file for thread management).
Once again thanks a lot.

Martial