martialp
Beginner

Wrong job dispatching on CPUs using Intel MPI 2.0

When I run a job with Intel MPI 2.0 using a host file that references two machines with 4 CPUs each, I see only 2 processes running on the first machine and 6 on the second, instead of 4 on the first machine and 4 on the other. Can you explain the reason for this behaviour?
Gergana_S_Intel
Employee

Hi martialp,

If I understand correctly, you're simply trying to run 2 jobs, 4 MPI processes each, on 2 different machines - is that true?

Could you tell us how you run your application (the mpdboot/mpiexec command line, or mpirun if you use that), along with any mpd.hosts files, machine files, or config files you use? At this point, we need a bit more information to make a suggestion.

Thanks,
~Gergana

martialp
Beginner

What I am trying to do is run an application on 8 CPUs (4 on one machine and 4 on the other). The command line is the following:

$ mpirun -f host.list -np8 /easd/apps/vendor_appl/devl/Interwell5.3/bin/Linux/csh_presti_exe

The host.list file contains 2 lines:
lnx_137_1e051:4
lnx_137_1e033:4
Gergana_S_Intel
Employee

Hi martialp,

In this case, the -f option only reads the names of the hosts you want to use. To control placement, you can use the -perhost option, which tells the Intel MPI Library how many processes to put on each node. Your command line would look like this:

$ mpirun -f host.list -perhost 4 -np 8 /easd/apps/vendor_appl/devl/Interwell5.3/bin/Linux/csh_presti_exe
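One way to check where the processes actually land, offered as a minimal sketch: launch hostname instead of the application with the same options and tally the output per host. The helper name count_per_host is hypothetical (it is not part of the Intel MPI Library), and the commented mpirun line assumes the same host.list and options as above.

```shell
#!/bin/sh
# count_per_host: hypothetical helper, not part of Intel MPI.
# Reads one host name per line on stdin and prints "<host>: <count>",
# i.e. how many processes reported each host.
count_per_host() {
    sort | uniq -c | awk '{ print $2 ": " $1 }'
}

# Assumed usage on the cluster: each MPI process runs `hostname`,
# so every line of output corresponds to one process.
#   mpirun -f host.list -perhost 4 -np 8 hostname | count_per_host
```

If placement is as intended, each of the two hosts should be reported with a count of 4.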

Let us know how this goes.

Regards,
~Gergana

martialp
Beginner

Hello Gergana

My client has used this new parameter with success. Thank you very much for the suggestion. We now just have to find out why the jobs are not evenly dispatched across the CPUs of the first node of the cluster (some people have suggested that it may be related to a cluster configuration file for thread management).
Once again, thanks a lot.

Martial