Intel® MPI Library

MPI -rr (round robin) and -perhost settings with machinefile on Windows MPI 4

mcapogreco
Beginner
984 Views
Hi,

I'm trying to work out the best way to set up my mpiexec commands so that I can do the following:

1. Run mpiexec against a hostfile so that it runs only once on each host in the machinefile, for data sending purposes.

2. Once step 1 has completed, run mpiexec against the same hostfile using up to the maximum number of cores on each host, for calculation purposes.

I see that -rr and -perhost do not work with the Windows mpiexec.

Also, if possible, I would like to combine these two executables into one mpiexec call.

Cheers

Mark
5 Replies
James_T_Intel
Moderator
Hi Mark,

The options -rr and -perhost are options for the Linux* version of the Intel MPI Library only. In Windows*, there are other options which will work.

Using -machinefile specifies a file containing the list of hosts to be used for the job. Processes are placed across these hosts automatically in a round-robin fashion. You can also specify the number of processes per host by appending ":nprocs" to the hostname. From the Reference Manual:

[plain]host1
host1
host2
host2
host3

is equivalent to:

host1:2
host2:2
host3[/plain]
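To make the equivalence above concrete, here is a minimal Python sketch that counts process slots per host for both forms. It is only an illustration of the file format quoted from the Reference Manual, not part of the Intel MPI Library; the function names `parse_machinefile` and `slots_per_host` are hypothetical, and the sketch assumes one entry per line with an optional integer count after the colon.

```python
def parse_machinefile(text):
    """Return (host, nprocs) pairs from machinefile text.

    Assumption: one entry per line; a bare hostname counts as one
    process and "host:n" counts as n processes.
    """
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        host, _, count = line.partition(":")
        entries.append((host, int(count) if count else 1))
    return entries


def slots_per_host(text):
    """Total process slots per host, in order of first appearance."""
    totals = {}
    for host, nprocs in parse_machinefile(text):
        totals[host] = totals.get(host, 0) + nprocs
    return totals


repeated = "host1\nhost1\nhost2\nhost2\nhost3\n"
compact = "host1:2\nhost2:2\nhost3\n"

# Both forms describe the same number of slots on each host.
print(slots_per_host(repeated) == slots_per_host(compact))
```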
You can also use -configfile to use multiple option sets for each host in the job. Something such as:

[plain]-host host1 -n 1 program.exe
-host host2 -n 1 program.exe[/plain]
As to how to combine your two jobs, that really depends on the jobs. What exactly do you mean by "data sending purposes" in the first step?

Sincerely,
James Tullos
Technical Consulting Engineer
Intel Cluster Tools
mcapogreco
Beginner

Hi James,

Thanks for the feedback.

Basically, I need a round-robin (-perhost 1) approach across all hosts in the machinefile so that I can transfer the data to each PC only once. Once this is finished, I run a second mpiexec executable so that each host has the MAX number of processes running on it, doing a calculation that uses the data sent by the previous mpiexec run.

I would like both mpiexec runs to use the same machinefile, for consistency.

I guess the main issue is running the first mpiexec only once per host when the machinefile is set up like this:

host1:MAX

host2:MAX

host3:MAX
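One possible workaround, sketched here in Python, is to keep the host:MAX machinefile as the single source of truth and derive a one-process-per-host list from it just before the data-sending run. This is an assumption about how the problem could be scripted, not an Intel MPI feature; the function name `one_process_per_host` is hypothetical.

```python
def one_process_per_host(machinefile_text):
    """Derive a machinefile with exactly one slot per unique host.

    Everything after the first ":" on a line is discarded, so the
    count in the source file (shown as MAX above) can be any value.
    Hosts keep the order in which they first appear.
    """
    hosts = []
    for line in machinefile_text.splitlines():
        host = line.strip().partition(":")[0]
        if host and host not in hosts:
            hosts.append(host)
    return "".join(f"{host}:1\n" for host in hosts)


# Example with made-up core counts standing in for MAX.
full = "host1:8\n\nhost2:8\n\nhost3:4\n"
print(one_process_per_host(full), end="")
```

The derived text could be written to a temporary file and passed to the first mpiexec run with -machinefile, while the second run uses the original file unchanged, so both runs are always based on the same host list.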

Thanks for your help.

Cheers

Mark

James_T_Intel
Moderator
Hi Mark,

Would it be possible to have the data only reside on one computer, and have a single rank read the data and then transfer it directly to each of the processes? This would skip the first run entirely.

Sincerely,
James Tullos
Technical Consulting Engineer
Intel Cluster Tools
mcapogreco
Beginner
Hi James,

This is how I set up my calculation initially, and it worked well except when the amount of data got large and we were using machines over a WAN. Ideally, I want just one data push to each host, which reduces traffic and avoids a bottleneck on the network card.

I am trying to find a smart way to do this using only the one hostfile, which will provide consistency and guarantee that every host used for the calculation already has the data pushed into shared memory on that host.

Thanks for your help.

Cheers

Mark
James_T_Intel
Moderator
Hi Mark,

I tested this today. You can use the host:nprocs form and specify only one process for each host. If you specify more processes than are available, the next process will go back to the first host on the list, similar to the -rr option in Linux*.
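The wrap-around behavior described above can be modeled with a short Python sketch. This is only a model of the placement as described in this post, not the process manager's actual algorithm; the function name `place_ranks` is hypothetical, and filling all slots of a host before moving to the next one is an assumption.

```python
def place_ranks(entries, nprocs):
    """Model which host each rank lands on.

    entries: list of (host, slots) pairs in machinefile order.
    Assumption: each host's slots are filled in order, and once every
    slot is used the placement wraps back to the first host.
    """
    order = [host for host, slots in entries for _ in range(slots)]
    return [order[rank % len(order)] for rank in range(nprocs)]


# One slot per host, as in the host:1 form suggested above.
hosts = [("host1", 1), ("host2", 1), ("host3", 1)]

# Three processes: exactly one per host.
print(place_ranks(hosts, 3))

# Five processes: ranks 3 and 4 wrap around to host1 and host2.
print(place_ranks(hosts, 5))
```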

Sincerely,
James Tullos
Technical Consulting Engineer
Intel Cluster Tools