Intel® MPI Library
Get help with building, analyzing, optimizing, and scaling high-performance computing (HPC) applications.

mpiexec issue: order of machines

Hob
Beginner
2,648 Views

Hi all,

I am having an issue with mpiexec. I have a bundled install that ships with Fire Dynamics Simulator (FDS), and I am attempting to run a simple hello-world program called test_mpi that is bundled with FDS: https://github.com/firemodels/fds/blob/master/Utilities/test_mpi/test_mpi.f90

The issue I have is that if I run:

mpiexec -hosts 2 non-local-machine 1 local-machine 1 test_mpi

I get the hello-world output with the ranks. However, if I swap the order so that local-machine is first, only the localhost machine reports; the non-local machine never replies.

Is this expected behaviour, or is there an issue somewhere?

6 Replies
Maksim_B_Intel
Employee

That invocation will not work; it does not conform to the mpiexec command-line syntax. You want to run something like:

 mpiexec -n 2 -host non-local-machine test_mpi : -n 1 -host localhost test_mpi

for your case.
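For reference, the colon separates MPMD-style groups: each `-n`/`-host` pair applies to the executable that follows it. A sketch of the syntax (the host names and the machinefile path are placeholders, not taken from this thread):

```shell
# Two colon-separated groups: 2 ranks on the remote host and
# 1 rank on the local host, all running the same executable.
mpiexec -n 2 -host non-local-machine test_mpi : -n 1 -host localhost test_mpi

# A machinefile is an alternative way to spread ranks over hosts.
# hosts.txt would contain, one host per line:
#   non-local-machine:2
#   localhost:1
mpiexec -machinefile hosts.txt -n 3 test_mpi
```

These are command-line sketches only; exact option spellings can vary between Intel MPI versions, so check `mpiexec -help` for your install.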

Hob
Beginner

Many thanks for this. I have tried the above with the same result (a hang waiting for a reply). Oddly, if I launch:

 

mpiexec -n 2 -host non-local-machine test_mpi : -n 1 -host localhost test_mpi

 

It hangs awaiting the localhost reply. If I then launch a separate localhost-only MPI request, I get a reply in the original command window whilst the second window waits for a reply.


Would this point to a network issue?
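A quick way to rule out basic reachability and name resolution from a Windows command prompt (the host name is a placeholder; this does not test the MPI ports themselves):

```shell
REM Check that the remote host resolves and answers (cmd.exe).
REM "non-local-machine" is a placeholder for the real host name.
ping -n 2 non-local-machine

REM Confirm name resolution returns the address you expect.
nslookup non-local-machine
```

If ping succeeds but mpiexec still hangs, a firewall blocking the hydra/MPI ports on either machine is another thing worth checking.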

Maksim_B_Intel
Employee

Could you provide the output of:

mpiexec -v -n 1 -host localhost test_mpi

If you get output from the remote machine, it's unlikely to be a network issue.

Are you running under Linux or Windows?

Hob
Beginner

Maksim B. (Intel) wrote:

Could you provide the output of:

mpiexec -v -n 1 -host localhost test_mpi

If you get output from the remote machine, it's unlikely to be a network issue.

Are you running under Linux or Windows?

Under Windows:


type helpfds for help on running fds
C:\Users\CFD>mpiexec -v -n 1 -host localhost test_mpi
[proxy:0:0@CFD-PC11] pmi cmd from fd 372: cmd=init pmi_version=1 pmi_subversion=1
[proxy:0:0@CFD-PC11] PMI response: cmd=response_to_init pmi_version=1 pmi_subversion=1 rc=0
[proxy:0:0@CFD-PC11] pmi cmd from fd 372: cmd=get_maxes
[proxy:0:0@CFD-PC11] PMI response: cmd=maxes kvsname_max=256 keylen_max=64 vallen_max=1024
[proxy:0:0@CFD-PC11] pmi cmd from fd 372: cmd=get_appnum
[proxy:0:0@CFD-PC11] PMI response: cmd=appnum appnum=0
[proxy:0:0@CFD-PC11] pmi cmd from fd 372: cmd=get_my_kvsname
[proxy:0:0@CFD-PC11] PMI response: cmd=my_kvsname kvsname=kvs_6920_0
[proxy:0:0@CFD-PC11] pmi cmd from fd 372: cmd=barrier_in
[proxy:0:0@CFD-PC11] PMI response: cmd=barrier_out
[proxy:0:0@CFD-PC11] pmi cmd from fd 372: cmd=get_my_kvsname
[proxy:0:0@CFD-PC11] PMI response: cmd=my_kvsname kvsname=kvs_6920_0
[proxy:0:0@CFD-PC11] pmi cmd from fd 372: cmd=put kvsname=kvs_6920_0 key=OFI-0 value=OFI#0200CE2FC0A801330000000000000000$
[proxy:0:0@CFD-PC11] PMI response: cmd=put_result rc=0 msg=success
[proxy:0:0@CFD-PC11] pmi cmd from fd 372: cmd=barrier_in
[proxy:0:0@CFD-PC11] PMI response: cmd=barrier_out
[proxy:0:0@CFD-PC11] pmi cmd from fd 372: cmd=get kvsname=kvs_6920_0 key=OFI-0
[proxy:0:0@CFD-PC11] PMI response: cmd=get_result rc=0 msg=success value=OFI#0200CE2FC0A801330000000000000000$
[proxy:0:0@CFD-PC11] pmi cmd from fd 372: cmd=barrier_in
[proxy:0:0@CFD-PC11] PMI response: cmd=barrier_out
 Hello world: rank            0  of            1  running on
 CFD-PC11


[proxy:0:0@CFD-PC11] pmi cmd from fd 372: cmd=finalize
[proxy:0:0@CFD-PC11] PMI response: cmd=finalize_ack

C:\Users\CFD>

 

Maksim_B_Intel
Employee

What is your Intel MPI (IMPI) version? It can be found with:

mpiexec --version

Can you give the output of:

mpiexec -v -n 1 -host non-local-machine test_mpi : -n 1 -host localhost test_mpi

Can you check that the remote machine has hydra_service running?
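On Windows, the service state can be checked from an elevated command prompt; a sketch, assuming the service is registered under the name `hydra_service` (flags may differ between Intel MPI versions, so see `hydra_service.exe -help`):

```shell
REM Query the service state (run in an elevated cmd.exe).
sc query hydra_service

REM If it is missing or stopped, hydra_service.exe from the
REM Intel MPI bin directory can reinstall and start it.
hydra_service.exe -install
hydra_service.exe -start
```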

Hob
Beginner

Maksim B. (Intel) wrote:

What is your Intel MPI (IMPI) version? It can be found with:

mpiexec --version


type helpfds for help on running fds
C:\Users\CFD>mpiexec --version
Intel(R) MPI Library for Windows* OS, Version 2019 Build 20180829 (id: 15f5d6c0c)
Copyright 2003-2018, Intel Corporation.

C:\Users\CFD

Maksim B. (Intel) wrote:

Can you give the output of

mpiexec -v -n 1 -host non-local-machine test_mpi : -n 1 -host localhost test_mpi

Can you check that the remote machine has hydra_service running?

No output is produced; it just hangs.

hydra_service.exe is running on the remote PC.

 

If I open a new cmd window and type mpiexec -n 1 test_mpi, the original console returns:

 

C:\Users\CFD>mpiexec -v -n 1 -host cfd-pc2 test_mpi : -n 1 -host localhost test_mpi
[proxy:0:1@CFD-PC11] pmi cmd from fd 372: cmd=init pmi_version=1 pmi_subversion=1
[proxy:0:1@CFD-PC11] PMI response: cmd=response_to_init pmi_version=1 pmi_subversion=1 rc=0
[proxy:0:1@CFD-PC11] pmi cmd from fd 372: cmd=get_maxes
[proxy:0:1@CFD-PC11] PMI response: cmd=maxes kvsname_max=256 keylen_max=64 vallen_max=1024
[proxy:0:1@CFD-PC11] pmi cmd from fd 372: cmd=get_appnum
[proxy:0:1@CFD-PC11] PMI response: cmd=appnum appnum=1
[proxy:0:1@CFD-PC11] pmi cmd from fd 372: cmd=get_my_kvsname
[proxy:0:1@CFD-PC11] PMI response: cmd=my_kvsname kvsname=kvs_4312_0
[proxy:0:1@CFD-PC11] pmi cmd from fd 372: cmd=get_my_kvsname
[proxy:0:1@CFD-PC11] PMI response: cmd=my_kvsname kvsname=kvs_4312_0
[proxy:0:1@CFD-PC11] pmi cmd from fd 372: cmd=get kvsname=kvs_4312_0 key=PMI_process_mapping
[proxy:0:1@CFD-PC11] PMI response: cmd=get_result rc=0 msg=success value=(vector,(0,2,1))
[proxy:0:1@CFD-PC11] pmi cmd from fd 372: cmd=get kvsname=kvs_4312_0 key=PMI_active_process_mapping
[proxy:0:1@CFD-PC11] PMI response: cmd=get_result rc=0 msg=success value=(vector,(0,2,0))
[proxy:0:1@CFD-PC11] pmi cmd from fd 372: cmd=barrier_in
[proxy:0:1@CFD-PC11] PMI response: cmd=barrier_out
[proxy:0:1@CFD-PC11] pmi cmd from fd 372: cmd=get_my_kvsname
[proxy:0:1@CFD-PC11] PMI response: cmd=my_kvsname kvsname=kvs_4312_0
[proxy:0:1@CFD-PC11] pmi cmd from fd 372: cmd=put kvsname=kvs_4312_0 key=OFI-1 value=OFI#0200CEC6C0A801330000000000000000$
[proxy:0:1@CFD-PC11] PMI response: cmd=put_result rc=0 msg=success
[proxy:0:1@CFD-PC11] pmi cmd from fd 372: cmd=barrier_in
[proxy:0:1@CFD-PC11] PMI response: cmd=barrier_out
[proxy:0:1@CFD-PC11] pmi cmd from fd 372: cmd=get kvsname=kvs_4312_0 key=OFI-0
[proxy:0:1@CFD-PC11] PMI response: cmd=get_result rc=0 msg=success value=OFI#0200CEC5C0A801330000000000000000$
[proxy:0:1@CFD-PC11] pmi cmd from fd 372: cmd=get kvsname=kvs_4312_0 key=OFI-1
[proxy:0:1@CFD-PC11] PMI response: cmd=get_result rc=0 msg=success value=OFI#0200CEC6C0A801330000000000000000$
[proxy:0:1@CFD-PC11] pmi cmd from fd 372: cmd=barrier_in
[proxy:0:1@CFD-PC11] PMI response: cmd=barrier_out
 Hello world: rank            0  of            2  running on
 CFD-PC11


 Hello world: rank            1  of            2  running on
 CFD-PC11


[proxy:0:1@CFD-PC11] pmi cmd from fd 372: cmd=finalize
[proxy:0:1@CFD-PC11] PMI response: cmd=finalize_ack

C:\Users\CFD>

Many thanks,

 
