Intel® MPI Library
Get help with building, analyzing, optimizing, and scaling high-performance computing (HPC) applications.

Mpiexec issue order of machines

Hob
Beginner
4,178 Views

Hi all,

I am having an issue with mpiexec, I have a bundled install with Fire Dynamics Simulator (FDS) and I am attempting to run a simple hello world script that is bundled with FDS called test_mpi link: https://github.com/firemodels/fds/blob/master/Utilities/test_mpi/test_mpi.f90

The issue I have is if I run:

 

'mpiexec -hosts 2 non-local-machine 1 local-machine 1 test_mpi'

I get the hello world output with both ranks reported; however, if I swap the order so that local-machine is first, only the localhost machine reports and the non-local-machine never replies.

Should this be an expected result or is there an issue somewhere?

6 Replies
Maksim_B_Intel
Employee

That invocation cannot work; it does not conform to the mpiexec command-line syntax. You want to run something like:

 mpiexec -n 2 -host non-local-machine test_mpi : -n 1 -host localhost test_mpi

for your case.
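An alternative to spelling out per-host argument sets on one command line is a machine file. This is a sketch, not taken from the thread: it assumes Intel MPI's `-machinefile` option and the `host:ranks` line format, and reuses the host names from this thread — check `mpiexec -help` for your version before relying on it.

```shell
# hosts.txt -- one host per line, rank count after the colon
# (host names match this thread; adjust for your network):
#   cfd-pc2:1
#   localhost:1
#
# Then launch both ranks with a single argument set:
mpiexec -n 2 -machinefile hosts.txt test_mpi
```

This keeps the rank layout in a file, which is easier to review when the order of hosts matters.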

Hob
Beginner

Many thanks for this. I have tried the above with the same result (a hang waiting for a reply). Oddly, if I launch:

 

mpiexec -n 2 -host non-local-machine test_mpi : -n 1 -host localhost test_mpi

 

It hangs awaiting a reply from localhost. If I then launch a separate localhost-only MPI run in a second window, I get a reply in the original command window while the second window keeps waiting.


Would this point to a network issue?

Maksim_B_Intel
Employee

Could you provide the output of

mpiexec -v -n 1 -host localhost test_mpi

If you get output from the remote machine, it is unlikely to be a network issue.

Are you running under Linux or Windows?
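One way to narrow down a network issue is to check whether the machines can reach each other's MPI service port over TCP at all. This is a minimal sketch, not part of the original exchange: the port number 8679 is an assumption about the default hydra service port, and `cfd-pc2` stands in for the remote host name from this thread, so adjust both for your setup.

```python
import socket

def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False

# Example probe (host name and port are placeholders for this thread's setup):
#   can_connect("cfd-pc2", 8679)    # 8679 assumed as the hydra service port
#   can_connect("localhost", 8679)
```

If the remote port is unreachable while ping works, that would point at a firewall rule (e.g. Windows Firewall blocking hydra_service.exe or mpiexec.exe) rather than MPI itself.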

Hob
Beginner

Maksim B. (Intel) wrote:

Could you provide output of 

mpiexec -v -n 1 -host localhost test_mpi

If you get output from remote machine, that's unlikely to be a network issue.

Are you running under Linux or Windows?

Under Windows:


type helpfds for help on running fds
C:\Users\CFD>mpiexec -v -n 1 -host localhost test_mpi
[proxy:0:0@CFD-PC11] pmi cmd from fd 372: cmd=init pmi_version=1 pmi_subversion=1
[proxy:0:0@CFD-PC11] PMI response: cmd=response_to_init pmi_version=1 pmi_subversion=1 rc=0
[proxy:0:0@CFD-PC11] pmi cmd from fd 372: cmd=get_maxes
[proxy:0:0@CFD-PC11] PMI response: cmd=maxes kvsname_max=256 keylen_max=64 vallen_max=1024
[proxy:0:0@CFD-PC11] pmi cmd from fd 372: cmd=get_appnum
[proxy:0:0@CFD-PC11] PMI response: cmd=appnum appnum=0
[proxy:0:0@CFD-PC11] pmi cmd from fd 372: cmd=get_my_kvsname
[proxy:0:0@CFD-PC11] PMI response: cmd=my_kvsname kvsname=kvs_6920_0
[proxy:0:0@CFD-PC11] pmi cmd from fd 372: cmd=barrier_in
[proxy:0:0@CFD-PC11] PMI response: cmd=barrier_out
[proxy:0:0@CFD-PC11] pmi cmd from fd 372: cmd=get_my_kvsname
[proxy:0:0@CFD-PC11] PMI response: cmd=my_kvsname kvsname=kvs_6920_0
[proxy:0:0@CFD-PC11] pmi cmd from fd 372: cmd=put kvsname=kvs_6920_0 key=OFI-0 value=OFI#0200CE2FC0A801330000000000000000$
[proxy:0:0@CFD-PC11] PMI response: cmd=put_result rc=0 msg=success
[proxy:0:0@CFD-PC11] pmi cmd from fd 372: cmd=barrier_in
[proxy:0:0@CFD-PC11] PMI response: cmd=barrier_out
[proxy:0:0@CFD-PC11] pmi cmd from fd 372: cmd=get kvsname=kvs_6920_0 key=OFI-0
[proxy:0:0@CFD-PC11] PMI response: cmd=get_result rc=0 msg=success value=OFI#0200CE2FC0A801330000000000000000$
[proxy:0:0@CFD-PC11] pmi cmd from fd 372: cmd=barrier_in
[proxy:0:0@CFD-PC11] PMI response: cmd=barrier_out
 Hello world: rank            0  of            1  running on
 CFD-PC11


[proxy:0:0@CFD-PC11] pmi cmd from fd 372: cmd=finalize
[proxy:0:0@CFD-PC11] PMI response: cmd=finalize_ack

C:\Users\CFD>

 

Maksim_B_Intel
Employee

What is your Intel MPI (IMPI) version? It can be found with

mpiexec --version

Can you give the output of

mpiexec -v -n 1 -host non-local-machine test_mpi : -n 1 -host localhost test_mpi

Can you check that remote machine has hydra_service running?

Hob
Beginner

Maksim B. (Intel) wrote:

What is your IMPI version, that can be found with

mpiexec --version


type helpfds for help on running fds
C:\Users\CFD>mpiexec --version
Intel(R) MPI Library for Windows* OS, Version 2019 Build 20180829 (id: 15f5d6c0c)
Copyright 2003-2018, Intel Corporation.

C:\Users\CFD>

Maksim B. (Intel) wrote:

Can you give the output of

mpiexec -v -n 1 -host non-local-machine test_mpi : -n 1 -host localhost test_mpi

Can you check that remote machine has hydra_service running?

No output is produced; it just hangs.

hydra_service.exe is running on the remote PC.

 

If I open a new cmd window and type mpiexec -n 1 test_mpi, the original console returns:

 

C:\Users\CFD>mpiexec -v -n 1 -host cfd-pc2 test_mpi : -n 1 -host localhost test_mpi
[proxy:0:1@CFD-PC11] pmi cmd from fd 372: cmd=init pmi_version=1 pmi_subversion=1
[proxy:0:1@CFD-PC11] PMI response: cmd=response_to_init pmi_version=1 pmi_subversion=1 rc=0
[proxy:0:1@CFD-PC11] pmi cmd from fd 372: cmd=get_maxes
[proxy:0:1@CFD-PC11] PMI response: cmd=maxes kvsname_max=256 keylen_max=64 vallen_max=1024
[proxy:0:1@CFD-PC11] pmi cmd from fd 372: cmd=get_appnum
[proxy:0:1@CFD-PC11] PMI response: cmd=appnum appnum=1
[proxy:0:1@CFD-PC11] pmi cmd from fd 372: cmd=get_my_kvsname
[proxy:0:1@CFD-PC11] PMI response: cmd=my_kvsname kvsname=kvs_4312_0
[proxy:0:1@CFD-PC11] pmi cmd from fd 372: cmd=get_my_kvsname
[proxy:0:1@CFD-PC11] PMI response: cmd=my_kvsname kvsname=kvs_4312_0
[proxy:0:1@CFD-PC11] pmi cmd from fd 372: cmd=get kvsname=kvs_4312_0 key=PMI_process_mapping
[proxy:0:1@CFD-PC11] PMI response: cmd=get_result rc=0 msg=success value=(vector,(0,2,1))
[proxy:0:1@CFD-PC11] pmi cmd from fd 372: cmd=get kvsname=kvs_4312_0 key=PMI_active_process_mapping
[proxy:0:1@CFD-PC11] PMI response: cmd=get_result rc=0 msg=success value=(vector,(0,2,0))
[proxy:0:1@CFD-PC11] pmi cmd from fd 372: cmd=barrier_in
[proxy:0:1@CFD-PC11] PMI response: cmd=barrier_out
[proxy:0:1@CFD-PC11] pmi cmd from fd 372: cmd=get_my_kvsname
[proxy:0:1@CFD-PC11] PMI response: cmd=my_kvsname kvsname=kvs_4312_0
[proxy:0:1@CFD-PC11] pmi cmd from fd 372: cmd=put kvsname=kvs_4312_0 key=OFI-1 value=OFI#0200CEC6C0A801330000000000000000$
[proxy:0:1@CFD-PC11] PMI response: cmd=put_result rc=0 msg=success
[proxy:0:1@CFD-PC11] pmi cmd from fd 372: cmd=barrier_in
[proxy:0:1@CFD-PC11] PMI response: cmd=barrier_out
[proxy:0:1@CFD-PC11] pmi cmd from fd 372: cmd=get kvsname=kvs_4312_0 key=OFI-0
[proxy:0:1@CFD-PC11] PMI response: cmd=get_result rc=0 msg=success value=OFI#0200CEC5C0A801330000000000000000$
[proxy:0:1@CFD-PC11] pmi cmd from fd 372: cmd=get kvsname=kvs_4312_0 key=OFI-1
[proxy:0:1@CFD-PC11] PMI response: cmd=get_result rc=0 msg=success value=OFI#0200CEC6C0A801330000000000000000$
[proxy:0:1@CFD-PC11] pmi cmd from fd 372: cmd=barrier_in
[proxy:0:1@CFD-PC11] PMI response: cmd=barrier_out
 Hello world: rank            0  of            2  running on
 CFD-PC11


 Hello world: rank            1  of            2  running on
 CFD-PC11


[proxy:0:1@CFD-PC11] pmi cmd from fd 372: cmd=finalize
[proxy:0:1@CFD-PC11] PMI response: cmd=finalize_ack

C:\Users\CFD>

Many thanks,

 
