Hello,
I have a four-node cluster connected through an InfiniBand switch. All nodes share a common NFS home directory, so the startup sequence (bashrc, etc.) is identical on each node. Let's call the nodes node1, node2, and node3. I have disabled the firewalls on every server (all running Rocky Linux 8.7).
We use the PBS Pro queuing system (pbs_version = 2022.1.4.20231010124201). Jobs that run on a single node work fine, but when we try to run jobs in multinode mode we get the following errors:
check_exit_codes (../../../../../src/pm/i_hydra/libhydra/demux/hydra_demux_poll.c:117): unable to run bstrap_proxy on node1
poll_for_event (../../../../../src/pm/i_hydra/libhydra/demux/hydra_demux_poll.c:159): check exit codes error
HYD_dmx_poll_wait_for_proxy_event (../../../../../src/pm/i_hydra/libhydra/demux/hydra_demux_poll.c:212): poll for event error
HYD_bstrap_setup (../../../../../src/pm/i_hydra/libhydra/bstrap/src/intel/i_hydra_bstrap.c:1065): error waiting for event
HYD_print_bstrap_setup_error_message (../../../../../src/pm/i_hydra/mpiexec/intel/i_mpiexec.c:1026): error setting up the bootstrap proxies
The cluster also has Intel® oneAPI HPC Toolkit 2023 installed.
I would be happy to provide additional information if needed.
Hi,
Thanks for posting in Intel communities!
Could you please try setting the following environment variable:
export I_MPI_HYDRA_IFACE="ib0"
After setting it, please rerun the job in multinode mode and share the results with us.
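As a rough illustration, the variable could be set inside the PBS job script along the following lines. This is only a sketch under assumptions: the InfiniBand interface is taken to be named ib0 on every node, ./my_mpi_app is a hypothetical application name, and the select/ncpus/mpiprocs values and rank counts are placeholders to adapt. I_MPI_DEBUG and I_MPI_HYDRA_DEBUG are included only to make the launcher output more verbose.

```shell
#!/bin/bash
#PBS -N hydra_iface_test
#PBS -l select=2:ncpus=4:mpiprocs=4
#PBS -l place=scatter

# Hypothetical application name; replace with the real binary.
APP="./my_mpi_app"

# Interface suggested in this thread; override with IFACE=... if it differs.
IFACE="${IFACE:-ib0}"

# Return 0 if the named network interface exists on this host (Linux sysfs).
iface_exists() {
    [ -d "/sys/class/net/$1" ]
}

# Outside of PBS, fall back to the current directory.
cd "${PBS_O_WORKDIR:-$PWD}" || echo "cannot cd to work dir" >&2

if iface_exists "$IFACE"; then
    export I_MPI_HYDRA_IFACE="$IFACE"
else
    echo "warning: interface $IFACE not found on $(uname -n)" >&2
fi

# Extra launcher verbosity for the logs.
export I_MPI_DEBUG=5
export I_MPI_HYDRA_DEBUG=1

# 2 nodes x 4 ranks per node, matching the placeholder select line above.
if command -v mpirun >/dev/null 2>&1 && [ -x "$APP" ]; then
    mpirun -n 8 -ppn 4 "$APP"
else
    echo "mpirun or $APP not available; nothing launched" >&2
fi
```

If the warning about the interface appears, the name reported by the operating system on the compute nodes is the value to use instead of ib0.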
Additionally, we kindly request the following details for further investigation:
- Reproducer code.
- Recreation steps.
- Interconnect hardware details.
- FI_PROVIDER information.
- Logs generated after running the Intel® MPI Benchmark (IMB) with the same number of nodes as in the case where the issue occurred.
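Some of the details above could be gathered with a small script such as the one below, run on each node. It is a sketch, not a required procedure: it assumes the usual tools (ibstat, ip, fi_info, mpirun) may or may not be installed and simply reports the ones that are missing. The helper names run_if_present and collect_info are made up for this example.

```shell
#!/bin/bash

# Run a command if it is installed, otherwise note that it is missing.
run_if_present() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "### $*"
        "$@" 2>&1
    else
        echo "### $1: not installed on $(uname -n)"
    fi
}

# Print the environment settings and hardware details relevant to the launch.
collect_info() {
    echo "host: $(uname -n)"
    echo "FI_PROVIDER=${FI_PROVIDER:-<unset>}"
    echo "I_MPI_HYDRA_IFACE=${I_MPI_HYDRA_IFACE:-<unset>}"
    run_if_present ibstat            # InfiniBand adapter and port state
    run_if_present ip -brief addr    # interface names and addresses
    run_if_present fi_info -l        # libfabric providers available
    run_if_present mpirun --version  # MPI launcher version
}

collect_info
```

Redirecting the output to a file per node (for example into the shared NFS home directory) makes it easy to attach the results to a reply.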
We thank you in advance for your cooperation.
Regards,
Veena
Hi Veena,
We will try what you suggested and get back to you with the results.
BR,
Jenya.