Intel® MPI Library
Get help with building, analyzing, optimizing, and scaling high-performance computing (HPC) applications.

I_MPI_HYDRA_BOOTSTRAP issue

youn__kihang

Hello,

I am submitting a job through the LSF scheduler, and ssh is blocked by a nologin setting.
I am trying to launch through the LSF blaunch command instead, and I found the Intel MPI options associated with LSF.
Changing the I_MPI_HYDRA_BOOTSTRAP option from ssh to lsf seems to solve the problem.
However, I tested the following four Intel MPI versions, and the setting does not seem to be applied properly in the two marked X (O = works, X = does not work).

2018.4.274: O
2019.2.187: X
2019.4.243: X
2019.5.281: O

Let me know if I'm missing something.

The options that I tried are below.

export I_MPI_HYDRA_BOOTSTRAP=lsf
export I_MPI_HYDRA_BOOTSTRAP_EXEC=lsf
export I_MPI_HYDRA_BOOTSTRAP_EXEC_EXTRA_ARGS=lsf
export I_MPI_HYDRA_RMK=lsf
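
As a minimal sketch of how the bootstrap setting above might be applied only when the job is actually running under LSF: the helper name `choose_bootstrap` is hypothetical, and the sketch assumes that LSF sets `LSB_JOBID` inside a job and that `I_MPI_HYDRA_DEBUG=1` makes Hydra print its launch details, so the launcher actually in use can be checked in the output.

```shell
#!/bin/sh
# Hypothetical helper: pick the Hydra bootstrap mechanism.
# Assumption: LSF exports LSB_JOBID inside a running job.
choose_bootstrap() {
    if [ -n "${LSB_JOBID:-}" ]; then
        echo lsf    # inside an LSF job: launch proxies via blaunch
    else
        echo ssh    # outside LSF: fall back to ssh
    fi
}

I_MPI_HYDRA_BOOTSTRAP=$(choose_bootstrap)
export I_MPI_HYDRA_BOOTSTRAP

# Ask Hydra for debug output so the launch command is visible in the log.
export I_MPI_HYDRA_DEBUG=1

echo "I_MPI_HYDRA_BOOTSTRAP=$I_MPI_HYDRA_BOOTSTRAP"

# The actual launch is left commented out; ./my_app is a placeholder.
# mpirun -n 4 ./my_app
```

If the debug output still shows an ssh launch command with `lsf` selected, that would match the behavior reported for the X-marked versions.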

And the error messages are below.

check_exit_codes (../hydra_demux_poll.c): unable to run proxy on hostname
poll_for_event (): check exit codes error
HYD_dmx_poll_wait_for_proxy_event (): poll for event error
HYD_bstrap_setup (): error waiting for event
main (): error setting up the boostrap proxies

Thanks

youn__kihang

I found a Q&A thread in the forum about the same issue as ours.
This was a known issue that was fixed in 2019u5, so we decided not to use 2019u2 and 2019u4.
Thanks all.

The original thread is linked below.

https://software.intel.com/en-us/forums/intel-clusters-and-hpc-technology/topic/814696
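
A minimal sketch of how a job script might guard against the two versions reported as broken in this thread. The function name `is_affected_version` and the variable `IMPI_VERSION` are hypothetical, and the version list comes only from the test results posted above, not from an official list of affected releases.

```shell
#!/bin/sh
# Hypothetical check: versions this thread reports as not honoring
# I_MPI_HYDRA_BOOTSTRAP=lsf (2019u2 and 2019u4).
is_affected_version() {
    case "$1" in
        2019.2.187|2019.4.243) return 0 ;;
        *) return 1 ;;
    esac
}

# IMPI_VERSION is an assumed variable; set it to the version your
# environment module or installation actually loads.
IMPI_VERSION="${IMPI_VERSION:-2019.5.281}"

if is_affected_version "$IMPI_VERSION"; then
    echo "Intel MPI $IMPI_VERSION: lsf bootstrap reported broken in this thread"
else
    echo "Intel MPI $IMPI_VERSION: not among the versions reported broken"
fi
```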

PrasanthD_intel
Moderator

Hi Kihang,

Glad to know that you found the information you were looking for.
The issue has been fixed in the latest versions.
We are closing this thread.
Please contact us in case of any further queries.

 

Thanks 

Prasanth

Shaikh__Samir

Hi,

I'm getting a similar issue with intel-2020.4.304, but only for a large number of nodes (say 64+) on Cascade Lake.

 

[mpiexec@cn001] check_exit_codes (../../../../../src/pm/i_hydra/libhydra/demux/hydra_demux_poll.c:121): unable to run bstrap_proxy (pid 5695, exit code 256)
[mpiexec@cn001] poll_for_event (../../../../../src/pm/i_hydra/libhydra/demux/hydra_demux_poll.c:159): check exit codes error
[mpiexec@cn001] HYD_dmx_poll_wait_for_proxy_event (../../../../../src/pm/i_hydra/libhydra/demux/hydra_demux_poll.c:212): poll for event error
[mpiexec@cn001] HYD_bstrap_setup (../../../../../src/pm/i_hydra/libhydra/bstrap/src/intel/i_hydra_bstrap.c:772): error waiting for event
[mpiexec@cn001] main (../../../../../src/pm/i_hydra/mpiexec/mpiexec.c:1938): error setting up the boostrap proxies
