Hello,
I am submitting a job through the LSF scheduler, and ssh access is blocked by the nologin setting.
I am trying to launch through the LSF blaunch command instead, and I found the Intel MPI options associated with LSF.
Changing the I_MPI_HYDRA_BOOTSTRAP option from ssh to lsf seems to solve the problem.
However, I tested the following four Intel MPI versions, and the setting does not seem to be applied properly in the two marked X (O: works, X: fails).
2018.4.274: O
2019.2.187: X
2019.4.243: X
2019.5.281: O
Let me know if I'm missing something.
The options I tried are below.
export I_MPI_HYDRA_BOOTSTRAP=lsf
export I_MPI_HYDRA_BOOTSTRAP_EXEC=lsf
export I_MPI_HYDRA_BOOTSTRAP_EXEC_EXTRA_ARGS=lsf
export I_MPI_HYDRA_RMK=lsf
And the error messages are below.
check_exit_codes (../hydra_demux_poll.c): unable to run proxy on hostname
poll_for_event (): check exit codes error
HYD_dmx_poll_wait_for_proxy_event (): poll for event error
HYD_bstrap_setup (): error waiting for event
main (): error setting up the boostrap proxies
Thanks
- Tags:
- Cluster Computing
- General Support
- Intel® Cluster Ready
- Message Passing Interface (MPI)
- Parallel Computing
I found a Q&A thread for the same issue as ours in the forum.
This was a known issue and was fixed in 2019u5, so we decided not to use 2019u2 and 2019u4.
Thanks all.
The original thread is linked below.
https://software.intel.com/en-us/forums/intel-clusters-and-hpc-technology/topic/814696
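For anyone who wants to guard a job script against the affected releases, here is a minimal sketch. It only encodes the four results reported in this thread; the function name `lsf_bootstrap_status` and the variable `IMPI_VERSION` are hypothetical, so set the version however your site exposes it (for example from a module name).

```shell
#!/bin/sh
# Classify an Intel MPI version by the results reported in this thread.
# Prints "ok", "broken", or "untested" (anything not listed in the thread).
lsf_bootstrap_status() {
    case "$1" in
        2018.4.274|2019.5.281) echo ok ;;
        2019.2.187|2019.4.243) echo broken ;;
        *) echo untested ;;
    esac
}

# IMPI_VERSION is a hypothetical variable, not something Intel MPI sets.
IMPI_VERSION="${IMPI_VERSION:-2019.5.281}"

if [ "$(lsf_bootstrap_status "$IMPI_VERSION")" = "broken" ]; then
    echo "Intel MPI $IMPI_VERSION: lsf bootstrap reported not to work in this thread" >&2
else
    # Ask Hydra to start its proxies through LSF instead of ssh.
    export I_MPI_HYDRA_BOOTSTRAP=lsf
fi
```

Versions that were not tested here fall through to "untested", so the script still sets the option for them but makes no claim that it works.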
Hi Kihang,
Glad to know that you have got the information that you were looking for.
The issue has been fixed in the latest versions.
We are closing this thread.
Please contact us in case of any further queries.
Thanks
Prasanth
Hi,
I'm getting a similar issue with intel-2020.4.304, but only for a large number of nodes (say 64+) on Cascade Lake.
[mpiexec@cn001] check_exit_codes (../../../../../src/pm/i_hydra/libhydra/demux/hydra_demux_poll.c:121): unable to run bstrap_proxy (pid 5695, exit code 256)
[mpiexec@cn001] poll_for_event (../../../../../src/pm/i_hydra/libhydra/demux/hydra_demux_poll.c:159): check exit codes error
[mpiexec@cn001] HYD_dmx_poll_wait_for_proxy_event (../../../../../src/pm/i_hydra/libhydra/demux/hydra_demux_poll.c:212): poll for event error
[mpiexec@cn001] HYD_bstrap_setup (../../../../../src/pm/i_hydra/libhydra/bstrap/src/intel/i_hydra_bstrap.c:772): error waiting for event
[mpiexec@cn001] main (../../../../../src/pm/i_hydra/mpiexec/mpiexec.c:1938): error setting up the boostrap proxies
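A note on reading "exit code 256" in the log above: if that number is a raw wait(2)-style status rather than the proxy's own exit code (an assumption, the log does not say), it can be split into an exit code and a signal number. The helper name `decode_wait_status` is hypothetical; this is only a sketch of the usual Linux encoding.

```shell
#!/bin/sh
# Decode a raw wait(2)-style status, assuming the usual Linux layout:
# bits 8-15 hold the exit code, the low 7 bits hold the terminating signal.
decode_wait_status() {
    status=$1
    echo "exit=$(( (status >> 8) & 255 )) signal=$(( status & 127 ))"
}

decode_wait_status 256   # prints "exit=1 signal=0"
```

Under that assumption, 256 would mean bstrap_proxy exited on its own with code 1 rather than being killed by a signal.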