Hello, when I submit a sample job that prints the MPI rank of each process, the error below occurs. How can I solve it?
# Run test program
mpirun -np 16 -ppn 4 -genv I_MPI_OFI_PROVIDER=mlx -genv I_MPI_FABRICS=shm:ofi -genv I_MPI_HYDRA_TOPOLIB=ipl -genv I_MPI_DEBUG=10 -genv UCX_TLS=dc,xpmem,self -genv UCX_IB_MLX5_DEVX=no ./mpitest
Thank you.
---------- Error message ----------
[mpiexec@ip-0A000205] check_exit_codes (../../../../../src/pm/i_hydra/libhydra/demux/hydra_demux_poll.c:117): unable to run bstrap_proxy on ip-0a000207.fy4jfpvyh1rutk34ayjxyehmbh.lx.internal.cloudapp.net (pid 17555, exit code 256)
[mpiexec@ip-0A000205] poll_for_event (../../../../../src/pm/i_hydra/libhydra/demux/hydra_demux_poll.c:159): check exit codes error
[mpiexec@ip-0A000205] HYD_dmx_poll_wait_for_proxy_event (../../../../../src/pm/i_hydra/libhydra/demux/hydra_demux_poll.c:212): poll for event error
[mpiexec@ip-0A000205] HYD_bstrap_setup (../../../../../src/pm/i_hydra/libhydra/bstrap/src/intel/i_hydra_bstrap.c:1061): error waiting for event
[mpiexec@ip-0A000205] HYD_print_bstrap_setup_error_message (../../../../../src/pm/i_hydra/mpiexec/intel/i_mpiexec.c:1027): error setting up the bootstrap proxies
[mpiexec@ip-0A000205] Possible reasons:
[mpiexec@ip-0A000205] 1. Host is unavailable. Please check that all hosts are available.
[mpiexec@ip-0A000205] 2. Cannot launch hydra_bstrap_proxy or it crashed on one of the hosts. Make sure hydra_bstrap_proxy is available on all hosts and it has right permissions.
[mpiexec@ip-0A000205] 3. Firewall refused connection. Check that enough ports are allowed in the firewall and specify them with the I_MPI_PORT_RANGE variable.
[mpiexec@ip-0A000205] 4. pbs bootstrap cannot launch processes on remote host. You may try using -bootstrap option to select alternative launcher.
-----------------------------------------
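The first "Possible reasons" item in the output above (a host being unavailable) can be checked from inside the job before calling mpirun. The following is a minimal sketch, not a confirmed fix for this error. It assumes passwordless SSH is configured between the nodes, and the helper names `unique_hosts` and `check_hosts` are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: print each distinct host in a PBS-style node file once.
unique_hosts() {
    sort -u "$1"
}

# Hypothetical helper: attempt a non-interactive SSH login to every host in
# the node file and report the result. BatchMode=yes makes ssh fail instead
# of prompting for a password, so the check cannot hang in a batch job.
# Assumption: passwordless SSH is set up between the nodes.
check_hosts() {
    local nodefile=$1 host status=0
    for host in $(unique_hosts "$nodefile"); do
        if ssh -o BatchMode=yes -o ConnectTimeout=5 "$host" true 2>/dev/null; then
            echo "OK   $host"
        else
            echo "FAIL $host"
            status=1
        fi
    done
    return "$status"
}
```

Inside the job script this could be called as `check_hosts "$PBS_NODEFILE"`. If every host reports OK, one thing to try is the alternative launcher that the error message itself suggests, for example by adding `-bootstrap ssh` to the mpirun command line. Whether that resolves the failure on this cluster is not established in this thread.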
Hello,
Thank you for reaching out to the Intel communities.
Could you please provide us with the sample reproducer code and steps to reproduce the issue at our end?
Could you also please provide us with the OS and hardware details?
Thanks And Regards,
Aishwarya
Hello Aishwarya,
The environment is as below.
- The OS of the head node and compute nodes is RHEL8.3.
- Job scheduler is OpenPBS v20.
- Intel MPI version is 2021.8.0
- Hardware is HB120rs_v3 on Microsoft Azure.
https://learn.microsoft.com/en-us/azure/virtual-machines/hbv3-series
The sample script is below. It is a very simple program that prints the MPI ranks.
----------------------------
#!/bin/bash
#PBS -N TESTJOB01
#PBS -l select=2:ncpus=120:mpiprocs=2:ompthreads=1
#PBS -q q3
#PBS -j oe
# Source oneAPI environment
. /anf-stg01/apps/intel/oneapi/setvars.sh
# Set PATH
export PATH=/anf-stg01/user01/work/testjob:$PATH
# Move to current directory
cd $PBS_O_WORKDIR
#cat $PBS_NODEFILE
NCPU=$(wc -l < "$PBS_NODEFILE")
NNODE=$(sort -u "$PBS_NODEFILE" | wc -l)
PROC_PER_NODE=$(( NCPU / NNODE ))
# Run test program
mpirun -np $NCPU -ppn $PROC_PER_NODE -genv I_MPI_OFI_PROVIDER=mlx -genv I_MPI_FABRICS=shm:ofi -genv I_MPI_HYDRA_TOPOLIB=ipl -genv I_MPI_DEBUG=10 -genv UCX_TLS=dc,xpmem,self -genv UCX_IB_MLX5_DEVX=no ./mpitest
----------------------------
Regards
Kai77
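The job script derives the `-np` and `-ppn` values from the PBS node file. A minimal sketch of that arithmetic follows, using a temporary file in place of `$PBS_NODEFILE`. It assumes PBS writes one line per MPI process, so `select=2` with `mpiprocs=2` gives two lines for each of two hosts. The host names `node-a` and `node-b` are made up.

```shell
#!/bin/bash
# Stand-in for $PBS_NODEFILE (assumption: one line per MPI process).
# Two hypothetical hosts with two MPI processes each.
NODEFILE=$(mktemp)
printf '%s\n' node-a node-a node-b node-b > "$NODEFILE"

# Total MPI processes = number of non-empty lines in the node file.
NCPU=$(grep -c . "$NODEFILE")
# Number of nodes = number of distinct host names.
NNODE=$(sort -u "$NODEFILE" | grep -c .)
# Processes per node, using shell arithmetic instead of expr.
PROC_PER_NODE=$(( NCPU / NNODE ))

echo "np=$NCPU nodes=$NNODE ppn=$PROC_PER_NODE"
rm -f "$NODEFILE"
```

With this input the script prints `np=4 nodes=2 ppn=2`. Under the same assumption, the job script would run mpirun with `-np 4 -ppn 2`, which differs from the `-np 16 -ppn 4` shown in the first post.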
Hi,
We ran your code on an Intel processor (Intel(R) Xeon(R) Gold 6346).
The simple code that shows the MPI ranks ran successfully. Please find the code in the attached zip file.
NOTE: Could you please try running the code on Intel processors?
You can find the supported Intel processors here: https://www.intel.in/content/www/in/en/support/products/873/processors.html#122139
Could you please check and let us know if the issue still persists on Intel processors?
Thanks And Regards,
Aishwarya
Hi Aishwarya,
Thanks for the reply. I will check it again.
Regards
Kai77
Hi,
We haven't heard back from you. Could you please let us know whether you were able to run the code on Intel processors?
Thanks & Regards,
Aishwarya
Hi,
We have not heard back from you. If you need any additional information, please post a new question as this thread will no longer be monitored by Intel.
Thanks and Regards,
Aishwarya