<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Oneapi issue on cluster (Unable to run bstrap_proxy) in Intel® MPI Library</title>
    <link>https://community.intel.com/t5/Intel-MPI-Library/Oneapi-issue-on-cluster-Unable-to-run-bstrap-proxy/m-p/1561892#M11403</link>
    <description>&lt;P&gt;Thanks for the reply.&lt;BR /&gt;&lt;BR /&gt;I will look into the prerequisites and get back to you.&lt;BR /&gt;&lt;BR /&gt;Passwordless SSH is set up between the nodes, and I am submitting my job using PBS (pbs_version = 20.0.1).&lt;/P&gt;</description>
    <pubDate>Thu, 11 Jan 2024 15:13:49 GMT</pubDate>
    <dc:creator>Mehul2</dc:creator>
    <dc:date>2024-01-11T15:13:49Z</dc:date>
    <item>
      <title>Oneapi issue on cluster (Unable to run bstrap_proxy)</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/Oneapi-issue-on-cluster-Unable-to-run-bstrap-proxy/m-p/1560617#M11367</link>
      <description>&lt;P&gt;Hello&lt;BR /&gt;&lt;BR /&gt;I have been trying to run the SU2 CFD solver on my lab's cluster using the oneapi/2022.3 MPI. However, I get an error whenever I try to run on more than one node. I found some similar errors on the forum, but none of the solutions mentioned there worked for me. Kindly help me in this regard. The error I am encountering is:&lt;BR /&gt;&lt;BR /&gt;[mpiexec@cn031] check_exit_codes (../../../../../src/pm/i_hydra/libhydra/demux/hydra_demux_poll.c:117): unable to run bstrap_proxy on cn032 (pid 68544, exit code 256)&lt;BR /&gt;[mpiexec@cn031] poll_for_event (../../../../../src/pm/i_hydra/libhydra/demux/hydra_demux_poll.c:159): check exit codes error&lt;BR /&gt;[mpiexec@cn031] HYD_dmx_poll_wait_for_proxy_event (../../../../../src/pm/i_hydra/libhydra/demux/hydra_demux_poll.c:212): poll for event error&lt;BR /&gt;[mpiexec@cn031] HYD_bstrap_setup (../../../../../src/pm/i_hydra/libhydra/bstrap/src/intel/i_hydra_bstrap.c:1061): error waiting for event&lt;BR /&gt;[mpiexec@cn031] HYD_print_bstrap_setup_error_message (../../../../../src/pm/i_hydra/mpiexec/intel/i_mpiexec.c:1027): error setting up the bootstrap proxies&lt;BR /&gt;[mpiexec@cn031] Possible reasons:&lt;BR /&gt;[mpiexec@cn031] 1. Host is unavailable. Please check that all hosts are available.&lt;BR /&gt;[mpiexec@cn031] 2. Cannot launch hydra_bstrap_proxy or it crashed on one of the hosts. Make sure hydra_bstrap_proxy is available on all hosts and it has right permissions.&lt;BR /&gt;[mpiexec@cn031] 3. Firewall refused connection. Check that enough ports are allowed in the firewall and specify them with the I_MPI_PORT_RANGE variable.&lt;BR /&gt;[mpiexec@cn031] 4. pbs bootstrap cannot launch processes on remote host. 
You may try using -bootstrap option to select alternative launcher.&lt;BR /&gt;cp: cannot stat ‘restart_flow.dat’: No such file or directory&lt;BR /&gt;Abort(1) on node 0 (rank 0 in comm 0): application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0&lt;/P&gt;</description>
      <pubDate>Sun, 07 Jan 2024 13:53:50 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/Oneapi-issue-on-cluster-Unable-to-run-bstrap-proxy/m-p/1560617#M11367</guid>
      <dc:creator>Mehul2</dc:creator>
      <dc:date>2024-01-07T13:53:50Z</dc:date>
    </item>
    <item>
      <title>Re: Oneapi issue on cluster (Unable to run bstrap_proxy)</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/Oneapi-issue-on-cluster-Unable-to-run-bstrap-proxy/m-p/1561843#M11391</link>
      <description>&lt;P&gt;&lt;a href="https://community.intel.com/t5/user/viewprofilepage/user-id/332478"&gt;@Mehul2&lt;/a&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Did you follow the prerequisites listed here?&lt;BR /&gt;&lt;A href="https://www.intel.com/content/www/us/en/docs/mpi-library/developer-guide-linux/2021-11/installation-and-prerequisites.html" target="_blank"&gt;https://www.intel.com/content/www/us/en/docs/mpi-library/developer-guide-linux/2021-11/installation-and-prerequisites.html&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;You have to make sure that you can either use password-less ssh between all nodes of the cluster or set up a workload manager like Slurm, PBSpro, etc.&lt;/P&gt;</description>
      <pubDate>Thu, 11 Jan 2024 11:45:28 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/Oneapi-issue-on-cluster-Unable-to-run-bstrap-proxy/m-p/1561843#M11391</guid>
      <dc:creator>TobiasK</dc:creator>
      <dc:date>2024-01-11T11:45:28Z</dc:date>
    </item>
    <item>
      <title>Re: Oneapi issue on cluster (Unable to run bstrap_proxy)</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/Oneapi-issue-on-cluster-Unable-to-run-bstrap-proxy/m-p/1561892#M11403</link>
      <description>&lt;P&gt;Thanks for the reply.&lt;BR /&gt;&lt;BR /&gt;I will look into the prerequisites and get back to you.&lt;BR /&gt;&lt;BR /&gt;Passwordless SSH is set up between the nodes, and I am submitting my job using PBS (pbs_version = 20.0.1).&lt;/P&gt;</description>
      <pubDate>Thu, 11 Jan 2024 15:13:49 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/Oneapi-issue-on-cluster-Unable-to-run-bstrap-proxy/m-p/1561892#M11403</guid>
      <dc:creator>Mehul2</dc:creator>
      <dc:date>2024-01-11T15:13:49Z</dc:date>
    </item>
    <item>
      <title>Re: Oneapi issue on cluster (Unable to run bstrap_proxy)</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/Oneapi-issue-on-cluster-Unable-to-run-bstrap-proxy/m-p/1561893#M11404</link>
      <description>&lt;P&gt;Please try:&lt;BR /&gt;I_MPI_HYDRA_BOOTSTRAP=ssh&lt;/P&gt;
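&lt;P&gt;A minimal sketch of where this setting might go in a job script, assuming a bash script and that mpirun is on the PATH once the MPI environment is loaded. Launching "hostname" with 2 ranks is an illustrative smoke test, not part of the original advice.&lt;/P&gt;
&lt;PRE&gt;
```shell
#!/bin/bash
# Select the ssh launcher for Hydra instead of the scheduler bootstrap.
# I_MPI_HYDRA_BOOTSTRAP is the variable named in this reply; it must be
# exported so that mpirun, a child process, inherits it.
export I_MPI_HYDRA_BOOTSTRAP=ssh

# Show the value that child processes will see.
echo "I_MPI_HYDRA_BOOTSTRAP=${I_MPI_HYDRA_BOOTSTRAP}"

# Illustrative smoke test (assumption): launch a trivial command with 2 ranks.
# Guarded so the script still completes on a machine without Intel MPI loaded.
if [ -n "$(command -v mpirun)" ]; then
    mpirun -np 2 hostname
else
    echo "mpirun not found; load the Intel MPI environment first"
fi
```
&lt;/PRE&gt;
&lt;P&gt;The error output in the first post also points at the -bootstrap option of mpirun, so "mpirun -bootstrap ssh ..." should be an alternative way to pick the same launcher for a single run.&lt;/P&gt;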
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 11 Jan 2024 15:16:03 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/Oneapi-issue-on-cluster-Unable-to-run-bstrap-proxy/m-p/1561893#M11404</guid>
      <dc:creator>TobiasK</dc:creator>
      <dc:date>2024-01-11T15:16:03Z</dc:date>
    </item>
    <item>
      <title>Re: Oneapi issue on cluster (Unable to run bstrap_proxy)</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/Oneapi-issue-on-cluster-Unable-to-run-bstrap-proxy/m-p/1562244#M11411</link>
      <description>&lt;P&gt;I exported I_MPI_HYDRA_BOOTSTRAP=ssh.&lt;BR /&gt;&lt;BR /&gt;Now no error is produced, but the simulation job is not producing any results. qstat shows the job as running, but when I logged in to the nodes that PBS allotted to the job and ran "top", no job was running on them.&lt;BR /&gt;&lt;BR /&gt;I am attaching the job script I am using to run the simulation.&lt;/P&gt;</description>
      <pubDate>Fri, 12 Jan 2024 11:58:48 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/Oneapi-issue-on-cluster-Unable-to-run-bstrap-proxy/m-p/1562244#M11411</guid>
      <dc:creator>Mehul2</dc:creator>
      <dc:date>2024-01-12T11:58:48Z</dc:date>
    </item>
    <item>
      <title>Re: Oneapi issue on cluster (Unable to run bstrap_proxy)</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/Oneapi-issue-on-cluster-Unable-to-run-bstrap-proxy/m-p/1562267#M11412</link>
      <description>&lt;P&gt;Sorry, without any error message there is little I can do.&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;
&lt;P&gt;Please try to run the IMB-MPI1 benchmarks; if those succeed, the problem is somewhere else in your configuration.&lt;BR /&gt;&lt;BR /&gt;mpirun -np 512 IMB-MPI1&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
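&lt;P&gt;A minimal sketch of the benchmark run together with the I_MPI_DEBUG setting mentioned in this reply, assuming a bash job script. Since I_MPI_DEBUG is an environment variable, the sketch exports it before the launch; text placed after the benchmark name would be passed to IMB-MPI1 as an argument instead. The NP variable and the PATH guard are illustrative additions.&lt;/P&gt;
&lt;PRE&gt;
```shell
#!/bin/bash
# Request extra Intel MPI debug output. Exported so that mpirun and the
# ranks it starts inherit the setting from the environment.
export I_MPI_DEBUG=10

# Rank count from the command in this reply; adjust it to your allocation.
NP=512

# Guarded so the script still completes on a machine without Intel MPI loaded.
if [ -n "$(command -v mpirun)" ]; then
    mpirun -np "$NP" IMB-MPI1
else
    echo "mpirun not found; load the Intel MPI environment first"
fi
```
&lt;/PRE&gt;
&lt;P&gt;A one-off alternative is to prefix the variable to the command itself, as in "I_MPI_DEBUG=10 mpirun -np 512 IMB-MPI1".&lt;/P&gt;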
&lt;P&gt;You can also add I_MPI_DEBUG=10 to get some more debug output.&lt;/P&gt;</description>
      <pubDate>Fri, 12 Jan 2024 14:30:35 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/Oneapi-issue-on-cluster-Unable-to-run-bstrap-proxy/m-p/1562267#M11412</guid>
      <dc:creator>TobiasK</dc:creator>
      <dc:date>2024-01-12T14:30:35Z</dc:date>
    </item>
    <item>
      <title>Re: Oneapi issue on cluster (Unable to run bstrap_proxy)</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/Oneapi-issue-on-cluster-Unable-to-run-bstrap-proxy/m-p/1563544#M11429</link>
      <description>&lt;P&gt;Hello&lt;BR /&gt;Sorry for the late reply.&lt;BR /&gt;I ran the benchmark you mentioned using the job script:&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;#!/bin/bash&lt;BR /&gt;#PBS -N SU2_ROTOR_n4.128&lt;BR /&gt;#PBS -q AMD_Q&lt;BR /&gt;#PBS -l select=4:ncpus=128&lt;BR /&gt;#PBS -l walltime=96:00:00&lt;BR /&gt;#PBS -o /work/home/bakhshi/SU2/Test/Rotor_Scaleup/multi_nodes/n4/cpp128&lt;BR /&gt;#PBS -e /work/home/bakhshi/SU2/Test/Rotor_Scaleup/multi_nodes/n4/cpp128&lt;/P&gt;&lt;P&gt;cd $PBS_O_WORKDIR&lt;/P&gt;&lt;P&gt;NODEFILE=$PBS_NODEFILE&lt;BR /&gt;PPN=$(cat $NODEFILE | wc -l)&lt;/P&gt;&lt;P&gt;module purge;&lt;BR /&gt;module load oneapi/2022.3/mpi/latest;&lt;BR /&gt;module load compilers/gcc/13.2.0;&lt;BR /&gt;module load anaconda3/2021.11;&lt;BR /&gt;#export I_MPI_HYDRA_IFACE="ib0"&lt;/P&gt;&lt;P&gt;echo $PPN&lt;BR /&gt;eval "$(conda shell.bash hook)";&lt;/P&gt;&lt;P&gt;export SU2_HOME=/work/home/bakhshi/SU2/SU2-Install&lt;/P&gt;&lt;P&gt;echo $NODEFILE&lt;BR /&gt;echo $PPN&lt;/P&gt;&lt;P&gt;export I_MPI_HYDRA_BOOTSTRAP=ssh&lt;BR /&gt;mpirun -np 512 IMB-MPI1 I_MPI_DEBUG=10&lt;BR /&gt;&lt;BR /&gt;However, the job is submitted and shows as running, although it is not actually running on the nodes or producing any output.&lt;/P&gt;</description>
      <pubDate>Wed, 17 Jan 2024 11:02:50 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/Oneapi-issue-on-cluster-Unable-to-run-bstrap-proxy/m-p/1563544#M11429</guid>
      <dc:creator>Mehul2</dc:creator>
      <dc:date>2024-01-17T11:02:50Z</dc:date>
    </item>
    <item>
      <title>Re: Oneapi issue on cluster (Unable to run bstrap_proxy)</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/Oneapi-issue-on-cluster-Unable-to-run-bstrap-proxy/m-p/1563918#M11432</link>
      <description>&lt;P&gt;&lt;a href="https://community.intel.com/t5/user/viewprofilepage/user-id/332478"&gt;@Mehul2&lt;/a&gt;&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;Please make sure to use the latest MPI release and a supported OS.&lt;/P&gt;&lt;P&gt;Please also make sure to set up a clean environment without conda or anything else on top.&lt;/P&gt;&lt;BR /&gt;</description>
      <pubDate>Thu, 18 Jan 2024 09:09:32 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/Oneapi-issue-on-cluster-Unable-to-run-bstrap-proxy/m-p/1563918#M11432</guid>
      <dc:creator>TobiasK</dc:creator>
      <dc:date>2024-01-18T09:09:32Z</dc:date>
    </item>
    <item>
      <title>Re: Oneapi issue on cluster (Unable to run bstrap_proxy)</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/Oneapi-issue-on-cluster-Unable-to-run-bstrap-proxy/m-p/1564311#M11439</link>
      <description>&lt;P&gt;Hello&lt;BR /&gt;&lt;BR /&gt;I cleaned the environment, reinstalled the solver against this MPI, and set I_MPI_HYDRA_BOOTSTRAP=ssh during job submission, which removed the bstrap_proxy error. However, I am now facing a different error, a write error:&lt;BR /&gt;&lt;BR /&gt;[mpiexec@cn031] HYD_sock_write (../../../../../src/pm/i_hydra/libhydra/sock/hydra_sock_intel.c:362): write error (Bad file descriptor)&lt;BR /&gt;[mpiexec@cn031] HYD_sock_write (../../../../../src/pm/i_hydra/libhydra/sock/hydra_sock_intel.c:362): write error (Bad file descriptor)&lt;BR /&gt;[mpiexec@cn031] wait_proxies_to_terminate (../../../../../src/pm/i_hydra/mpiexec/intel/i_mpiexec.c:554): downstream from host cn032 exited with status 255&lt;BR /&gt;[mpiexec@cn031] wait_proxies_to_terminate (../../../../../src/pm/i_hydra/mpiexec/intel/i_mpiexec.c:554): downstream from host cn033 exited with status 255&lt;/P&gt;</description>
      <pubDate>Fri, 19 Jan 2024 10:58:29 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/Oneapi-issue-on-cluster-Unable-to-run-bstrap-proxy/m-p/1564311#M11439</guid>
      <dc:creator>Mehul2</dc:creator>
      <dc:date>2024-01-19T10:58:29Z</dc:date>
    </item>
    <item>
      <title>Re: Oneapi issue on cluster (Unable to run bstrap_proxy)</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/Oneapi-issue-on-cluster-Unable-to-run-bstrap-proxy/m-p/1565413#M11455</link>
      <description>&lt;P&gt;&lt;a href="https://community.intel.com/t5/user/viewprofilepage/user-id/332478"&gt;@Mehul2&lt;/a&gt;&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;Try to verify that your PBS system is set up correctly, e.g. by running something like "hostname" through the batch system on all nodes.&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;This error is described in our troubleshooting guide and usually points to a problem with your cluster setup, which is something I cannot help you with.&lt;/P&gt;&lt;P&gt;&lt;A href="https://www.intel.com/content/www/us/en/docs/mpi-library/developer-guide-linux/2021-11/error-message-bad-file-descriptor.html" target="_blank"&gt;https://www.intel.com/content/www/us/en/docs/mpi-library/developer-guide-linux/2021-11/error-message-bad-file-descriptor.html&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;Once you have fixed your cluster setup, I would advise running the IMB-MPI1 benchmarks first, before trying your solver.&lt;/P&gt;&lt;BR /&gt;</description>
      <pubDate>Tue, 23 Jan 2024 14:28:29 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/Oneapi-issue-on-cluster-Unable-to-run-bstrap-proxy/m-p/1565413#M11455</guid>
      <dc:creator>TobiasK</dc:creator>
      <dc:date>2024-01-23T14:28:29Z</dc:date>
    </item>
  </channel>
</rss>

