<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>Re: Setting up the Intel® oneAPI MPI Library on a Linux cluster in Intel® MPI Library</title>
    <link>https://community.intel.com/t5/Intel-MPI-Library/Setting-up-the-Intel-oneAPI-MPI-Library-on-a-Linux-cluster/m-p/1327508#M8906</link>
    <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;Thanks for accepting our solution. If you need any additional information, please post a new question as this thread will no longer be monitored by Intel.&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;Thanks &amp;amp; Regards,&lt;/P&gt;&lt;P&gt;Santosh &lt;/P&gt;&lt;BR /&gt;</description>
    <pubDate>Fri, 05 Nov 2021 11:18:43 GMT</pubDate>
    <dc:creator>SantoshY_Intel</dc:creator>
    <dc:date>2021-11-05T11:18:43Z</dc:date>
    <item>
      <title>Setting up the Intel® oneAPI MPI Library on a Linux cluster</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/Setting-up-the-Intel-oneAPI-MPI-Library-on-a-Linux-cluster/m-p/1325996#M8888</link>
      <description>&lt;H3&gt;&lt;BR /&gt;TL;DR:&lt;/H3&gt;
&lt;H6&gt;&amp;nbsp;&lt;/H6&gt;
&lt;P&gt;How do I set up my system so that &lt;FONT face="courier new,courier" color="#003300"&gt;&lt;STRONG&gt;mpirun&lt;/STRONG&gt;&lt;/FONT&gt; distributes processes over several hosts connected by a local network without throwing any errors? I suspect it fails because &lt;FONT color="#003300"&gt;&lt;STRONG&gt;&lt;FONT face="courier new,courier"&gt;/opt/intel/oneapi/setvars.sh&lt;/FONT&gt;&lt;/STRONG&gt;&lt;/FONT&gt; is not sourced unless ssh connects to a login shell, but I do not know what the right alternative is.&lt;/P&gt;
&lt;H3&gt;LONGER VERSION:&lt;/H3&gt;
&lt;H6&gt;&amp;nbsp;&lt;/H6&gt;
&lt;P&gt;I have two machines on a LAN named &lt;FONT face="courier new,courier" color="#003300"&gt;&lt;STRONG&gt;server-1&lt;/STRONG&gt;&lt;/FONT&gt; and &lt;FONT face="courier new,courier" color="#003300"&gt;&lt;STRONG&gt;server-2&lt;/STRONG&gt;&lt;/FONT&gt; running Ubuntu 20.04.3 LTS. I have installed &lt;FONT face="courier new,courier" color="#003300"&gt;&lt;STRONG&gt;intel-basekit&lt;/STRONG&gt;&lt;/FONT&gt; and &lt;FONT face="courier new,courier"&gt;&lt;STRONG&gt;&lt;FONT color="#003300"&gt;intel-hpckit&lt;/FONT&gt;&lt;/STRONG&gt;&lt;/FONT&gt; on both machines, following the &lt;A href="https://www.intel.com/content/www/us/en/develop/documentation/installation-guide-for-intel-oneapi-toolkits-linux/top/installation/install-using-package-managers/apt.html#apt" target="_blank" rel="noopener"&gt;guidelines provided by Intel&lt;/A&gt;, and have modified &lt;FONT face="courier new,courier" color="#003300"&gt;&lt;STRONG&gt;/etc/profile&lt;/STRONG&gt;&lt;/FONT&gt; so that &lt;FONT face="courier new,courier" color="#003300"&gt;&lt;STRONG&gt;/opt/intel/oneapi/setvars.sh&lt;/STRONG&gt;&lt;/FONT&gt; gets sourced at every login. Furthermore, &lt;STRONG&gt;&lt;FONT face="courier new,courier" color="#003300"&gt;server-1&lt;/FONT&gt;&lt;/STRONG&gt; and &lt;FONT face="courier new,courier" color="#003300"&gt;&lt;STRONG&gt;server-2&lt;/STRONG&gt;&lt;/FONT&gt; share the same home directory (&lt;FONT face="courier new,courier" color="#003300"&gt;&lt;STRONG&gt;server-2&lt;/STRONG&gt;&lt;/FONT&gt; auto-mounts &lt;FONT face="courier new,courier" color="#003300"&gt;&lt;STRONG&gt;server-1:/home&lt;/STRONG&gt;&lt;/FONT&gt; as its own home at boot). Finally, passwordless ssh login is enabled.&lt;/P&gt;
&lt;P&gt;With this setup, the two machines can run MPI code independently, but cannot distribute the workload over the network. Since the Intel® oneAPI environment variables are only sourced at login, running Intel® oneAPI commands without a login shell (for instance &lt;STRONG&gt;&lt;FONT face="courier new,courier" color="#003300"&gt;ssh server-2 'mpirun -V'&lt;/FONT&gt;&lt;/STRONG&gt;) fails. I suspect this is why I am getting errors when I try to distribute tasks over the two hosts.&lt;/P&gt;
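&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;For instance, a quick way to check whether a remote non-login shell sees the oneAPI environment (a sketch; the full path assumes the default oneAPI install location) is:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;LI-CODE lang="bash"&gt;# Fails if setvars.sh was never sourced for the non-login shell
ssh server-2 'which mpirun'

# Bypasses the remote environment entirely by giving the full path
ssh server-2 '/opt/intel/oneapi/mpi/latest/bin/mpirun -V'&lt;/LI-CODE&gt;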
&lt;P&gt;If I execute the command,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;LI-CODE lang="bash"&gt;mpirun -n 2 -ppn 1 -hosts server-1,server-2 hostname&lt;/LI-CODE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;then I get the following error.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;LI-CODE lang="bash"&gt;[mpiexec@server-1] check_exit_codes (../../../../../src/pm/i_hydra/libhydra/demux/hydra_demux_poll.c:117): unable to run bstrap_proxy on server-2 (pid 63079, exit code 768)
[mpiexec@server-1] poll_for_event (../../../../../src/pm/i_hydra/libhydra/demux/hydra_demux_poll.c:159): check exit codes error
[mpiexec@server-1] HYD_dmx_poll_wait_for_proxy_event (../../../../../src/pm/i_hydra/libhydra/demux/hydra_demux_poll.c:212): poll for event error
[mpiexec@server-1] HYD_bstrap_setup (../../../../../src/pm/i_hydra/libhydra/bstrap/src/intel/i_hydra_bstrap.c:1062): error waiting for event
[mpiexec@server-1] HYD_print_bstrap_setup_error_message (../../../../../src/pm/i_hydra/mpiexec/intel/i_mpiexec.c:1015): error setting up the bootstrap proxies
[mpiexec@server-1] Possible reasons:
[mpiexec@server-1] 1. Host is unavailable. Please check that all hosts are available.
[mpiexec@server-1] 2. Cannot launch hydra_bstrap_proxy or it crashed on one of the hosts. Make sure hydra_bstrap_proxy is available on all hosts and it has right permissions.
[mpiexec@server-1] 3. Firewall refused connection. Check that enough ports are allowed in the firewall and specify them with the I_MPI_PORT_RANGE variable.
[mpiexec@server-1] 4. Ssh bootstrap cannot launch processes on remote host. Make sure that passwordless ssh connection is established across compute hosts.
[mpiexec@server-1] You may try using -bootstrap option to select alternative launcher.&lt;/LI-CODE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Do you have any ideas how I may set up the system properly? I have looked for a solution in the documentation provided by Intel, but I haven't found anything that addresses this exact issue. Your help is much appreciated. Thank you.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;P.S.&lt;/STRONG&gt; I don't know if it helps, but when I run the Intel® Cluster Checker 2021, &lt;FONT face="courier new,courier" color="#003300"&gt;&lt;STRONG&gt;clck -f hostfile&lt;/STRONG&gt;&lt;/FONT&gt;, the process hangs for hours at "Running Collect..." until I run out of patience.&lt;/P&gt;
      <pubDate>Fri, 29 Oct 2021 21:36:40 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/Setting-up-the-Intel-oneAPI-MPI-Library-on-a-Linux-cluster/m-p/1325996#M8888</guid>
      <dc:creator>JR</dc:creator>
      <dc:date>2021-10-29T21:36:40Z</dc:date>
    </item>
    <item>
      <title>Re: Setting up the Intel® oneAPI MPI Library on a Linux cluster</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/Setting-up-the-Intel-oneAPI-MPI-Library-on-a-Linux-cluster/m-p/1326539#M8891</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thanks for reaching out to us.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;From the debug log of your error, we found the below statement:&lt;/P&gt;
&lt;P&gt;&lt;I&gt;"[mpiexec@server-1] 3. &lt;STRONG&gt;Firewall refused connection&lt;/STRONG&gt;. Check that enough ports are allowed in the firewall and specify them with the I_MPI_PORT_RANGE variable."&lt;/I&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;We assume that it is a firewall issue.&amp;nbsp; So, could you please check whether either "firewalld" or "ufw" is enabled? If so, disable it as described in the link below so that you can run MPI programs between the two machines successfully.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;For more information and guidelines, please refer to the link below:&lt;/P&gt;
&lt;P&gt;&lt;A href="https://www.intel.com/content/www/us/en/developer/articles/technical/bkms-firewall-blocks-mpi-communication-among-nodes.html" target="_blank" rel="noopener"&gt;https://www.intel.com/content/www/us/en/developer/articles/technical/bkms-firewall-blocks-mpi-communication-among-nodes.html&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If you are using "ufw", check its status with the command below:&lt;/P&gt;
&lt;LI-CODE lang="markup"&gt;sudo ufw status&lt;/LI-CODE&gt;
&lt;P&gt;If the status is active, disable it with the command below:&lt;/P&gt;
&lt;LI-CODE lang="markup"&gt;sudo ufw disable&lt;/LI-CODE&gt;
&lt;P&gt;To keep ufw disabled at boot time, use the command below:&lt;/P&gt;
&lt;LI-CODE lang="markup"&gt;sudo systemctl disable ufw&lt;/LI-CODE&gt;
&lt;P&gt;Verify whether ufw is disabled:&lt;/P&gt;
&lt;LI-CODE lang="markup"&gt;sudo ufw status
sudo systemctl status ufw&lt;/LI-CODE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;You should now be able to run MPI programs across the two machines successfully.&lt;/P&gt;
&lt;P&gt;If this resolves your issue, make sure to accept this as a solution. This would help others with a similar issue.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thank you!&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Best Regards,&lt;/P&gt;
&lt;P&gt;Santosh&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 02 Nov 2021 07:01:19 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/Setting-up-the-Intel-oneAPI-MPI-Library-on-a-Linux-cluster/m-p/1326539#M8891</guid>
      <dc:creator>SantoshY_Intel</dc:creator>
      <dc:date>2021-11-02T07:01:19Z</dc:date>
    </item>
    <item>
      <title>Re: Setting up the Intel® oneAPI MPI Library on a Linux cluster</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/Setting-up-the-Intel-oneAPI-MPI-Library-on-a-Linux-cluster/m-p/1326855#M8893</link>
      <description>&lt;P&gt;Thank you for your answer. Flushing all the iptables rules does solve this issue. Is there a firewall-friendly solution as well?&lt;/P&gt;</description>
      <pubDate>Wed, 03 Nov 2021 11:03:35 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/Setting-up-the-Intel-oneAPI-MPI-Library-on-a-Linux-cluster/m-p/1326855#M8893</guid>
      <dc:creator>JR</dc:creator>
      <dc:date>2021-11-03T11:03:35Z</dc:date>
    </item>
    <item>
      <title>Re: Setting up the Intel® oneAPI MPI Library on a Linux cluster</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/Setting-up-the-Intel-oneAPI-MPI-Library-on-a-Linux-cluster/m-p/1327488#M8903</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Glad to know that your issue is resolved.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Your debug log suggests that the MPI ranks cannot communicate with each other, because the firewall blocks the MPI communication.&lt;/P&gt;
&lt;P&gt;The link below describes three methods to solve this problem; method 2 and method 3 are firewall-friendly.&lt;/P&gt;
&lt;P&gt;&lt;A href="https://www.intel.com/content/www/us/en/developer/articles/technical/bkms-firewall-blocks-mpi-communication-among-nodes.html" target="_blank"&gt;https://www.intel.com/content/www/us/en/developer/articles/technical/bkms-firewall-blocks-mpi-communication-among-nodes.html&lt;/A&gt;&lt;/P&gt;
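&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;As a rough sketch of the firewall-friendly approach (the port range below is only an example; choose any free range and open it on both hosts):&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;LI-CODE lang="bash"&gt;# On both hosts: allow a fixed TCP port range through ufw
sudo ufw allow 50000:50100/tcp

# Restrict the Hydra process manager to that range, then launch
export I_MPI_PORT_RANGE=50000:50100
mpirun -n 2 -ppn 1 -hosts server-1,server-2 hostname&lt;/LI-CODE&gt;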
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thanks &amp;amp; Regards,&lt;/P&gt;
&lt;P&gt;Santosh&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Fri, 05 Nov 2021 09:33:01 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/Setting-up-the-Intel-oneAPI-MPI-Library-on-a-Linux-cluster/m-p/1327488#M8903</guid>
      <dc:creator>SantoshY_Intel</dc:creator>
      <dc:date>2021-11-05T09:33:01Z</dc:date>
    </item>
    <item>
      <title>Re: Setting up the Intel® oneAPI MPI Library on a Linux cluster</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/Setting-up-the-Intel-oneAPI-MPI-Library-on-a-Linux-cluster/m-p/1327508#M8906</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;Thanks for accepting our solution. If you need any additional information, please post a new question as this thread will no longer be monitored by Intel.&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;Thanks &amp;amp; Regards,&lt;/P&gt;&lt;P&gt;Santosh &lt;/P&gt;&lt;BR /&gt;</description>
      <pubDate>Fri, 05 Nov 2021 11:18:43 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/Setting-up-the-Intel-oneAPI-MPI-Library-on-a-Linux-cluster/m-p/1327508#M8906</guid>
      <dc:creator>SantoshY_Intel</dc:creator>
      <dc:date>2021-11-05T11:18:43Z</dc:date>
    </item>
  </channel>
</rss>

