<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Hello World program cannot run on cluster in Intel® MPI Library</title>
    <link>https://community.intel.com/t5/Intel-MPI-Library/Hello-World-program-cannot-run-on-cluster/m-p/1656354#M12033</link>
    <description>&lt;P&gt;Thanks. This helped me a lot. I changed the OS of all nodes to Ubuntu 20.04 and set FI_PROVIDER=tcp. It now runs well.&lt;/P&gt;</description>
    <pubDate>Mon, 13 Jan 2025 01:47:42 GMT</pubDate>
    <dc:creator>WangWJ</dc:creator>
    <dc:date>2025-01-13T01:47:42Z</dc:date>
    <item>
      <title>Hello World program cannot run on cluster</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/Hello-World-program-cannot-run-on-cluster/m-p/1651508#M12008</link>
      <description>&lt;P&gt;I have trouble running a hello world program on a cluster. Here is my program:&lt;/P&gt;&lt;LI-CODE lang="cpp"&gt;#include "mpi.h"
#include &amp;lt;iostream&amp;gt;

int main(int argc, char *argv[])
{
    int myid, numprocs;
    MPI_Status status;
    MPI_Init(&amp;amp;argc, &amp;amp;argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &amp;amp;myid);
    MPI_Comm_size(MPI_COMM_WORLD, &amp;amp;numprocs);
    std::cout &amp;lt;&amp;lt; "process: " &amp;lt;&amp;lt; myid &amp;lt;&amp;lt; " of " &amp;lt;&amp;lt; numprocs &amp;lt;&amp;lt; " hello world" &amp;lt;&amp;lt; std::endl;
    MPI_Finalize();
    return 0;
}&lt;/LI-CODE&gt;&lt;P&gt;&lt;SPAN&gt;I compiled it with mpigxx:&lt;/SPAN&gt;&lt;/P&gt;&lt;LI-CODE lang="bash"&gt;mpigxx main.cpp&lt;/LI-CODE&gt;&lt;P&gt;It runs fine on host1 and on host2 individually when I use&lt;/P&gt;&lt;LI-CODE lang="bash"&gt;mpirun -n 4 ./a.out&lt;/LI-CODE&gt;&lt;P&gt;But when I try to run it across both hosts:&lt;/P&gt;&lt;LI-CODE lang="bash"&gt;mpirun -n 4 -ppn 2 -hosts host1,host2 ./a.out&lt;/LI-CODE&gt;&lt;P&gt;it fails with:&lt;/P&gt;&lt;LI-CODE lang="bash"&gt;Abort(1615247) on node 0 (rank 0 in comm 0): Fatal error in PMPI_Init: Unknown error class, error stack:
MPIR_Init_thread(193)........:
MPID_Init(1715)..............:
MPIDI_OFI_mpi_init_hook(1724):
MPIDU_bc_table_create(340)...: Missing hostname or invalid host/port description in business card
Abort(1615247) on node 0 (rank 0 in comm 0): Fatal error in PMPI_Init: Unknown error class, error stack:
MPIR_Init_thread(193)............:
MPID_Init(1715)..................:
MPIDI_OFI_mpi_init_hook(1739)....:
insert_addr_table_roots_only(492): OFI get address vector map failed&lt;/LI-CODE&gt;&lt;P&gt;Here is more information, captured with I_MPI_DEBUG=10:&lt;/P&gt;&lt;LI-CODE lang="bash"&gt;[0] MPI startup(): Intel(R) MPI Library, Version 2021.14  Build 20241121 (id: e7829d6)
[0] MPI startup(): Copyright (C) 2003-2024 Intel Corporation.  All rights reserved.
[0] MPI startup(): library kind: release
[0] MPI startup(): libfabric loaded: libfabric.so.1
[0] MPI startup(): libfabric version: 1.21.0-impi
[0] MPI startup(): max number of MPI_Request per vci: 67108864 (pools: 1)
[0] MPI startup(): libfabric provider: shm
Abort(1615247) on node 0 (rank 0 in comm 0): Fatal error in PMPI_Init: Unknown error class, error stack:
MPIR_Init_thread(193)........:
MPID_Init(1715)..............:
MPIDI_OFI_mpi_init_hook(1724):
MPIDU_bc_table_create(340)...: Missing hostname or invalid host/port description in business card&lt;/LI-CODE&gt;&lt;P&gt;&lt;SPAN&gt;Could anyone help me with this problem?&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Mon, 23 Dec 2024 01:47:32 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/Hello-World-program-cannot-run-on-cluster/m-p/1651508#M12008</guid>
      <dc:creator>WangWJ</dc:creator>
      <dc:date>2024-12-23T01:47:32Z</dc:date>
    </item>
    <item>
      <title>Re: Hello World program cannot run on cluster</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/Hello-World-program-cannot-run-on-cluster/m-p/1652688#M12016</link>
      <description>&lt;P&gt;&lt;a href="https://community.intel.com/t5/user/viewprofilepage/user-id/401694"&gt;@WangWJ&lt;/a&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;[0] MPI startup(): libfabric provider: shm&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;Do you know who set this? Can you please post the output of "export"?&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Fri, 27 Dec 2024 14:36:01 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/Hello-World-program-cannot-run-on-cluster/m-p/1652688#M12016</guid>
      <dc:creator>TobiasK</dc:creator>
      <dc:date>2024-12-27T14:36:01Z</dc:date>
    </item>
    <item>
      <title>Re: Hello World program cannot run on cluster</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/Hello-World-program-cannot-run-on-cluster/m-p/1652985#M12019</link>
      <description>&lt;P&gt;The libfabric provider was set automatically by MPI. I did not set any MPI-related environment variables other than I_MPI_DEBUG.&lt;/P&gt;</description>
      <pubDate>Mon, 30 Dec 2024 01:38:57 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/Hello-World-program-cannot-run-on-cluster/m-p/1652985#M12019</guid>
      <dc:creator>WangWJ</dc:creator>
      <dc:date>2024-12-30T01:38:57Z</dc:date>
    </item>
    <item>
      <title>Re: Hello World program cannot run on cluster</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/Hello-World-program-cannot-run-on-cluster/m-p/1653726#M12020</link>
      <description>&lt;P&gt;&lt;a href="https://community.intel.com/t5/user/viewprofilepage/user-id/401694"&gt;@WangWJ&lt;/a&gt;&amp;nbsp;&lt;BR /&gt;Can you provide more details on your environment, such as the OS, hardware, and software?&lt;/P&gt;</description>
      <pubDate>Thu, 02 Jan 2025 11:49:47 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/Hello-World-program-cannot-run-on-cluster/m-p/1653726#M12020</guid>
      <dc:creator>TobiasK</dc:creator>
      <dc:date>2025-01-02T11:49:47Z</dc:date>
    </item>
    <item>
      <title>Re: Hello World program cannot run on cluster</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/Hello-World-program-cannot-run-on-cluster/m-p/1654523#M12024</link>
      <description>&lt;P&gt;host1:&lt;BR /&gt;OS: CentOS Linux release 7.5.1804&lt;BR /&gt;CPU: Intel Xeon Gold 6248R CPU @ 3.00GHz, 32 cores&lt;BR /&gt;GCC: 4.8.5&lt;BR /&gt;Intel MPI: Version 2021.14 Build 20241121&lt;/P&gt;&lt;P&gt;host2:&lt;BR /&gt;OS: Ubuntu 20.04.6 LTS&lt;BR /&gt;CPU: Intel Xeon Gold 6248R CPU @ 3.00GHz, 32 cores&lt;BR /&gt;GCC: 9.4.0&lt;BR /&gt;Intel MPI: Version 2021.14 Build 20241121&lt;/P&gt;&lt;P&gt;Both of them are Huawei Cloud (HuaWeiYun) servers.&lt;/P&gt;</description>
      <pubDate>Mon, 06 Jan 2025 03:40:50 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/Hello-World-program-cannot-run-on-cluster/m-p/1654523#M12024</guid>
      <dc:creator>WangWJ</dc:creator>
      <dc:date>2025-01-06T03:40:50Z</dc:date>
    </item>
    <item>
      <title>Re: Hello World program cannot run on cluster</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/Hello-World-program-cannot-run-on-cluster/m-p/1655081#M12027</link>
      <description>&lt;P&gt;&lt;a href="https://community.intel.com/t5/user/viewprofilepage/user-id/401694"&gt;@WangWJ&lt;/a&gt;&amp;nbsp;CentOS 7.5 is no longer supported. Additionally, please use the same OS/SW stack on all nodes.&lt;/P&gt;</description>
      <pubDate>Tue, 07 Jan 2025 11:15:13 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/Hello-World-program-cannot-run-on-cluster/m-p/1655081#M12027</guid>
      <dc:creator>TobiasK</dc:creator>
      <dc:date>2025-01-07T11:15:13Z</dc:date>
    </item>
    <item>
      <title>Re: Hello World program cannot run on cluster</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/Hello-World-program-cannot-run-on-cluster/m-p/1655118#M12028</link>
      <description>&lt;P&gt;Make sure that the hostnames host1 and host2 are correctly configured in your /etc/hosts file or in DNS, and that the two nodes can reach each other.&lt;/P&gt;&lt;P&gt;Verify that you can SSH from one node to the other (host1 to host2, and vice versa) without being prompted for a password. If passwordless SSH isn't set up, MPI won't be able to launch processes across nodes.&lt;/P&gt;&lt;P&gt;The error mentions "OFI" (OpenFabrics Interfaces, the libfabric network layer that Intel MPI uses). Ensure that your cluster nodes have a proper network configuration and can communicate over the intended interfaces.&lt;/P&gt;&lt;P&gt;The error could also be related to a mismatch in MPI versions or configuration. Ensure both nodes are using the same MPI library and version.&lt;/P&gt;&lt;P&gt;Your mpirun command looks fine, but you can try a machine file instead to rule out a syntax issue:&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;mpirun -n 4 -machinefile hosts.txt ./a.out&lt;/LI-CODE&gt;&lt;P&gt;&lt;BR /&gt;In hosts.txt, list each host with its number of processes, like:&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;host1:2
host2:2&lt;/LI-CODE&gt;&lt;P&gt;&lt;BR /&gt;Double-check your environment variables (I_MPI_DEBUG, etc.) for any misconfiguration, as they can cause initialization errors.&lt;/P&gt;&lt;P&gt;Try these steps. If it still doesn't work, more details about your network setup or MPI installation might help narrow down the issue.&lt;/P&gt;</description>
      <pubDate>Tue, 07 Jan 2025 15:10:15 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/Hello-World-program-cannot-run-on-cluster/m-p/1655118#M12028</guid>
      <dc:creator>dusktilldawn</dc:creator>
      <dc:date>2025-01-07T15:10:15Z</dc:date>
    </item>
    <item>
      <title>Re: Hello World program cannot run on cluster</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/Hello-World-program-cannot-run-on-cluster/m-p/1656354#M12033</link>
      <description>&lt;P&gt;Thanks. This helped me a lot. I changed the OS of all nodes to Ubuntu 20.04 and set FI_PROVIDER=tcp. It now runs well.&lt;/P&gt;</description>
      <pubDate>Mon, 13 Jan 2025 01:47:42 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/Hello-World-program-cannot-run-on-cluster/m-p/1656354#M12033</guid>
      <dc:creator>WangWJ</dc:creator>
      <dc:date>2025-01-13T01:47:42Z</dc:date>
    </item>
  </channel>
</rss>

