<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Errors in TCP libfabric for Intel(R) Xeon(R) Platinum 8259CL in Intel® MPI Library</title>
    <link>https://community.intel.com/t5/Intel-MPI-Library/Errors-in-TCP-libfabric-for-Intel-R-Xeon-R-Platinum-8259CL/m-p/1611809#M11780</link>
    <description>&lt;P&gt;&lt;a href="https://community.intel.com/t5/user/viewprofilepage/user-id/364734"&gt;@Green_James&lt;/a&gt;&amp;nbsp;&lt;BR /&gt;Please use the latest release, 2021.13.&lt;BR /&gt;&lt;BR /&gt;If you still encounter the error there, please provide a small, simple reproducer so that we can take a look at it. If the supercomputing center you are using has a valid support contract, please use the priority support channel for your request; that gives us more means to help you.&lt;/P&gt;</description>
    <pubDate>Wed, 03 Jul 2024 08:59:14 GMT</pubDate>
    <dc:creator>TobiasK</dc:creator>
    <dc:date>2024-07-03T08:59:14Z</dc:date>
    <item>
      <title>Errors in TCP libfabric for Intel(R) Xeon(R) Platinum 8259CL</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/Errors-in-TCP-libfabric-for-Intel-R-Xeon-R-Platinum-8259CL/m-p/1608864#M11759</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;I have been testing an electronic structure code on a supercomputer with Intel(R) Xeon(R) Platinum 8259CL processors and an Ethernet interconnect.&lt;/P&gt;&lt;P&gt;I have seen failures in multi-node calculations, which I believe are due to the interconnect/libfabric: we have seen similar failures on other architectures and interconnects (e.g. EFA, Mellanox), and those could be resolved by an appropriate choice of tuning file; see e.g. the post &lt;A href="https://community.intel.com/t5/Intel-MPI-Library/Differences-for-MPI-tuning-binaries-on-AMD-EPYC-773/m-p/1594821?profile.language=en" target="_self"&gt;here&lt;/A&gt;.&lt;/P&gt;&lt;P&gt;However, for the Ethernet/TCP libfabric provider, no choice of tuning file seems to remedy the situation.&lt;/P&gt;&lt;P&gt;The MPI debug output for the default choice of tuning file is:&lt;/P&gt;&lt;LI-CODE lang="fortran"&gt;[0] MPI startup(): Intel(R) MPI Library, Version 2021.12 Build 20240213 (id: 4f55822)
[0] MPI startup(): Copyright (C) 2003-2024 Intel Corporation. All rights reserved.
[0] MPI startup(): library kind: release
[0] MPI startup(): libfabric loaded: libfabric.so.1
[0] MPI startup(): libfabric version: 1.18.1-impi
[0] MPI startup(): max number of MPI_Request per vci: 67108864 (pools: 1)
[0] MPI startup(): libfabric provider: tcp
[48] MPI startup(): shm segment size (118 MB per rank) * (48 local ranks) = 5674 MB total
[0] MPI startup(): shm segment size (118 MB per rank) * (48 local ranks) = 5674 MB total
[0] MPI startup(): Load tuning file: "/work/shared/intel/mpi/2021.12/opt/mpi/etc/tuning_skx_shm-ofi_tcp.dat"
[0] MPI startup(): threading: mode: direct
[0] MPI startup(): threading: vcis: 1
[0] MPI startup(): threading: app_threads: -1
[0] MPI startup(): threading: runtime: generic
[0] MPI startup(): threading: progress_threads: 0
[0] MPI startup(): threading: async_progress: 0
[0] MPI startup(): threading: lock_level: global
[0] MPI startup(): tag bits available: 19 (TAG_UB value: 524287)
[0] MPI startup(): source bits available: 20 (Maximal number of rank: 1048575)
[0] MPI startup(): Number of NICs: 1 &lt;/LI-CODE&gt;&lt;P&gt;Does anyone know what may be causing the issue, or have suggestions for anything else to try?&lt;/P&gt;&lt;P&gt;Thank you&lt;/P&gt;</description>
      <pubDate>Fri, 21 Jun 2024 14:37:36 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/Errors-in-TCP-libfabric-for-Intel-R-Xeon-R-Platinum-8259CL/m-p/1608864#M11759</guid>
      <dc:creator>Green_James</dc:creator>
      <dc:date>2024-06-21T14:37:36Z</dc:date>
    </item>
    <item>
      <title>Re: Errors in TCP libfabric for Intel(R) Xeon(R) Platinum 8259CL</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/Errors-in-TCP-libfabric-for-Intel-R-Xeon-R-Platinum-8259CL/m-p/1611809#M11780</link>
      <description>&lt;P&gt;&lt;a href="https://community.intel.com/t5/user/viewprofilepage/user-id/364734"&gt;@Green_James&lt;/a&gt;&amp;nbsp;&lt;BR /&gt;Please use the latest release, 2021.13.&lt;BR /&gt;&lt;BR /&gt;If you still encounter the error there, please provide a small, simple reproducer so that we can take a look at it. If the supercomputing center you are using has a valid support contract, please use the priority support channel for your request; that gives us more means to help you.&lt;/P&gt;</description>
      <pubDate>Wed, 03 Jul 2024 08:59:14 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/Errors-in-TCP-libfabric-for-Intel-R-Xeon-R-Platinum-8259CL/m-p/1611809#M11780</guid>
      <dc:creator>TobiasK</dc:creator>
      <dc:date>2024-07-03T08:59:14Z</dc:date>
    </item>
  </channel>
</rss>

