<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re:Intel MPI_Alltoallw Poor Performance in Intel® MPI Library</title>
    <link>https://community.intel.com/t5/Intel-MPI-Library/Intel-MPI-Alltoallw-Poor-Performance/m-p/1548412#M11192</link>
    <description>&lt;P&gt;Closing this case due to inactivity. This issue is assumed to be resolved and we will no longer respond to this thread.&amp;nbsp;If you require additional assistance from Intel, please start a new thread.&amp;nbsp;Any further interaction in this thread will be considered community only.&lt;/P&gt;&lt;P&gt; &lt;/P&gt;&lt;BR /&gt;</description>
    <pubDate>Tue, 28 Nov 2023 17:30:25 GMT</pubDate>
    <dc:creator>DrAmarpal_K_Intel</dc:creator>
    <dc:date>2023-11-28T17:30:25Z</dc:date>
    <item>
      <title>Intel MPI_Alltoallw Poor Performance</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/Intel-MPI-Alltoallw-Poor-Performance/m-p/1190685#M6900</link>
      <description>&lt;P&gt;Hello,&lt;/P&gt;
&lt;P&gt;We are developing a project that uses MPI for distributed execution. We need executables for both Windows and Linux. On Windows we were using Microsoft MPI and decided to switch to Intel's implementation. Unfortunately, we saw a performance drop in some specific cases. After some investigation we found that the problem lies in MPI_Alltoallw().&lt;/P&gt;
&lt;P&gt;After searching, I found that Intel's MPI_Alltoallw() is a naive Isend/Irecv implementation and, unlike the other collectives, has no tuning alternatives.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;To demonstrate our results, I created a C++ demo program using MPI_Alltoallw(). Every process holds a glob_rows * cols buffer and sends common_rows * cols to each of the others, so at the end every process has a filled common_rows * cols * comm_size buffer. Common_rows is computed using a block-cyclic distribution.&lt;/P&gt;
&lt;P&gt;I compiled and ran with both Intel MPI 2019.7.216 and Microsoft MPI. The execution times on an Intel Core i5-4460 are:&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;#pr&amp;nbsp; &amp;nbsp;msmpi&amp;nbsp; &amp;nbsp; &amp;nbsp;impi&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;2&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;1.69s&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;4.01s&lt;/SPAN&gt;&lt;BR /&gt;&lt;SPAN&gt;4&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;3.27s&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;7.15s&lt;/SPAN&gt;&lt;/P&gt;
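As a rough scale check (illustrative arithmetic, not from the original post), the demo's parameters imply each process ships hundreds of megabytes per MPI_Alltoallw call, so the collective is heavily bandwidth-bound:

```python
# Illustrative per-call data volume for the posted demo parameters:
# rows = 2**20, cols = 50, 8-byte doubles, 64-row block-cyclic split.
rows, cols, dbl, blk = 2 ** 20, 50, 8, 64

def bytes_sent_per_rank(comm_size):
    cm_rows = (rows // blk // comm_size) * blk   # rows sent to each peer
    return cm_rows * cols * dbl * (comm_size - 1)

# with 2 ranks each rank sends about 200 MiB per call; with 4, about 300 MiB
for n in (2, 4):
    print(n, bytes_sent_per_rank(n))
```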
&lt;P&gt;&lt;SPAN&gt;I know that this specific demo could be written with alltoallv or maybe even alltoall. The problem is that we use alltoallw a lot and its performance is really important to us.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;We didn't expect Intel's MPI to be slower than MS-MPI. Do you have any tips? Is there any chance the MPI developers could improve Alltoallw()?&lt;BR /&gt;&lt;BR /&gt;Thank you in advance!&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;For some reason I cannot upload my cpp file, so here is the code:&lt;/SPAN&gt;&lt;/P&gt;
&lt;LI-CODE lang="cpp"&gt;#include &amp;lt;iostream&amp;gt;
#include &amp;lt;random&amp;gt;
#include &amp;lt;mpi.h&amp;gt;
#include &amp;lt;memory&amp;gt;
#include &amp;lt;algorithm&amp;gt;


int main(int argc, char* argv[]){
  MPI_Init(&amp;amp;argc,&amp;amp;argv);

  int comm_size, comm_rank;
  MPI_Comm_size(MPI_COMM_WORLD,&amp;amp;comm_size);
  MPI_Comm_rank(MPI_COMM_WORLD,&amp;amp;comm_rank);

  // allocate initial buffer and fill it with random doubles
  int rows = 1 &amp;lt;&amp;lt; 20;
  int cols = 50;
  size_t size = (size_t)rows * (size_t)cols;
  auto val = std::make_unique&amp;lt;double[]&amp;gt;(size);

  std::uniform_real_distribution&amp;lt;double&amp;gt; unif;
  std::default_random_engine re;
  std::generate(val.get(), val.get()+size, [&amp;amp;](){return unif(re);});

  // calculate common rows that each processor will have
  int total_blks = rows / 64;
  int cm_blks = total_blks / comm_size;
  int cm_rows = cm_blks * 64;

  // final buffer for each processor. Each processor will receive cm_rows * cols
  int cols_f = cols * comm_size;
  auto b_val = std::make_unique&amp;lt;double[]&amp;gt;((size_t)cm_rows * cols_f);

  // Create datatypes
  MPI_Datatype scol,scol_res,sblock,rblock;
  MPI_Type_vector(cm_blks,64,64*comm_size,MPI_DOUBLE,&amp;amp;scol);
  MPI_Type_create_resized(scol,0,rows*sizeof(double),&amp;amp;scol_res);
  MPI_Type_contiguous(cols,scol_res,&amp;amp;sblock);
  MPI_Type_contiguous(cm_rows*cols,MPI_DOUBLE,&amp;amp;rblock);
  MPI_Type_commit(&amp;amp;sblock);
  MPI_Type_commit(&amp;amp;rblock);

  std::vector&amp;lt;int&amp;gt; scounts(comm_size,1);
  std::vector&amp;lt;int&amp;gt; rcounts(comm_size,1);
  std::vector&amp;lt;int&amp;gt; sdispls(comm_size);
  std::vector&amp;lt;int&amp;gt; rdispls(comm_size);
  std::vector&amp;lt;MPI_Datatype&amp;gt; stypes(comm_size);
  std::vector&amp;lt;MPI_Datatype&amp;gt; rtypes(comm_size);

  for (int i=0;i&amp;lt;comm_size;i++){
    sdispls[i] = 64*i*sizeof(double);
    rdispls[i] = cm_rows*cols*i*sizeof(double);
    stypes[i] = sblock;
    rtypes[i] = rblock;
  }

  MPI_Barrier(MPI_COMM_WORLD);
  double str = MPI_Wtime();
  for (int i=0;i&amp;lt;10;i++) MPI_Alltoallw(val.get(),scounts.data(),sdispls.data(),stypes.data(),b_val.get(),rcounts.data(),rdispls.data(),rtypes.data(),MPI_COMM_WORLD);
  MPI_Barrier(MPI_COMM_WORLD);
  if (!comm_rank) printf("Time: %lf seconds\n",MPI_Wtime()-str);

  MPI_Type_free(&amp;amp;scol);
  MPI_Type_free(&amp;amp;scol_res);
  MPI_Type_free(&amp;amp;sblock);
  MPI_Type_free(&amp;amp;rblock);
  MPI_Finalize();
  return 0;
}&lt;/LI-CODE&gt;
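For anyone checking the datatype construction above, here is a small sketch (hypothetical, not part of the original post) that enumerates the source elements the send datatype selects for peer p and verifies that the peers' selections tile the whole buffer:

```python
# Hypothetical sketch: indices picked out by the send datatype for peer p,
# mirroring MPI_Type_vector(cm_blks, 64, 64*comm_size, MPI_DOUBLE) resized
# to an extent of `rows` doubles and repeated `cols` times, with
# sdispls[p] = p * 64 doubles.
def send_indices(rows, cols, comm_size, p, blk=64):
    cm_blks = rows // blk // comm_size
    idx = []
    for c in range(cols):                    # one resized vector per column
        base = c * rows + p * blk            # column start plus sdispls[p]
        for b in range(cm_blks):             # cm_blks strided 64-row blocks
            start = base + b * blk * comm_size
            idx.extend(range(start, start + blk))
    return idx

# with a toy size, the per-peer selections are disjoint and cover the buffer
parts = [send_indices(256, 2, 2, p) for p in (0, 1)]
assert sorted(parts[0] + parts[1]) == list(range(256 * 2))
```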
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 08 Jul 2020 15:12:30 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/Intel-MPI-Alltoallw-Poor-Performance/m-p/1190685#M6900</guid>
      <dc:creator>Michailpg</dc:creator>
      <dc:date>2020-07-08T15:12:30Z</dc:date>
    </item>
    <item>
      <title>Re:Intel MPI_Alltoallw Poor Performance</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/Intel-MPI-Alltoallw-Poor-Performance/m-p/1190994#M6907</link>
      <description>&lt;P&gt;Hi Michail,&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;Thanks for connecting to us.&lt;/P&gt;&lt;P&gt;Yes, the AlltoAllw has only a Isend/Irecv waitall implementation in IMPI.&lt;/P&gt;&lt;P&gt;We also observed similar timings for the given program as you have reported for IMPI.&lt;/P&gt;&lt;P&gt;We are forwarding your query to the concerned team and will get back to you at earliest.&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;Regards&lt;/P&gt;&lt;P&gt;Prasanth&lt;/P&gt;&lt;BR /&gt;</description>
      <pubDate>Thu, 09 Jul 2020 11:59:14 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/Intel-MPI-Alltoallw-Poor-Performance/m-p/1190994#M6907</guid>
      <dc:creator>PrasanthD_intel</dc:creator>
      <dc:date>2020-07-09T11:59:14Z</dc:date>
    </item>
    <item>
      <title>Re: Re:Intel MPI_Alltoallw Poor Performance</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/Intel-MPI-Alltoallw-Poor-Performance/m-p/1192140#M6922</link>
      <description>&lt;P&gt;Thank you!&lt;/P&gt;</description>
      <pubDate>Tue, 14 Jul 2020 07:36:36 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/Intel-MPI-Alltoallw-Poor-Performance/m-p/1192140#M6922</guid>
      <dc:creator>Michailpg</dc:creator>
      <dc:date>2020-07-14T07:36:36Z</dc:date>
    </item>
    <item>
      <title>Re:Intel MPI_Alltoallw Poor Performance</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/Intel-MPI-Alltoallw-Poor-Performance/m-p/1192205#M6924</link>
      <description>&lt;P&gt;Hi Michail,&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;On how many nodes do you observe this behavior?&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;Best regards,&lt;/P&gt;&lt;P&gt;Amar &lt;/P&gt;&lt;BR /&gt;</description>
      <pubDate>Tue, 14 Jul 2020 11:49:11 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/Intel-MPI-Alltoallw-Poor-Performance/m-p/1192205#M6924</guid>
      <dc:creator>DrAmarpal_K_Intel</dc:creator>
      <dc:date>2020-07-14T11:49:11Z</dc:date>
    </item>
    <item>
      <title>Re: Re:Intel MPI_Alltoallw Poor Performance</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/Intel-MPI-Alltoallw-Poor-Performance/m-p/1192234#M6927</link>
      <description>&lt;P&gt;Hi DrAmarpal,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;We are currently running in SMP mode on one node. Soon we are going to use 4-8 nodes at most.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Best regards,&lt;/P&gt;
&lt;P&gt;Michail&lt;/P&gt;</description>
      <pubDate>Tue, 14 Jul 2020 14:27:08 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/Intel-MPI-Alltoallw-Poor-Performance/m-p/1192234#M6927</guid>
      <dc:creator>Michailpg</dc:creator>
      <dc:date>2020-07-14T14:27:08Z</dc:date>
    </item>
    <item>
      <title>Re:Intel MPI_Alltoallw Poor Performance</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/Intel-MPI-Alltoallw-Poor-Performance/m-p/1192884#M6936</link>
      <description>&lt;P&gt;Hi Michail,&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;Thanks for confirming. Please hold on for a solution on this.&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;Best regards,&lt;/P&gt;&lt;P&gt;Amar  &lt;/P&gt;&lt;BR /&gt;</description>
      <pubDate>Thu, 16 Jul 2020 10:50:19 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/Intel-MPI-Alltoallw-Poor-Performance/m-p/1192884#M6936</guid>
      <dc:creator>DrAmarpal_K_Intel</dc:creator>
      <dc:date>2020-07-16T10:50:19Z</dc:date>
    </item>
    <item>
      <title>Re:Intel MPI_Alltoallw Poor Performance</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/Intel-MPI-Alltoallw-Poor-Performance/m-p/1199012#M7001</link>
      <description>&lt;P&gt;Hi Michail,&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;Could you please rerun your experiments with Intel MPI Library 2019 U8 that was recently released? With this version please set &lt;SPAN style="font-family: &amp;quot;Segoe UI&amp;quot;, sans-serif; font-size: 10pt;"&gt;FI_PROVIDER=netdir and report your findings.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-family: &amp;quot;Segoe UI&amp;quot;, sans-serif; font-size: 10pt;"&gt;Best regards,&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-family: &amp;quot;Segoe UI&amp;quot;, sans-serif; font-size: 10pt;"&gt;Amar &lt;/SPAN&gt;&lt;/P&gt;&lt;BR /&gt;</description>
      <pubDate>Mon, 10 Aug 2020 12:58:07 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/Intel-MPI-Alltoallw-Poor-Performance/m-p/1199012#M7001</guid>
      <dc:creator>DrAmarpal_K_Intel</dc:creator>
      <dc:date>2020-08-10T12:58:07Z</dc:date>
    </item>
    <item>
      <title>Re: Re:Intel MPI_Alltoallw Poor Performance</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/Intel-MPI-Alltoallw-Poor-Performance/m-p/1201235#M7031</link>
      <description>&lt;P&gt;Hi DrAmarpal,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I downloaded&amp;nbsp;&lt;SPAN&gt;Intel MPI Library 2019 U8 and compiled my code with it. Using&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;LI-CODE lang="none"&gt;I_MPI_FABRICS=ofi
FI_PROVIDER=netdir&lt;/LI-CODE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;I get an error that says:&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;LI-CODE lang="none"&gt;[0] MPI startup(): Intel(R) MPI Library, Version 2019 Update 8  Build 20200624
[0] MPI startup(): Copyright (C) 2003-2020 Intel Corporation.  All rights reserved.
[0] MPI startup(): library kind: release
[0] MPI startup(): libfabric version: 1.10.1a1-impi
Abort(1091215) on node 1 (rank 1 in comm 0): Fatal error in PMPI_Init: Other MPI error, error stack:
MPIR_Init_thread(136)........:
MPID_Init(1138)..............:
MPIDI_OFI_mpi_init_hook(1061): OFI addrinfo() failed (netmod\ofi\ofi_init.c:1061:MPIDI_OFI_mpi_init_hook:Unknown error)&lt;/LI-CODE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;If I use&amp;nbsp;&lt;EM&gt;FI_PROVIDER=tcp&lt;/EM&gt;, it works but the execution time is still large.&amp;nbsp; With&amp;nbsp;&lt;EM&gt;FI_PROVIDER=shm&lt;/EM&gt; the execution time is noticeably better, but still not what we want compared to MS-MPI.&lt;/P&gt;
&lt;P&gt;I am running on a single node. Were the changes focused on inter-node communication?&lt;/P&gt;
&lt;P&gt;Best Regards,&lt;/P&gt;
&lt;P&gt;Jason&lt;/P&gt;</description>
      <pubDate>Tue, 18 Aug 2020 12:13:42 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/Intel-MPI-Alltoallw-Poor-Performance/m-p/1201235#M7031</guid>
      <dc:creator>Michailpg</dc:creator>
      <dc:date>2020-08-18T12:13:42Z</dc:date>
    </item>
    <item>
      <title>Re:Intel MPI_Alltoallw Poor Performance</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/Intel-MPI-Alltoallw-Poor-Performance/m-p/1201538#M7033</link>
      <description>&lt;P&gt;Hi Jason,&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;Thanks for reporting your findings. To understand what the problem is, could you please source the debug version of the Intel MPI library by running,&lt;/P&gt;&lt;P&gt;mpivars.bat debug&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;and &lt;SPAN style="font-family: Calibri, sans-serif; font-size: 11pt;"&gt;set FI_LOG_LEVEL=debug&amp;nbsp; before running your test. Please share the additional output that gets generated during this run.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;Best regards,&lt;/P&gt;&lt;P&gt;Amar&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-family: Calibri, sans-serif; font-size: 11pt;"&gt; &lt;/SPAN&gt;&lt;/P&gt;&lt;BR /&gt;</description>
      <pubDate>Wed, 19 Aug 2020 06:50:07 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/Intel-MPI-Alltoallw-Poor-Performance/m-p/1201538#M7033</guid>
      <dc:creator>DrAmarpal_K_Intel</dc:creator>
      <dc:date>2020-08-19T06:50:07Z</dc:date>
    </item>
    <item>
      <title>Re: Re:Intel MPI_Alltoallw Poor Performance</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/Intel-MPI-Alltoallw-Poor-Performance/m-p/1201730#M7044</link>
      <description>&lt;P&gt;Dear Amar,&lt;/P&gt;
&lt;P&gt;I followed your instructions. Running with 2 or 4 processors, the output is pretty short.&lt;/P&gt;
&lt;LI-CODE lang="none"&gt;Abort(1091215) on node 1 (rank 1 in comm 0): Fatal error in PMPI_Init: Other MPI error, error stack:
MPIR_Init_thread(136)........:
MPID_Init(1138)..............:
MPIDI_OFI_mpi_init_hook(1061): OFI addrinfo() failed (netmod\ofi\ofi_init.c:1061:MPIDI_OFI_mpi_init_hook:Unknown error)&lt;/LI-CODE&gt;
&lt;P&gt;Using 1 processor I get the following,&lt;/P&gt;
&lt;LI-CODE lang="none"&gt;Abort(1091215) on node 0 (rank 0 in comm 0): Fatal error in PMPI_Init: Other MPI error, error stack:
MPIR_Init_thread(136)........:
MPID_Init(1138)..............:
MPIDI_OFI_mpi_init_hook(1061): OFI addrinfo() failed (netmod\ofi\ofi_init.c:1061:MPIDI_OFI_mpi_init_hook:Unknown error)
libfabric:476:core:mr:ofi_default_cache_size():56&amp;lt;info&amp;gt; default cache size=0
libfabric:476:netdir:core:ofi_nd_startup():602&amp;lt;info&amp;gt; ofi_nd_startup: starting initialization
libfabric:476:core:core:ofi_register_provider():418&amp;lt;info&amp;gt; registering provider: netdir (110.10)
libfabric:476:core:core:ofi_register_provider():418&amp;lt;info&amp;gt; registering provider: ofi_rxm (110.10)
libfabric:476:core:core:ofi_register_provider():418&amp;lt;info&amp;gt; registering provider: sockets (110.10)
libfabric:476:core:core:ofi_register_provider():446&amp;lt;info&amp;gt; "sockets" filtered by provider include/exclude list, skipping
libfabric:476:core:core:ofi_register_provider():418&amp;lt;info&amp;gt; registering provider: tcp (110.10)
libfabric:476:core:core:ofi_register_provider():446&amp;lt;info&amp;gt; "tcp" filtered by provider include/exclude list, skipping
libfabric:476:core:core:ofi_register_provider():418&amp;lt;info&amp;gt; registering provider: ofi_hook_perf (110.10)
libfabric:476:core:core:ofi_register_provider():418&amp;lt;info&amp;gt; registering provider: ofi_hook_noop (110.10)
libfabric:476:core:core:fi_getinfo():1066&amp;lt;info&amp;gt; Found provider with the highest priority netdir, must_use_util_prov = 1
libfabric:476:core:core:fi_getinfo():1066&amp;lt;info&amp;gt; Found provider with the highest priority netdir, must_use_util_prov = 1
libfabric:476:core:core:ofi_layering_ok():915&amp;lt;info&amp;gt; Need core provider, skipping ofi_rxm
libfabric:476:core:core:fi_getinfo():1129&amp;lt;info&amp;gt; Now it is being used by netdir provider
libfabric:476:core:core:fi_getinfo():1066&amp;lt;info&amp;gt; Found provider with the highest priority netdir, must_use_util_prov = 1
libfabric:476:core:core:ofi_layering_ok():915&amp;lt;info&amp;gt; Need core provider, skipping ofi_rxm
libfabric:476:core:core:fi_getinfo():1129&amp;lt;info&amp;gt; Now it is being used by netdir provider
libfabric:476:core:core:fi_getinfo():1066&amp;lt;info&amp;gt; Found provider with the highest priority netdir, must_use_util_prov = 1
libfabric:476:core:core:ofi_layering_ok():915&amp;lt;info&amp;gt; Need core provider, skipping ofi_rxm
libfabric:476:core:core:fi_getinfo():1129&amp;lt;info&amp;gt; Now it is being used by netdir provider
libfabric:476:core:core:fi_getinfo():1066&amp;lt;info&amp;gt; Found provider with the highest priority netdir, must_use_util_prov = 1
libfabric:476:core:core:ofi_layering_ok():915&amp;lt;info&amp;gt; Need core provider, skipping ofi_rxm
libfabric:476:core:core:fi_getinfo():1129&amp;lt;info&amp;gt; Now it is being used by netdir provider
libfabric:476:core:core:fi_getinfo():1129&amp;lt;info&amp;gt; Now it is being used by ofi_rxm provider
libfabric:476:core:core:fi_getinfo():1129&amp;lt;info&amp;gt; Now it is being used by netdir provider&lt;/LI-CODE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Using 4 processors and TCP I get,&lt;/P&gt;
&lt;LI-CODE lang="none"&gt;libfabric:7532:core:mr:ofi_default_cache_size():56&amp;lt;info&amp;gt; default cache size=0
libfabric:7532:netdir:core:ofi_nd_startup():602&amp;lt;info&amp;gt; ofi_nd_startup: starting initialization
libfabric:7532:core:core:ofi_register_provider():418&amp;lt;info&amp;gt; registering provider: netdir (110.10)
libfabric:7532:core:core:ofi_register_provider():446&amp;lt;info&amp;gt; "netdir" filtered by provider include/exclude list, skipping
libfabric:7532:core:core:ofi_register_provider():418&amp;lt;info&amp;gt; registering provider: ofi_rxm (110.10)
libfabric:7532:core:core:ofi_register_provider():418&amp;lt;info&amp;gt; registering provider: sockets (110.10)
libfabric:7532:core:core:ofi_register_provider():446&amp;lt;info&amp;gt; "sockets" filtered by provider include/exclude list, skipping
libfabric:7532:core:core:ofi_register_provider():418&amp;lt;info&amp;gt; registering provider: tcp (110.10)
libfabric:7532:core:core:ofi_register_provider():418&amp;lt;info&amp;gt; registering provider: ofi_hook_perf (110.10)
libfabric:7532:core:core:ofi_register_provider():418&amp;lt;info&amp;gt; registering provider: ofi_hook_noop (110.10)
libfabric:7532:core:core:fi_getinfo():1066&amp;lt;info&amp;gt; Found provider with the highest priority tcp, must_use_util_prov = 1
libfabric:7532:core:core:fi_getinfo():1066&amp;lt;info&amp;gt; Found provider with the highest priority tcp, must_use_util_prov = 1
libfabric:7532:core:core:ofi_layering_ok():915&amp;lt;info&amp;gt; Need core provider, skipping ofi_rxm
libfabric:7532:tcp:core:ofi_get_list_of_addr():1255&amp;lt;info&amp;gt; Available addr: 10.0.2.15, iface name: eth1, speed: 1000000000
libfabric:7532:tcp:core:ofi_get_list_of_addr():1255&amp;lt;info&amp;gt; Available addr: fe80::6166:bdf8:dbc8:9a1, iface name: eth0, speed: 1000000000
libfabric:7532:tcp:core:ofi_insert_loopback_addr():1100&amp;lt;info&amp;gt; available addr: : fi_sockaddr_in://127.0.0.1:0
libfabric:7532:tcp:core:ofi_insert_loopback_addr():1114&amp;lt;info&amp;gt; available addr: : fi_sockaddr_in6://[::1]:0
libfabric:7532:tcp:core:util_getinfo_ifs():312&amp;lt;info&amp;gt; Chosen addr for using: 10.0.2.15, speed 1000000000
libfabric:7532:core:core:fi_getinfo():1066&amp;lt;info&amp;gt; Found provider with the highest priority tcp, must_use_util_prov = 1
libfabric:7532:core:core:ofi_layering_ok():915&amp;lt;info&amp;gt; Need core provider, skipping ofi_rxm
libfabric:7532:tcp:core:ofi_get_list_of_addr():1255&amp;lt;info&amp;gt; Available addr: 10.0.2.15, iface name: eth1, speed: 1000000000
libfabric:7532:tcp:core:ofi_get_list_of_addr():1255&amp;lt;info&amp;gt; Available addr: fe80::6166:bdf8:dbc8:9a1, iface name: eth0, speed: 1000000000
libfabric:7532:tcp:core:ofi_insert_loopback_addr():1100&amp;lt;info&amp;gt; available addr: : fi_sockaddr_in://127.0.0.1:0
libfabric:7532:tcp:core:ofi_insert_loopback_addr():1114&amp;lt;info&amp;gt; available addr: : fi_sockaddr_in6://[::1]:0
libfabric:7532:tcp:core:util_getinfo_ifs():312&amp;lt;info&amp;gt; Chosen addr for using: 10.0.2.15, speed 1000000000
libfabric:7532:core:core:fi_getinfo():1066&amp;lt;info&amp;gt; Found provider with the highest priority tcp, must_use_util_prov = 1
libfabric:7532:core:core:ofi_layering_ok():915&amp;lt;info&amp;gt; Need core provider, skipping ofi_rxm
libfabric:7532:tcp:core:ofi_get_list_of_addr():1255&amp;lt;info&amp;gt; Available addr: 10.0.2.15, iface name: eth1, speed: 1000000000
libfabric:7532:tcp:core:ofi_get_list_of_addr():1255&amp;lt;info&amp;gt; Available addr: fe80::6166:bdf8:dbc8:9a1, iface name: eth0, speed: 1000000000
libfabric:7532:tcp:core:ofi_insert_loopback_addr():1100&amp;lt;info&amp;gt; available addr: : fi_sockaddr_in://127.0.0.1:0
libfabric:7532:tcp:core:ofi_insert_loopback_addr():1114&amp;lt;info&amp;gt; available addr: : fi_sockaddr_in6://[::1]:0
libfabric:7532:tcp:core:util_getinfo_ifs():312&amp;lt;info&amp;gt; Chosen addr for using: 10.0.2.15, speed 1000000000
libfabric:7532:core:core:fi_getinfo():1066&amp;lt;info&amp;gt; Found provider with the highest priority tcp, must_use_util_prov = 1
libfabric:7532:core:core:ofi_layering_ok():915&amp;lt;info&amp;gt; Need core provider, skipping ofi_rxm
libfabric:7532:tcp:core:ofi_get_list_of_addr():1255&amp;lt;info&amp;gt; Available addr: 10.0.2.15, iface name: eth1, speed: 1000000000
libfabric:7532:tcp:core:ofi_get_list_of_addr():1255&amp;lt;info&amp;gt; Available addr: fe80::6166:bdf8:dbc8:9a1, iface name: eth0, speed: 1000000000
libfabric:7532:tcp:core:ofi_insert_loopback_addr():1100&amp;lt;info&amp;gt; available addr: : fi_sockaddr_in://127.0.0.1:0
libfabric:7532:tcp:core:ofi_insert_loopback_libfabric:2492:core:mr:ofi_default_cache_size():56&amp;lt;info&amp;gt; default cache size=0
libfabric:2492:netdir:core:ofi_nd_startup():602&amp;lt;info&amp;gt; ofi_nd_startup: starting initialization
libfabric:2492:core:core:ofi_register_provider():418&amp;lt;info&amp;gt; registering provider: netdir (110.10)
libfabric:2492:core:core:ofi_register_provider():446&amp;lt;info&amp;gt; "netdir" filtered by provider include/exclude list, skipping
libfabric:2492:core:core:ofi_register_provider():418&amp;lt;info&amp;gt; registering provider: ofi_rxm (110.10)
libfabric:2492:core:core:ofi_register_provider():418&amp;lt;info&amp;gt; registering provider: sockets (110.10)
libfabric:2492:core:core:ofi_register_provider():446&amp;lt;info&amp;gt; "sockets" filtered by provider include/exclude list, skipping
libfabric:2492:core:core:ofi_register_provider():418&amp;lt;info&amp;gt; registering provider: tcp (110.10)
libfabric:2492:core:core:ofi_register_provider():418&amp;lt;info&amp;gt; registering provider: ofi_hook_perf (110.10)
libfabric:2492:core:core:ofi_register_provider():418&amp;lt;info&amp;gt; registering provider: ofi_hook_noop (110.10)
libfabric:2492:core:core:fi_getinfo():1066&amp;lt;info&amp;gt; Found provider with the highest priority tcp, must_use_util_prov = 1
libfabric:2492:core:core:fi_getinfo():1066&amp;lt;info&amp;gt; Found provider with the highest priority tcp, must_use_util_prov = 1
libfabric:2492:core:core:ofi_layering_ok():915&amp;lt;info&amp;gt; Need core provider, skipping ofi_rxm
libfabric:2492:tcp:core:ofi_get_list_of_addr():1255&amp;lt;info&amp;gt; Available addr: 10.0.2.15, iface name: eth1, speed: 1000000000
libfabric:2492:tcp:core:ofi_get_list_of_addr():1255&amp;lt;info&amp;gt; Available addr: fe80::6166:bdf8:dbc8:9a1, iface name: eth0, speed: 1000000000
libfabric:2492:tcp:core:ofi_insert_loopback_addr():1100&amp;lt;info&amp;gt; available addr: : fi_sockaddr_in://127.0.0.1:0
libfabric:2492:tcp:core:ofi_insert_loopback_addr():1114&amp;lt;info&amp;gt; available addr: : fi_sockaddr_in6://[::1]:0
libfabric:2492:tcp:core:util_getinfo_ifs():312&amp;lt;info&amp;gt; Chosen addr for using: 10.0.2.15, speed 1000000000
libfabric:2492:core:core:fi_getinfo():1066&amp;lt;info&amp;gt; Found provider with the highest priority tcp, must_use_util_prov = 1
libfabric:2492:core:core:ofi_layering_ok():915&amp;lt;info&amp;gt; Need core provider, skipping ofi_rxm
libfabric:2492:tcp:core:ofi_get_list_of_addr():1255&amp;lt;info&amp;gt; Available addr: 10.0.2.15, iface name: eth1, speed: 1000000000
libfabric:2492:tcp:core:ofi_get_list_of_addr():1255&amp;lt;info&amp;gt; Available addr: fe80::6166:bdf8:dbc8:9a1, iface name: eth0, speed: 1000000000
libfabric:2492:tcp:core:ofi_insert_loopback_addr():1100&amp;lt;info&amp;gt; available addr: : fi_sockaddr_in://127.0.0.1:0
libfabric:2492:tcp:core:ofi_insert_loopback_addr():1114&amp;lt;info&amp;gt; available addr: : fi_sockaddr_in6://[::1]:0
libfabric:2492:tcp:core:util_getinfo_ifs():312&amp;lt;info&amp;gt; Chosen addr for using: 10.0.2.15, speed 1000000000
libfabric:2492:core:core:fi_getinfo():1066&amp;lt;info&amp;gt; Found provider with the highest priority tcp, must_use_util_prov = 1
libfabric:2492:core:core:ofi_layering_ok():915&amp;lt;info&amp;gt; Need core provider, skipping ofi_rxm
libfabric:2492:tcp:core:ofi_get_list_of_addr():1255&amp;lt;info&amp;gt; Available addr: 10.0.2.15, iface name: eth1, speed: 1000000000
libfabric:2492:tcp:core:ofi_get_list_of_addr():1255&amp;lt;info&amp;gt; Available addr: fe80::6166:bdf8:dbc8:9a1, iface name: eth0, speed: 1000000000
libfabric:2492:tcp:core:ofi_insert_loopback_addr():1100&amp;lt;info&amp;gt; available addr: : fi_sockaddr_in://127.0.0.1:0
libfabric:2492:tcp:core:ofi_insert_loopback_addr():1114&amp;lt;info&amp;gt; available addr: : fi_sockaddr_in6://[::1]:0
libfabric:2492:tcp:core:util_getinfo_ifs():312&amp;lt;info&amp;gt; Chosen addr for using: 10.0.2.15, speed 1000000000
libfabric:2492:core:core:fi_getinfo():1066&amp;lt;info&amp;gt; Found provider with the highest priority tcp, must_use_util_prov = 1
libfabric:2492:core:core:ofi_layering_ok():915&amp;lt;info&amp;gt; Need core provider, skipping ofi_rxm
libfabric:2492:tcp:core:ofi_get_list_of_addr():1255&amp;lt;info&amp;gt; Available addr: 10.0.2.15, iface name: eth1, speed: 1000000000
libfabric:2492:tcp:core:ofi_get_list_of_addr():1255&amp;lt;info&amp;gt; Available addr: fe80::6166:bdf8:dbc8:9a1, iface name: eth0, speed: 1000000000
libfabric:2492:tcp:core:ofi_insert_loopback_addr():1100&amp;lt;info&amp;gt; available addr: : fi_sockaddr_in://127.0.0.1:0
libfabric:2492:tcp:core:ofi_insert_loopback_addr():1114&amp;lt;info&amp;gt; available addr: : fi_sockaddr_in6://[::1]:0
libfabric:2492:tcp:core:util_getinfo_ifs():312&amp;lt;info&amp;gt; Chosen addr for using: 10.0.2.15, speed 1000000000
libfabric:2492:tcp:core:ofi_get_list_of_addr():1255&amp;lt;info&amp;gt; Available addr: 10.0.2.15, iface name: eth1, speed: 1000000000
libfabric:2492:tcp:core:ofi_get_list_of_addr():1255&amp;lt;info&amp;gt; Available addr: fe80::6166:bdf8:dbc8:9a1, iface name: eth0, speed: 1000000000
libfabric:2492:tcp:core:ofi_insert_loopback_addr():1100&amp;lt;info&amp;gt; available addr: : fi_sockaddr_in://127.0.0.1:0
libfabric:2492:tcp:core:ofi_insert_loopback_addr():1114&amp;lt;info&amp;gt; available addr: : fi_sockaddr_in6://[::1]:0
libfabric:2492:tcp:core:util_getinfo_ifs():312&amp;lt;info&amp;gt; Chosen addr for using: 10.0.2.15, speed 1000000000
libfabric:2492:core:core:fi_getinfo():1066&amp;lt;info&amp;gt; Found provider with the highest priority tcp, must_use_util_prov = 1
libfabric:2492:core:core:fi_getinfo():1051&amp;lt;warn&amp;gt; Can't find provider with the highest priority
libfabric:2492:core:core:fi_getinfo():1066&amp;lt;info&amp;gt; Found provider with the highest priority tcp, must_use_util_prov = 1
libfabric:2492:tcp:core:ofi_get_list_of_addr():1255&amp;lt;info&amp;gt; Available addr: 10.0.2.15, iface name: eth1, speed: 1000000000
libfabric:2492:tcp:core:ofi_get_list_of_addr():1255&amp;lt;info&amp;gt; Available addr: fe80::6166:bdf8:dbc8:9a1, iface name: eth0, speed: 1000000000
libfabric:2492:tcp:core:ofi_insert_loopback_addr():1100&amp;lt;info&amp;gt; available addr: : fi_sockaddr_in://127.0.0.1:0
libfabric:2492:tcp:core:ofi_insert_loopback_addr():1114&amp;lt;info&amp;gt; available addr: : fi_sockaddr_in6://[::1]:0
libfabric:2492:tcp:core:util_getinfo_ifs():312&amp;lt;info&amp;gt; Chosen addr for using: 10.0.2.15, speed 1000000000
libfabric:2492:core:core:fi_getinfo():1066&amp;lt;info&amp;gt; Found provider with the highest priority tcp, must_use_util_prov = 1
libfabric:2492:core:core:ofi_layering_ok():915&amp;lt;info&amp;gt; Need core provider, skipping ofi_rxm
libfabric:2492:tcp:core:ofi_get_list_of_addr():1255&amp;lt;info&amp;gt; Available addr: 10.0.2.15, iface name: eth1, speed: 1000000000
libfabric:2492:tcp:core:ofi_get_list_of_addr():1255&amp;lt;info&amp;gt; Available addr: fe80::6166:bdf8:dbc8:9a1, iface name: eth0, speed: 1000000000
libfabric:2492:tcp:core:ofi_insert_loopback_addr():1100&amp;lt;info&amp;gt; available addr: : fi_sockaddr_in://127.0.0.1:0
libfabric:2492:tcp:core:ofi_insert_loopback_addr():1114&amp;lt;info&amp;gt; available addr: : fi_sockaddr_in6://[::1]:0
libfabric:2492:tcp:core:util_getinfo_ifs():312&amp;lt;info&amp;gt; Chosen addr for using: 10.0.2.15, speed 1000000000
libfabric:2492:core:core:fi_getinfo():1066&amp;lt;info&amp;gt; Found provider with the highest priority tcp, must_use_util_prov = 1
libfabric:2492:core:core:ofi_layering_ok():915&amp;lt;info&amp;gt; Need core provider, skipping ofi_rxm
libfabric:2492:tcp:core:ofi_get_list_of_addr():1255&amp;lt;info&amp;gt; Available addr: 10.0.2.15, iface name: eth1, speed: 1000000000
libfabric:2492:tcp:core:ofi_get_list_of_addr():1255&amp;lt;info&amp;gt; Available addr: fe80::6166:bdf8:dbc8:9a1, iface name: eth0, speed: 1000000000
libfabric:2492:tcp:core:ofi_insert_loopback_addr():1100&amp;lt;info&amp;gt; available addr: : fi_sockaddr_in://127.0.0.1:0
libfabric:2492:tcp:core:ofi_insert_loopback_addr():1114&amp;lt;info&amp;gt; available addr: : fi_sockaddr_in6://[::1]:0
libfabric:2492:tcp:core:util_getinfo_ifs():312&amp;lt;info&amp;gt; Chosen addr for using: 10.0.2.15, speed 1000000000
libfabric:2492:tcp:core:ofi_check_ep_type():654&amp;lt;info&amp;gt; unsupported endpoint type
libfabric:2492:tcp:core:ofi_check_ep_type():655&amp;lt;info&amp;gt; Supported: FI_EP_MSG
libfabric:2492:tcp:core:ofi_check_ep_type():655&amp;lt;info&amp;gt; Requested: FI_EP_RDM
libfabric:2492:core:core:fi_getinfo():1129&amp;lt;info&amp;gt; Now it is being used by tcp provider
libfabric:2492:core:core:fi_getinfo():1066&amp;lt;info&amp;gt; Found provider with the highest priority tcp, must_use_util_prov = 1
libfabric:2492:tcp:core:ofi_get_list_of_addr():1255&amp;lt;info&amp;gt; Available addr: 10.0.2.15, iface name: eth1, speed: 1000000000
libfabric:2492:tcp:core:ofi_get_list_of_addr():1255&amp;lt;info&amp;gt; Available addr: fe80::6166:bdf8:dbc8:9a1, iface name: eth0, speed: 1000000000
libfabric:2492:tcp:core:ofi_insert_loopback_addr():1100&amp;lt;info&amp;gt; available addr: : fi_sockaddr_in://127.0.0.1:0
libfabric:2492:tcp:core:ofi_insert_loopback_addr():1114&amp;lt;info&amp;gt; available addr: :addr():1114&amp;lt;info&amp;gt; available addr: : fi_sockaddr_in6://[::1]:0
libfabric:7532:tcp:core:util_getinfo_ifs():312&amp;lt;info&amp;gt; Chosen addr for using: 10.0.2.15, speed 1000000000
libfabric:7532:tcp:core:ofi_get_list_of_addr():1255&amp;lt;info&amp;gt; Available addr: 10.0.2.15, iface name: eth1, speed: 1000000000
libfabric:7532:tcp:core:ofi_get_list_of_addr():1255&amp;lt;info&amp;gt; Available addr: fe80::6166:bdf8:dbc8:9a1, iface name: eth0, speed: 1000000000
libfabric:7532:tcp:core:ofi_insert_loopback_addr():1100&amp;lt;info&amp;gt; available addr: : fi_sockaddr_in://127.0.0.1:0
libfabric:7532:tcp:core:ofi_insert_loopback_addr():1114&amp;lt;info&amp;gt; available addr: : fi_sockaddr_in6://[::1]:0
libfabric:7532:tcp:core:util_getinfo_ifs():312&amp;lt;info&amp;gt; Chosen addr for using: 10.0.2.15, speed 1000000000
libfabric:7532:core:core:fi_getinfo():1066&amp;lt;info&amp;gt; Found provider with the highest priority tcp, must_use_util_prov = 1
libfabric:7532:core:core:fi_getinfo():1051&amp;lt;warn&amp;gt; Can't find provider with the highest priority
libfabric:7532:core:core:fi_getinfo():1066&amp;lt;info&amp;gt; Found provider with the highest priority tcp, must_use_util_prov = 1
libfabric:7532:tcp:core:ofi_get_list_of_addr():1255&amp;lt;info&amp;gt; Available addr: 10.0.2.15, iface name: eth1, speed: 1000000000
libfabric:7532:tcp:core:ofi_get_list_of_addr():1255&amp;lt;info&amp;gt; Available addr: fe80::6166:bdf8:dbc8:9a1, iface name: eth0, speed: 1000000000
libfabric:7532:tcp:core:ofi_insert_loopback_addr():1100&amp;lt;info&amp;gt; available addr: : fi_sockaddr_in://127.0.0.1:0
libfabric:7532:tcp:core:ofi_insert_loopback_addr():1114&amp;lt;info&amp;gt; available addr: : fi_sockaddr_in6://[::1]:0
libfabric:7532:tcp:core:util_getinfo_ifs():312&amp;lt;info&amp;gt; Chosen addr for using: 10.0.2.15, speed 1000000000
libfabric:7532:core:core:fi_getinfo():1066&amp;lt;info&amp;gt; Found provider with the highest priority tcp, must_use_util_prov = 1
libfabric:7532:core:core:ofi_layering_ok():915&amp;lt;info&amp;gt; Need core provider, skipping ofi_rxm
libfabric:7532:tcp:core:ofi_get_list_of_addr():1255&amp;lt;info&amp;gt; Available addr: 10.0.2.15, iface name: eth1, speed: 1000000000
libfabric:7532:tcp:core:ofi_get_list_of_addr():1255&amp;lt;info&amp;gt; Available addr: fe80::6166:bdf8:dbc8:9a1, iface name: eth0, speed: 1000000000
libfabric:7532:tcp:core:ofi_insert_loopback_addr():1100&amp;lt;info&amp;gt; available addr: : fi_sockaddr_in://127.0.0.1:0
libfabric:7532:tcp:core:ofi_insert_loopback_addr():1114&amp;lt;info&amp;gt; available addr: : fi_sockaddr_in6://[::1]:0
libfabric:7532:tcp:core:util_getinfo_ifs():312&amp;lt;info&amp;gt; Chosen addr for using: 10.0.2.15, speed 1000000000
libfabric:7532:core:core:fi_getinfo():1066&amp;lt;info&amp;gt; Found provider with the highest priority tcp, must_use_util_prov = 1
libfabric:7532:core:core:ofi_layering_ok():915&amp;lt;info&amp;gt; Need core provider, skipping ofi_rxm
libfabric:7532:tcp:core:ofi_get_list_of_addr():1255&amp;lt;info&amp;gt; Available addr: 10.0.2.15, iface name: eth1, speed: 1000000000
libfabric:7532:tcp:core:ofi_get_list_of_addr():1255&amp;lt;info&amp;gt; Available addr: fe80::6166:bdf8:dbc8:9a1, iface name: eth0, speed: 1000000000
libfabric:7532:tcp:core:ofi_insert_loopback_addr():1100&amp;lt;info&amp;gt; available addr: : fi_sockaddr_in://127.0.0.1:0
libfabric:7532:tcp:core:ofi_insert_loopback_addr():1114&amp;lt;info&amp;gt; available addr: : fi_sockaddr_in6://[::1]:0
libfabric:7532:tcp:core:util_getinfo_ifs():312&amp;lt;info&amp;gt; Chosen addr for using: 10.0.2.15, speed 1000000000
libfabric:7532:tcp:core:ofi_check_ep_type():654&amp;lt;info&amp;gt; unsupported endpoint type
libfabric:7532:tcp:core:ofi_check_ep_type():655&amp;lt;info&amp;gt; Supported: FI_EP_MSG
libfabric:7532:tcp:core:ofi_check_ep_type():655&amp;lt;info&amp;gt; Requested: FI_EP_RDM
libfabric:7532:core:core:fi_getinfo():1129&amp;lt;info&amp;gt; Now it is being used by tcp provider
libfabric:7532:core:core:fi_getinfo():1066&amp;lt;info&amp;gt; Found provider with the highest priority tcp, must_use_util_prov = 1
libfabric:7532:tcp:core:ofi_get_list_of_addr():1255&amp;lt;info&amp;gt; Available addr: 10.0.2.15, iface name: eth1, speed: 1000000000
libfabric:7532:tcp:core:ofi_get_list_of_addr():1255&amp;lt;info&amp;gt; Available addr: fe80::6166:bdf8:dbc8:9a1, iface name: eth0, speed: 1000000000
libfabric:7532:tcp:core:ofi_insert_loopback_addr():1100&amp;lt;info&amp;gt; available addr: : fi_sockaddr_in://127.0.0.1:0
libfabric:7532:tcp:core:ofi_insert_loopback_addr():1114&amp;lt;info&amp;gt; available addr: :Time: 21.775112 seconds
 fi_sockaddr_in6://[::1]:0
libfabric:2492:tcp:core:util_getinfo_ifs():312&amp;lt;info&amp;gt; Chosen addr for using: 10.0.2.15, speed 1000000000
libfabric:2492:core:core:fi_fabric():1346&amp;lt;info&amp;gt; Opened fabric: 10.0.2.0/24
libfabric:2492:core:core:fi_fabric():1346&amp;lt;info&amp;gt; Opened fabric: 10.0.2.0/24
libfabric:2492:core:core:fi_getinfo():1066&amp;lt;info&amp;gt; Found provider with the highest priority tcp, must_use_util_prov = 1
libfabric:2492:tcp:core:ofi_get_list_of_addr():1255&amp;lt;info&amp;gt; Available addr: 10.0.2.15, iface name: eth1, speed: 1000000000
libfabric:2492:tcp:core:ofi_get_list_of_addr():1255&amp;lt;info&amp;gt; Available addr: fe80::6166:bdf8:dbc8:9a1, iface name: eth0, speed: 1000000000
libfabric:2492:tcp:core:ofi_insert_loopback_addr():1100&amp;lt;info&amp;gt; available addr: : fi_sockaddr_in://127.0.0.1:0
libfabric:2492:tcp:core:ofi_insert_loopback_addr():1114&amp;lt;info&amp;gt; available addr: : fi_sockaddr_in6://[::1]:0
libfabric:2492:tcp:core:ofi_check_rx_attr():782&amp;lt;info&amp;gt; Tx only caps ignored in Rx caps
libfabric:2492:tcp:core:ofi_check_tx_attr():880&amp;lt;info&amp;gt; Rx only caps ignored in Tx caps
libfabric:2492:ofi_rxm:av:util_verify_av_attr():474&amp;lt;warn&amp;gt; Shared AV is unsupported
libfabric:2492:ofi_rxm:av:util_av_init():446&amp;lt;info&amp;gt; AV size 1024
libfabric:2492:ofi_rxm:core:ofi_check_fabric_attr():403&amp;lt;info&amp;gt; Requesting provider verbs, skipping tcp;ofi_rxm
libfabric:2492:ofi_rxm:core:ofi_check_rx_attr():782&amp;lt;info&amp;gt; Tx only caps ignored in Rx caps
libfabric:2492:ofi_rxm:core:ofi_check_tx_attr():880&amp;lt;info&amp;gt; Rx only caps ignored in Tx caps
libfabric:2492:ofi_rxm:core:ofi_check_rx_attr():782&amp;lt;info&amp;gt; Tx only caps ignored in Rx caps
libfabric:2492:ofi_rxm:core:ofi_check_tx_attr():880&amp;lt;info&amp;gt; Rx only caps ignored in Tx caps
libfabric:2492:ofi_rxm:core:ofi_check_ep_attr():766&amp;lt;info&amp;gt; Tag size exceeds supported size
libfabric:2492:ofi_rxm:core:ofi_check_ep_attr():767&amp;lt;info&amp;gt; Supported: 6148914691236517205
libfabric:2492:ofi_rxm:core:ofi_check_ep_attr():767&amp;lt;info&amp;gt; Requested: -6148914691236517206
libfabric:2492:core:core:fi_getinfo():1066&amp;lt;info&amp;gt; Found provider with the highest priority tcp, must_use_util_prov = 1
libfabric:2492:tcp:core:ofi_get_list_of_addr():1255&amp;lt;info&amp;gt; Available addr: 10.0.2.15, iface name: eth1, speed: 1000000000
libfabric:2492:tcp:core:ofi_get_list_of_addr():1255&amp;lt;info&amp;gt; Available addr: fe80::6166:bdf8:dbc8:9a1, iface name: eth0, speed: 1000000000
libfabric:2492:tcp:core:ofi_insert_loopback_addr():1100&amp;lt;info&amp;gt; available addr: : fi_sockaddr_in://127.0.0.1:0
libfabric:2492:tcp:core:ofi_insert_loopback_addr():1114&amp;lt;info&amp;gt; available addr: : fi_sockaddr_in6://[::1]:0
libfabric:2492:tcp:core:ofi_check_rx_attr():782&amp;lt;info&amp;gt; Tx only caps ignored in Rx caps
libfabric:2492:tcp:core:ofi_check_tx_attr():880&amp;lt;info&amp;gt; Rx only caps ignored in Tx caps
libfabric:2492:ofi_rxm:core:rxm_ep_settings_init():2440&amp;lt;info&amp;gt; Settings:
                 MR local: MSG - 0, RxM - 0
                 Completions per progress: MSG - 1
                 Buffered min: 0
                 Min multi recv size: 16320
                 FI_EP_MSG provider inject size: 64
                 rxm inject size: 16320
                 Protocol limits: Eager: 16320, SAR: 131072
libfabric:2492:ofi_rxm:core:rxm_ep_setopt():587&amp;lt;info&amp;gt; FI_OPT_MIN_MULTI_RECV set to 16384
libfabric:2492:tcp:core:ofi_check_rx_attr():782&amp;lt;info&amp;gt; Tx only caps ignored in Rx caps
libfabric:2492:tcp:core:ofi_check_tx_attr():880&amp;lt;info&amp;gt; Rx only caps ignored in Tx caps
libfabric:2492:tcp:core:ofi_check_rx_attr():782&amp;lt;info&amp;gt; Tx only caps ignored in Rx caps
libfabric:2492:tcp:core:ofi_check_tx_attr():880&amp;lt;info&amp;gt; Rx only caps ignored in Tx caps
libfabric:2492:tcp:core:ofi_check_rx_attr():782&amp;lt;info&amp;gt; Tx only caps ignored in Rx caps
libfabric:2492:tcp:core:ofi_check_tx_attr():880&amp;lt;info&amp;gt; Rx only caps ignored in Tx caps
libfabric:2492:ofi_rxm:ep_ctrl:rxm_cmap_free():684&amp;lt;info&amp;gt; Closing cmap
libfabric:2492:ofi_rxm:ep_ctrl:rxm_cmap_cm_thread_close():658&amp;lt;info&amp;gt; stopping CM thread
libfabric:2492:tcp:fabric:ofi_wait_del_fd():220&amp;lt;info&amp;gt; Given fd (568) not found in wait list - 00000000000C8210
libfabric:2492:tcp:fabric:ofi_wait_del_fd():220&amp;lt;info&amp;gt; Given fd (560) not found in wait list - 00000000000C8210
libfabric:2492:tcp:fabric:ofi_wait_del_fd():220&amp;lt;info&amp;gt; Given fd (564) not found in wait list - 00000000000C8210
 fi_sockaddr_in6://[::1]:0
libfabric:7532:tcp:core:util_getinfo_ifs():312&amp;lt;info&amp;gt; Chosen addr for using: 10.0.2.15, speed 1000000000
libfabric:7532:core:core:fi_fabric():1346&amp;lt;info&amp;gt; Opened fabric: 10.0.2.0/24
libfabric:7532:core:core:fi_fabric():1346&amp;lt;info&amp;gt; Opened fabric: 10.0.2.0/24
libfabric:7532:core:core:fi_getinfo():1066&amp;lt;info&amp;gt; Found provider with the highest priority tcp, must_use_util_prov = 1
libfabric:7532:tcp:core:ofi_get_list_of_addr():1255&amp;lt;info&amp;gt; Available addr: 10.0.2.15, iface name: eth1, speed: 1000000000
libfabric:7532:tcp:core:ofi_get_list_of_addr():1255&amp;lt;info&amp;gt; Available addr: fe80::6166:bdf8:dbc8:9a1, iface name: eth0, speed: 1000000000
libfabric:7532:tcp:core:ofi_insert_loopback_addr():1100&amp;lt;info&amp;gt; available addr: : fi_sockaddr_in://127.0.0.1:0
libfabric:7532:tcp:core:ofi_insert_loopback_addr():1114&amp;lt;info&amp;gt; available addr: : fi_sockaddr_in6://[::1]:0
libfabric:7532:tcp:core:ofi_check_rx_attr():782&amp;lt;info&amp;gt; Tx only caps ignored in Rx caps
libfabric:7532:tcp:core:ofi_check_tx_attr():880&amp;lt;info&amp;gt; Rx only caps ignored in Tx caps
libfabric:7532:ofi_rxm:av:util_verify_av_attr():474&amp;lt;warn&amp;gt; Shared AV is unsupported
libfabric:7532:ofi_rxm:av:util_av_init():446&amp;lt;info&amp;gt; AV size 1024
libfabric:7532:ofi_rxm:core:ofi_check_fabric_attr():403&amp;lt;info&amp;gt; Requesting provider verbs, skipping tcp;ofi_rxm
libfabric:7532:ofi_rxm:core:ofi_check_rx_attr():782&amp;lt;info&amp;gt; Tx only caps ignored in Rx caps
libfabric:7532:ofi_rxm:core:ofi_check_tx_attr():880&amp;lt;info&amp;gt; Rx only caps ignored in Tx caps
libfabric:7532:ofi_rxm:core:ofi_check_rx_attr():782&amp;lt;info&amp;gt; Tx only caps ignored in Rx caps
libfabric:7532:ofi_rxm:core:ofi_check_tx_attr():880&amp;lt;info&amp;gt; Rx only caps ignored in Tx caps
libfabric:7532:ofi_rxm:core:ofi_check_ep_attr():766&amp;lt;info&amp;gt; Tag size exceeds supported size
libfabric:7532:ofi_rxm:core:ofi_check_ep_attr():767&amp;lt;info&amp;gt; Supported: 6148914691236517205
libfabric:7532:ofi_rxm:core:ofi_check_ep_attr():767&amp;lt;info&amp;gt; Requested: -6148914691236517206
libfabric:7532:core:core:fi_getinfo():1066&amp;lt;info&amp;gt; Found provider with the highest priority tcp, must_use_util_prov = 1
libfabric:7532:tcp:core:ofi_get_list_of_addr():1255&amp;lt;info&amp;gt; Available addr: 10.0.2.15, iface name: eth1, speed: 1000000000
libfabric:7532:tcp:core:ofi_get_list_of_addr():1255&amp;lt;info&amp;gt; Available addr: fe80::6166:bdf8:dbc8:9a1, iface name: eth0, speed: 1000000000
libfabric:7532:tcp:core:ofi_insert_loopback_addr():1100&amp;lt;info&amp;gt; available addr: : fi_sockaddr_in://127.0.0.1:0
libfabric:7532:tcp:core:ofi_insert_loopback_addr():1114&amp;lt;info&amp;gt; available addr: : fi_sockaddr_in6://[::1]:0
libfabric:7532:tcp:core:ofi_check_rx_attr():782&amp;lt;info&amp;gt; Tx only caps ignored in Rx caps
libfabric:7532:tcp:core:ofi_check_tx_attr():880&amp;lt;info&amp;gt; Rx only caps ignored in Tx caps
libfabric:7532:ofi_rxm:core:rxm_ep_settings_init():2440&amp;lt;info&amp;gt; Settings:
                 MR local: MSG - 0, RxM - 0
                 Completions per progress: MSG - 1
                 Buffered min: 0
                 Min multi recv size: 16320
                 FI_EP_MSG provider inject size: 64
                 rxm inject size: 16320
                 Protocol limits: Eager: 16320, SAR: 131072
libfabric:7532:ofi_rxm:core:rxm_ep_setopt():587&amp;lt;info&amp;gt; FI_OPT_MIN_MULTI_RECV set to 16384
libfabric:7532:tcp:core:ofi_check_rx_attr():782&amp;lt;info&amp;gt; Tx only caps ignored in Rx caps
libfabric:7532:tcp:core:ofi_check_tx_attr():880&amp;lt;info&amp;gt; Rx only caps ignored in Tx caps
libfabric:7532:tcp:core:ofi_check_rx_attr():782&amp;lt;info&amp;gt; Tx only caps ignored in Rx caps
libfabric:7532:tcp:core:ofi_check_tx_attr():880&amp;lt;info&amp;gt; Rx only caps ignored in Tx caps
libfabric:7532:tcp:core:ofi_check_rx_attr():782&amp;lt;info&amp;gt; Tx only caps ignored in Rx caps
libfabric:7532:tcp:core:ofi_check_tx_attr():880&amp;lt;info&amp;gt; Rx only caps ignored in Tx caps
libfabric:7532:ofi_rxm:ep_ctrl:rxm_cmap_free():684&amp;lt;info&amp;gt; Closing cmap
libfabric:7532:ofi_rxm:ep_ctrl:rxm_cmap_cm_thread_close():658&amp;lt;info&amp;gt; stopping CM thread
libfabric:7532:tcp:fabric:ofi_wait_del_fd():220&amp;lt;info&amp;gt; Given fd (564) not found in wait list - 00000000001779C0
libfabric:7532:tcp:fabric:ofi_wait_del_fd():220&amp;lt;info&amp;gt; Given fd (580) not found in wait list - 00000000001779C0
libfabric:7532:tcp:fabric:ofi_wait_del_fd():220&amp;lt;info&amp;gt; Given fd (576) not found in wait list - 00000000001779C0&lt;/LI-CODE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thank you for your help&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Best Regards,&lt;/P&gt;
&lt;P&gt;Jason&lt;/P&gt;</description>
      <pubDate>Wed, 19 Aug 2020 16:28:26 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/Intel-MPI-Alltoallw-Poor-Performance/m-p/1201730#M7044</guid>
      <dc:creator>Michailpg</dc:creator>
      <dc:date>2020-08-19T16:28:26Z</dc:date>
    </item>
    <item>
      <title>Re:Intel MPI_Alltoallw Poor Performance</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/Intel-MPI-Alltoallw-Poor-Performance/m-p/1204635#M7068</link>
      <description>&lt;P&gt;Hi Jason,&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;Thanks for reporting your findings. Which NIC card do you have on your system? If you are using IB cards, how is IPoIB configured (v4/v6/both)?&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;Many thanks,&lt;/P&gt;&lt;P&gt;Amar&lt;/P&gt;&lt;BR /&gt;</description>
      <pubDate>Thu, 27 Aug 2020 12:55:46 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/Intel-MPI-Alltoallw-Poor-Performance/m-p/1204635#M7068</guid>
      <dc:creator>DrAmarpal_K_Intel</dc:creator>
      <dc:date>2020-08-27T12:55:46Z</dc:date>
    </item>
    <item>
      <title>Re: Re:Intel MPI_Alltoallw Poor Performance</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/Intel-MPI-Alltoallw-Poor-Performance/m-p/1206396#M7106</link>
      <description>&lt;P&gt;Dear Amar,&lt;/P&gt;
&lt;P&gt;I have the following NIC,&lt;/P&gt;
&lt;LI-CODE lang="none"&gt;description: Ethernet interface
       product: RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller
       vendor: Realtek Semiconductor Co., Ltd.
       physical id: 0
       bus info: pci@0000:03:00.0
       logical name: enp3s0
       version: 0c
       serial: 1c:1b:0d:7c:44:9e
       size: 1Gbit/s
       capacity: 1Gbit/s
       width: 64 bits
       clock: 33MHz
       capabilities: pm msi pciexpress msix vpd bus_master cap_list ethernet physical tp mii 10bt 10bt-fd 100bt 100bt-fd 1000bt 1000bt-fd autonegotiation
       configuration: autonegotiation=on broadcast=yes driver=r8169 driverversion=2.3LK-NAPI duplex=full firmware=rtl8168g-2_0.0.1 02/06/13 ip=10.0.0.6 latency=0 link=yes multicast=yes port=MII speed=1Gbit/s&lt;/LI-CODE&gt;
&lt;P&gt;It has a static IPv4 address.&lt;/P&gt;
&lt;P&gt;We do not have an IB card, as we don't run on multiple nodes yet.&lt;/P&gt;
&lt;P&gt;Best Regards,&lt;BR /&gt;Jason&lt;/P&gt;</description>
      <pubDate>Thu, 03 Sep 2020 12:55:10 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/Intel-MPI-Alltoallw-Poor-Performance/m-p/1206396#M7106</guid>
      <dc:creator>Michailpg</dc:creator>
      <dc:date>2020-09-03T12:55:10Z</dc:date>
    </item>
    <item>
      <title>Re:Intel MPI_Alltoallw Poor Performance</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/Intel-MPI-Alltoallw-Poor-Performance/m-p/1291082#M8474</link>
      <description>&lt;P&gt;Hi Jason,&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;Apologies for the radio silence on this thread. I just wanted to let you know that an internal ticket has been raised for this issue with the development team. I shall write to you with more details as they become available.&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;Best regards,&lt;/P&gt;&lt;P&gt;Amar&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;BR /&gt;</description>
      <pubDate>Fri, 18 Jun 2021 11:46:03 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/Intel-MPI-Alltoallw-Poor-Performance/m-p/1291082#M8474</guid>
      <dc:creator>DrAmarpal_K_Intel</dc:creator>
      <dc:date>2021-06-18T11:46:03Z</dc:date>
    </item>
    <item>
      <title>Re: Re:Intel MPI_Alltoallw Poor Performance</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/Intel-MPI-Alltoallw-Poor-Performance/m-p/1298794#M8598</link>
<description>&lt;P&gt;On our cluster we are testing an upgrade of our Intel MPI Library to Version 2021.2, and we observe something similar to the original post. Specifically for MPI_Alltoallw, the performance is significantly worse than with previous Intel MPI versions. To simplify the code, I made a single-core program that performs a matrix transpose by constructing a strided MPI data type that converts between row-major and column-major storage. For this case it is possible to use MPI_Alltoall (or even a simple Fortran transpose), but our actual code requires MPI_Alltoallw.&lt;/P&gt;
&lt;P&gt;&lt;BR /&gt;Here are the timings (in seconds) for transposing a [512x512x512] array along the first two dimensions on an Intel(R) Xeon(R) Gold 6140:&lt;/P&gt;
&lt;TABLE border="1" width="98.39816475534145%"&gt;
&lt;TBODY&gt;
&lt;TR&gt;
&lt;TD height="24px"&gt;&amp;nbsp;&lt;/TD&gt;
&lt;TD height="24px"&gt;TRANSPOSE&lt;/TD&gt;
&lt;TD height="24px"&gt;ALLTOALL&lt;/TD&gt;
&lt;TD height="24px"&gt;ALLTOALLW&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD width="25%" height="24px"&gt;Version 2018 Update 5&lt;/TD&gt;
&lt;TD width="25%" height="24px"&gt;1.29&lt;/TD&gt;
&lt;TD width="25%" height="24px"&gt;1.50&lt;/TD&gt;
&lt;TD width="25%" height="24px"&gt;1.30&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD width="25%" height="24px"&gt;Version 2021.2&lt;/TD&gt;
&lt;TD width="25%" height="24px"&gt;1.28&lt;/TD&gt;
&lt;TD width="25%" height="24px"&gt;1.49&lt;/TD&gt;
&lt;TD width="25%" height="24px"&gt;2.12&lt;/TD&gt;
&lt;/TR&gt;
&lt;/TBODY&gt;
&lt;/TABLE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;A first interesting observation is that ALLTOALL is only significantly slower than TRANSPOSE if the strided MPI data type is on the receiving side. If the sender has the strided MPI data type, the difference is only a few percent.&lt;/P&gt;
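&lt;P&gt;As an illustration of the pattern only (a minimal single-rank sketch, not the attached bench.f90; the hard-coded 8-byte extent assumes real(8) data), the strided-receive transpose via MPI_Alltoallw looks roughly like this:&lt;/P&gt;
&lt;LI-CODE lang="fortran"&gt;! Minimal single-rank sketch of the strided-receive transpose
! (illustration only, not the attached bench.f90).
program transpose_sketch
  use mpi
  implicit none
  integer, parameter :: n = 512
  real(8), allocatable :: a(:,:), b(:,:)
  integer :: vec, rtype, ierr
  integer :: scounts(1), sdispls(1), stypes(1)
  integer :: rcounts(1), rdispls(1), rtypes(1)
  integer(kind=MPI_ADDRESS_KIND) :: lb, ext

  call MPI_Init(ierr)
  allocate(a(n,n), b(n,n))
  call random_number(a)

  ! Receive side: one strided "row" of b (n elements, stride n),
  ! resized to a one-element (8-byte) extent so consecutive rows
  ! of b start one element apart.
  call MPI_Type_vector(n, 1, n, MPI_REAL8, vec, ierr)
  lb = 0
  ext = 8
  call MPI_Type_create_resized(vec, lb, ext, rtype, ierr)
  call MPI_Type_commit(rtype, ierr)

  ! Send side: the whole tile as contiguous reals.
  ! Displacements in MPI_Alltoallw are byte offsets.
  scounts(1) = n*n
  sdispls(1) = 0
  stypes(1) = MPI_REAL8
  rcounts(1) = n
  rdispls(1) = 0
  rtypes(1) = rtype

  call MPI_Alltoallw(a, scounts, sdispls, stypes, &amp;amp;
                     b, rcounts, rdispls, rtypes, MPI_COMM_WORLD, ierr)
  ! b now holds the transpose of a

  call MPI_Type_free(rtype, ierr)
  call MPI_Type_free(vec, ierr)
  call MPI_Finalize(ierr)
end program transpose_sketch&lt;/LI-CODE&gt;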
&lt;P&gt;The more important issue for us is the serious slowdown (a timing increase of more than 50%) of ALLTOALLW when switching to the new Intel MPI library.&lt;BR /&gt;The code used to obtain these numbers is attached. It can be compiled with "mpiifort -O2 -xHost bench.f90" and run with "I_MPI_PIN_PROCESSOR_LIST=0 mpirun -np 1 ./a.out 512". Here is the output with I_MPI_DEBUG=12 for the latest version:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;LI-CODE lang="markup"&gt;[0] MPI startup(): Intel(R) MPI Library, Version 2021.2  Build 20210302 (id: f4f7c92cd)
[0] MPI startup(): Copyright (C) 2003-2021 Intel Corporation.  All rights reserved.
[0] MPI startup(): library kind: release
[0] MPI startup(): shm segment size (2084 MB per rank) * (1 local ranks) = 2084 MB total
[0] MPI startup(): libfabric version: 1.11.0-impi
[0] MPI startup(): libfabric provider: mlx
[0] MPI startup(): max_ch4_vcis: 1, max_reg_eps 1, enable_sep 0, enable_shared_ctxs 0, do_av_insert 1
[0] MPI startup(): addrnamelen: 1024
[0] MPI startup(): File "/vsc-hard-mounts/leuven-apps/skylake/2021a/software/impi/2021.2.0-intel-compilers-2021.2.0/mpi/2021.2.0/etc/tuning_skx_shm-ofi_mlx.dat" not found
[0] MPI startup(): Load tuning file: "/vsc-hard-mounts/leuven-apps/skylake/2021a/software/impi/2021.2.0-intel-compilers-2021.2.0/mpi/2021.2.0/etc/tuning_skx_shm-ofi.dat"
[0] MPI startup(): Rank    Pid      Node name  Pin cpu
[0] MPI startup(): 0       28548    r22i13n16  0
[0] MPI startup(): I_MPI_ROOT=/vsc-hard-mounts/leuven-apps/skylake/2021a/software/impi/2021.2.0-intel-compilers-2021.2.0/mpi/2021.2.0
[0] MPI startup(): I_MPI_MPIRUN=mpirun
[0] MPI startup(): I_MPI_HYDRA_RMK=pbs
[0] MPI startup(): I_MPI_HYDRA_TOPOLIB=hwloc
[0] MPI startup(): I_MPI_PIN_PROCESSOR_LIST=0
[0] MPI startup(): I_MPI_INTERNAL_MEM_POLICY=default
[0] MPI startup(): I_MPI_DEBUG=12&lt;/LI-CODE&gt;</description>
      <pubDate>Thu, 15 Jul 2021 09:47:03 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/Intel-MPI-Alltoallw-Poor-Performance/m-p/1298794#M8598</guid>
      <dc:creator>SVDB</dc:creator>
      <dc:date>2021-07-15T09:47:03Z</dc:date>
    </item>
    <item>
      <title>Re:Intel MPI_Alltoallw Poor Performance</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/Intel-MPI-Alltoallw-Poor-Performance/m-p/1299116#M8604</link>
      <description>&lt;P&gt;Hi SVDB,&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;Although similar, the original issue in this thread relates primarily to Windows and not Linux (like in your case). For efficient tracking, may I request you to kindly open a new thread?&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;Many thanks,&lt;/P&gt;&lt;P&gt;Amar&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;BR /&gt;</description>
      <pubDate>Fri, 16 Jul 2021 08:14:41 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/Intel-MPI-Alltoallw-Poor-Performance/m-p/1299116#M8604</guid>
      <dc:creator>DrAmarpal_K_Intel</dc:creator>
      <dc:date>2021-07-16T08:14:41Z</dc:date>
    </item>
    <item>
      <title>Re:Intel MPI_Alltoallw Poor Performance</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/Intel-MPI-Alltoallw-Poor-Performance/m-p/1318982#M8797</link>
      <description>&lt;P&gt;Dear community members,&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;Please be informed that we are working on fixing this issue in a future release of Intel MPI Library. &lt;/P&gt;&lt;P&gt; &lt;/P&gt;&lt;BR /&gt;</description>
      <pubDate>Mon, 04 Oct 2021 07:23:18 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/Intel-MPI-Alltoallw-Poor-Performance/m-p/1318982#M8797</guid>
      <dc:creator>DrAmarpal_K_Intel</dc:creator>
      <dc:date>2021-10-04T07:23:18Z</dc:date>
    </item>
    <item>
      <title>Re:Intel MPI_Alltoallw Poor Performance</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/Intel-MPI-Alltoallw-Poor-Performance/m-p/1546451#M11180</link>
      <description>&lt;P&gt;Hello again, Michail,&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;Can you please recheck the performance with the latest version of Intel MPI? The performance has improved significantly versus the older version.&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;Best regards,&lt;/P&gt;&lt;P&gt;Amar&lt;/P&gt;&lt;BR /&gt;</description>
      <pubDate>Wed, 22 Nov 2023 12:02:10 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/Intel-MPI-Alltoallw-Poor-Performance/m-p/1546451#M11180</guid>
      <dc:creator>DrAmarpal_K_Intel</dc:creator>
      <dc:date>2023-11-22T12:02:10Z</dc:date>
    </item>
    <item>
      <title>Re:Intel MPI_Alltoallw Poor Performance</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/Intel-MPI-Alltoallw-Poor-Performance/m-p/1548412#M11192</link>
      <description>&lt;P&gt;Closing this case due to inactivity. This issue is assumed to be resolved and we will no longer respond to this thread.&amp;nbsp;If you require additional assistance from Intel, please start a new thread.&amp;nbsp;Any further interaction in this thread will be considered community only.&lt;/P&gt;&lt;P&gt; &lt;/P&gt;&lt;BR /&gt;</description>
      <pubDate>Tue, 28 Nov 2023 17:30:25 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/Intel-MPI-Alltoallw-Poor-Performance/m-p/1548412#M11192</guid>
      <dc:creator>DrAmarpal_K_Intel</dc:creator>
      <dc:date>2023-11-28T17:30:25Z</dc:date>
    </item>
  </channel>
</rss>

