<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: MPI_Alltoallw performs poorly with Intel MPI Library to Version 2021.2 in Intel® MPI Library</title>
    <link>https://community.intel.com/t5/Intel-MPI-Library/MPI-Alltoallw-performs-poorly-with-Intel-MPI-Library-to-Version/m-p/1351501#M9099</link>
    <description>&lt;P&gt;Will the 2022.1 version be made available as standalone components? At the moment I cannot see them at &lt;A href="https://www.intel.com/content/www/us/en/developer/articles/tool/oneapi-standalone-components.html" target="_blank"&gt;https://www.intel.com/content/www/us/en/developer/articles/tool/oneapi-standalone-components.html&lt;/A&gt;. The oneAPI Base Toolkit offline installer is not working for me (it is stuck on "Wait while the installer is preparing..."), whereas I usually have no problems with the individual components.&lt;/P&gt;</description>
    <pubDate>Fri, 14 Jan 2022 14:07:40 GMT</pubDate>
    <dc:creator>SVDB</dc:creator>
    <dc:date>2022-01-14T14:07:40Z</dc:date>
    <item>
      <title>MPI_Alltoallw performs poorly with Intel MPI Library to Version 2021.2</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/MPI-Alltoallw-performs-poorly-with-Intel-MPI-Library-to-Version/m-p/1299117#M8605</link>
      <description>&lt;P class="sub_section_element_selectors"&gt;On our cluster we are testing an upgrade of our Intel MPI Library to Version 2021.2 and we observe something similar to the original post. Specifically, for MPI_Alltoallw the performance is significantly worse than with previous Intel MPI versions. To simplify the code, I made a single-core program that performs a matrix transpose by constructing a strided MPI data type that allows switching between row-major and column-major storage. For this case it is possible to use MPI_Alltoall (or even a simple Fortran transpose), but in our actual code the use of MPI_Alltoallw is required.&lt;/P&gt;
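&lt;P class="sub_section_element_selectors"&gt;As a rough illustration of the strided-type idea described above (this is not the actual benchmark code, and it does not call MPI), the following pure-Python sketch emulates a vector-style data type with plain index arithmetic. The names vector_offsets and transpose_with_strided_reads are hypothetical and chosen only for this example:&lt;/P&gt;

```python
# Pure-Python emulation of a strided (vector-like) data type; no MPI is used.
# All names here are hypothetical and are not taken from the benchmark code.

def vector_offsets(count, blocklength, stride):
    """Element offsets selected by a vector-style type:
    `count` blocks of `blocklength` elements, block starts `stride` apart."""
    return [b * stride + k for b in range(count) for k in range(blocklength)]


def transpose_with_strided_reads(flat, n):
    """Transpose an n x n row-major matrix stored in a flat list.

    Column j of the input is read through the strided pattern
    (count=n, blocklength=1, stride=n) shifted by j and written
    contiguously to the output.
    """
    column = vector_offsets(n, 1, n)
    out = []
    for j in range(n):
        out.extend(flat[j + off] for off in column)
    return out


n = 3
matrix = list(range(n * n))  # rows: [0, 1, 2], [3, 4, 5], [6, 7, 8]
transposed = transpose_with_strided_reads(matrix, n)
print(transposed)
```

&lt;P class="sub_section_element_selectors"&gt;In this sketch the strided pattern is applied on the reading side and the data are written contiguously; applying it on the writing side instead would be the analogue of placing the strided type at the receiver.&lt;/P&gt;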
&lt;P class="sub_section_element_selectors"&gt;&lt;BR /&gt;Here are the timings (in seconds) for transposing a [512x512x512] array along the first two dimensions on an Intel(R) Xeon(R) Gold 6140:&lt;/P&gt;
&lt;TABLE border="1" width="98.39816475534145%"&gt;
&lt;TBODY&gt;
&lt;TR&gt;
&lt;TD height="24px" class="sub_section_element_selectors"&gt;&amp;nbsp;&lt;/TD&gt;
&lt;TD height="24px" class="sub_section_element_selectors"&gt;TRANSPOSE&lt;/TD&gt;
&lt;TD height="24px" class="sub_section_element_selectors"&gt;ALLTOALL&lt;/TD&gt;
&lt;TD height="24px" class="sub_section_element_selectors"&gt;ALLTOALLW&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD width="25%" height="24px" class="sub_section_element_selectors"&gt;Version 2018 Update 5&lt;/TD&gt;
&lt;TD width="25%" height="24px" class="sub_section_element_selectors"&gt;1.29&lt;/TD&gt;
&lt;TD width="25%" height="24px" class="sub_section_element_selectors"&gt;1.50&lt;/TD&gt;
&lt;TD width="25%" height="24px" class="sub_section_element_selectors"&gt;1.30&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD width="25%" height="24px" class="sub_section_element_selectors"&gt;Version 2021.2&lt;/TD&gt;
&lt;TD width="25%" height="24px" class="sub_section_element_selectors"&gt;1.28&lt;/TD&gt;
&lt;TD width="25%" height="24px" class="sub_section_element_selectors"&gt;1.49&lt;/TD&gt;
&lt;TD width="25%" height="24px" class="sub_section_element_selectors"&gt;2.12&lt;/TD&gt;
&lt;/TR&gt;
&lt;/TBODY&gt;
&lt;/TABLE&gt;
&lt;P class="sub_section_element_selectors"&gt;&amp;nbsp;&lt;/P&gt;
&lt;P class="sub_section_element_selectors"&gt;A first interesting observation is that ALLTOALL is only significantly slower than TRANSPOSE if the strided MPI data type is on the receiving side. If the sender has the strided MPI data type, the difference is only a few percent.&lt;/P&gt;
&lt;P class="sub_section_element_selectors"&gt;The more important issue for us is the serious slowdown (a timing increase of more than 50%) of ALLTOALLW when switching to the new Intel MPI library.&lt;BR /&gt;I have attached the code used to obtain these numbers. It can be compiled with "mpiifort -O2 -xHost bench.f90" and run with "I_MPI_PIN_PROCESSOR_LIST=0 mpirun -np 1 ./a.out 512". Here is the output when setting I_MPI_DEBUG=12 for the latest version:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;LI-CODE lang="markup"&gt;[0] MPI startup(): Intel(R) MPI Library, Version 2021.2  Build 20210302 (id: f4f7c92cd)
[0] MPI startup(): Copyright (C) 2003-2021 Intel Corporation.  All rights reserved.
[0] MPI startup(): library kind: release
[0] MPI startup(): shm segment size (2084 MB per rank) * (1 local ranks) = 2084 MB total
[0] MPI startup(): libfabric version: 1.11.0-impi
[0] MPI startup(): libfabric provider: mlx
[0] MPI startup(): max_ch4_vcis: 1, max_reg_eps 1, enable_sep 0, enable_shared_ctxs 0, do_av_insert 1
[0] MPI startup(): addrnamelen: 1024
[0] MPI startup(): File "/vsc-hard-mounts/leuven-apps/skylake/2021a/software/impi/2021.2.0-intel-compilers-2021.2.0/mpi/2021.2.0/etc/tuning_skx_shm-ofi_mlx.dat" not found
[0] MPI startup(): Load tuning file: "/vsc-hard-mounts/leuven-apps/skylake/2021a/software/impi/2021.2.0-intel-compilers-2021.2.0/mpi/2021.2.0/etc/tuning_skx_shm-ofi.dat"
[0] MPI startup(): Rank    Pid      Node name  Pin cpu
[0] MPI startup(): 0       28548    r22i13n16  0
[0] MPI startup(): I_MPI_ROOT=/vsc-hard-mounts/leuven-apps/skylake/2021a/software/impi/2021.2.0-intel-compilers-2021.2.0/mpi/2021.2.0
[0] MPI startup(): I_MPI_MPIRUN=mpirun
[0] MPI startup(): I_MPI_HYDRA_RMK=pbs
[0] MPI startup(): I_MPI_HYDRA_TOPOLIB=hwloc
[0] MPI startup(): I_MPI_PIN_PROCESSOR_LIST=0
[0] MPI startup(): I_MPI_INTERNAL_MEM_POLICY=default
[0] MPI startup(): I_MPI_DEBUG=12&lt;/LI-CODE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
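&lt;P class="sub_section_element_selectors"&gt;The relative change per routine can be worked out from the timing table above. The sketch below simply hard-codes those numbers; the helper name relative_increase is hypothetical:&lt;/P&gt;

```python
# Timings in seconds, copied from the table above.
# The helper name relative_increase is hypothetical.
timings = {
    "2018 Update 5": {"TRANSPOSE": 1.29, "ALLTOALL": 1.50, "ALLTOALLW": 1.30},
    "2021.2": {"TRANSPOSE": 1.28, "ALLTOALL": 1.49, "ALLTOALLW": 2.12},
}


def relative_increase(old, new):
    """Relative timing increase of `new` over `old`, in percent."""
    return 100.0 * (new - old) / old


for routine in ("TRANSPOSE", "ALLTOALL", "ALLTOALLW"):
    old = timings["2018 Update 5"][routine]
    new = timings["2021.2"][routine]
    print(f"{routine:10s} {relative_increase(old, new):+6.1f} %")
```

&lt;P class="sub_section_element_selectors"&gt;For ALLTOALLW this gives an increase of roughly 63%, consistent with the "more than 50%" figure quoted above, while TRANSPOSE and ALLTOALL change by less than 1%.&lt;/P&gt;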
&lt;P class="sub_section_element_selectors"&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;</description>
      <pubDate>Fri, 16 Jul 2021 08:18:52 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/MPI-Alltoallw-performs-poorly-with-Intel-MPI-Library-to-Version/m-p/1299117#M8605</guid>
      <dc:creator>SVDB</dc:creator>
      <dc:date>2021-07-16T08:18:52Z</dc:date>
    </item>
    <item>
      <title>Re:MPI_Alltoallw performs poorly with Intel MPI Library to Version 2021.2</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/MPI-Alltoallw-performs-poorly-with-Intel-MPI-Library-to-Version/m-p/1299684#M8616</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;Thanks for reaching out to us.&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;We need more information to investigate your issue. Could you please provide the I_MPI_DEBUG output after running the code using 2018 Update 5?&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;Best Regards,&lt;/P&gt;&lt;P&gt;Shanmukh.SS&lt;/P&gt;&lt;BR /&gt;</description>
      <pubDate>Mon, 19 Jul 2021 11:46:03 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/MPI-Alltoallw-performs-poorly-with-Intel-MPI-Library-to-Version/m-p/1299684#M8616</guid>
      <dc:creator>ShanmukhS_Intel</dc:creator>
      <dc:date>2021-07-19T11:46:03Z</dc:date>
    </item>
    <item>
      <title>Re: Re:MPI_Alltoallw performs poorly with Intel MPI Library to Version 2021.2</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/MPI-Alltoallw-performs-poorly-with-Intel-MPI-Library-to-Version/m-p/1299687#M8617</link>
      <description>&lt;P&gt;Hello Shanmukh,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Here is the output from 2018 Update 5 with I_MPI_DEBUG=12:&lt;/P&gt;
&lt;LI-CODE lang="markup"&gt;I_MPI_DEBUG=12 I_MPI_PIN_PROCESSOR_LIST=0 mpirun -np 1 ./a.out 512
[0] MPI startup(): Intel(R) MPI Library, Version 2018 Update 5  Build 20190404 (id: 18839)
[0] MPI startup(): Copyright (C) 2003-2019 Intel Corporation.  All rights reserved.
[0] MPI startup(): Multi-threaded optimized library
[0] MPI startup(): shm data transfer mode
[0] MPI startup(): Device_reset_idx=8
[0] MPI startup(): Allgather: 0: 0-2147483647 &amp;amp; 0-2147483647
[0] MPI startup(): Allgatherv: 0: 0-2147483647 &amp;amp; 0-2147483647
[0] MPI startup(): Allreduce: 0: 0-2147483647 &amp;amp; 0-2147483647
[0] MPI startup(): Alltoall: 0: 0-2147483647 &amp;amp; 0-2147483647
[0] MPI startup(): Alltoallv: 0: 0-2147483647 &amp;amp; 0-2147483647
[0] MPI startup(): Alltoallw: 0: 0-2147483647 &amp;amp; 0-2147483647
[0] MPI startup(): Barrier: 0: 0-2147483647 &amp;amp; 0-2147483647
[0] MPI startup(): Bcast: 0: 0-2147483647 &amp;amp; 0-2147483647
[0] MPI startup(): Exscan: 0: 0-2147483647 &amp;amp; 0-2147483647
[0] MPI startup(): Gather: 0: 0-2147483647 &amp;amp; 0-2147483647
[0] MPI startup(): Gatherv: 0: 0-2147483647 &amp;amp; 0-2147483647
[0] MPI startup(): Reduce_scatter: 0: 0-2147483647 &amp;amp; 0-2147483647
[0] MPI startup(): Reduce: 0: 0-2147483647 &amp;amp; 0-2147483647
[0] MPI startup(): Scan: 0: 0-2147483647 &amp;amp; 0-2147483647
[0] MPI startup(): Scatter: 0: 0-2147483647 &amp;amp; 0-2147483647
[0] MPI startup(): Scatterv: 0: 0-2147483647 &amp;amp; 0-2147483647
[0] MPI startup(): Rank    Pid      Node name  Pin cpu
[0] MPI startup(): 0       5936     r22i13n01  0
[0] MPI startup(): Recognition=2 Platform(code=512 ippn=0 dev=1) Fabric(intra=1 inter=1 flags=0x0)
[0] MPI startup(): Topology split mode = 1

| rank | node | space=1
|  0  |  0  |
[0] MPI startup(): I_MPI_DEBUG=12
[0] MPI startup(): I_MPI_INFO_BRAND=Intel(R) Xeon(R) Gold 6140
[0] MPI startup(): I_MPI_INFO_CACHE1=0,1,2,3,4,8,9,10,11,16,17,18,19,20,24,25,26,27,32,33,34,35,36,40,41,42,43,48,49,50,51,52,56,57,58,59
[0] MPI startup(): I_MPI_INFO_CACHE2=0,1,2,3,4,8,9,10,11,16,17,18,19,20,24,25,26,27,32,33,34,35,36,40,41,42,43,48,49,50,51,52,56,57,58,59
[0] MPI startup(): I_MPI_INFO_CACHE3=0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1
[0] MPI startup(): I_MPI_INFO_CACHES=3
[0] MPI startup(): I_MPI_INFO_CACHE_SHARE=2,2,64
[0] MPI startup(): I_MPI_INFO_CACHE_SIZE=32768,1048576,25952256
[0] MPI startup(): I_MPI_INFO_CORE=0,1,2,3,4,8,9,10,11,16,17,18,19,20,24,25,26,27,0,1,2,3,4,8,9,10,11,16,17,18,19,20,24,25,26,27
[0] MPI startup(): I_MPI_INFO_C_NAME=Unknown
[0] MPI startup(): I_MPI_INFO_DESC=1342177280
[0] MPI startup(): I_MPI_INFO_FLGB=-744488965
[0] MPI startup(): I_MPI_INFO_FLGC=2147417087
[0] MPI startup(): I_MPI_INFO_FLGCEXT=24
[0] MPI startup(): I_MPI_INFO_FLGD=-1075053569
[0] MPI startup(): I_MPI_INFO_FLGDEXT=-1677712384
[0] MPI startup(): I_MPI_INFO_LCPU=36
[0] MPI startup(): I_MPI_INFO_MODE=263
[0] MPI startup(): I_MPI_INFO_NUMA_NODE_MAP=mlx5_0:0,mlx5_1:0
[0] MPI startup(): I_MPI_INFO_NUMA_NODE_NUM=2
[0] MPI startup(): I_MPI_INFO_PACK=0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1
[0] MPI startup(): I_MPI_INFO_SIGN=329300
[0] MPI startup(): I_MPI_INFO_STATE=0
[0] MPI startup(): I_MPI_INFO_THREAD=0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
[0] MPI startup(): I_MPI_INFO_VEND=1
[0] MPI startup(): I_MPI_PIN_INFO=0
[0] MPI startup(): I_MPI_PIN_MAPPING=1:0 0
           TRANSPOSE    N   512  FWD    0.645729  BCK    0.644933  TOT    1.290661
            ALLTOALL    N   512  FWD    0.653792  BCK    0.850857  TOT    1.504649
           ALLTOALLW    N   512  FWD    0.657849  BCK    0.657091  TOT    1.314940&lt;/LI-CODE&gt;</description>
      <pubDate>Mon, 19 Jul 2021 12:03:49 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/MPI-Alltoallw-performs-poorly-with-Intel-MPI-Library-to-Version/m-p/1299687#M8617</guid>
      <dc:creator>SVDB</dc:creator>
      <dc:date>2021-07-19T12:03:49Z</dc:date>
    </item>
    <item>
      <title>Re:MPI_Alltoallw performs poorly with Intel MPI Library to Version 2021.2</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/MPI-Alltoallw-performs-poorly-with-Intel-MPI-Library-to-Version/m-p/1299973#M8620</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;Thanks for providing the I_MPI_DEBUG information. However, we did not find any libfabric details in the debug log that you provided.&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;Could you please confirm with us your environment and hardware details?&lt;/P&gt;&lt;P&gt;Also provide the interconnect hardware and OFI provider that has been used for both 2018u5 &amp;amp; 2021.2 versions?&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;Could you please confirm with us whether you are executing the code on the same machine using both 2018 Update 5 and 2021.2 versions?&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;Best Regards,&lt;/P&gt;&lt;P&gt;Shanmukh.SS&lt;/P&gt;&lt;BR /&gt;</description>
      <pubDate>Tue, 20 Jul 2021 11:42:14 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/MPI-Alltoallw-performs-poorly-with-Intel-MPI-Library-to-Version/m-p/1299973#M8620</guid>
      <dc:creator>ShanmukhS_Intel</dc:creator>
      <dc:date>2021-07-20T11:42:14Z</dc:date>
    </item>
    <item>
      <title>Re: Re:MPI_Alltoallw performs poorly with Intel MPI Library to Version 2021.2</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/MPI-Alltoallw-performs-poorly-with-Intel-MPI-Library-to-Version/m-p/1299999#M8621</link>
      <description>&lt;P class="sub_section_element_selectors"&gt;&lt;STRONG&gt;Could you please confirm with us your environment and hardware details?&lt;/STRONG&gt;&lt;/P&gt;
&lt;P class="sub_section_element_selectors"&gt;The operating system is CentOS Linux release 7.9.2009. Tests are performed on a node that has two Xeon Gold 6140 CPUs @ 2.3 GHz (Skylake). All timings that I compare were obtained on the same machine.&lt;/P&gt;
&lt;P class="sub_section_element_selectors"&gt;&lt;STRONG&gt;Also provide the interconnect hardware...&lt;/STRONG&gt;&lt;/P&gt;
&lt;P class="sub_section_element_selectors"&gt;Nodes are connected using an Infiniband EDR network. I am not sure whether that is relevant for this test which runs on a single core on a single node.&lt;/P&gt;
&lt;P class="sub_section_element_selectors"&gt;&lt;STRONG&gt;...and OFI provider that has been used for both 2018u5 &amp;amp; 2021.2 versions?&lt;/STRONG&gt;&lt;/P&gt;
&lt;P class="sub_section_element_selectors"&gt;I thought prior to version 2019 the OFA fabric was used instead of OFI? Again I am not sure whether this is relevant as the debug output for 2018u5 indicates: "[0] MPI startup(): shm data transfer mode".&lt;/P&gt;
&lt;P class="sub_section_element_selectors"&gt;The OFI provider for 2021.2 is mlx running over ucx 1.10.0, but so far I have assumed that communication goes through shm for this single-core example (based on the debug output mentioning "shm segment size (2084 MB per rank) * (1 local ranks) = 2084 MB total").&lt;/P&gt;
&lt;P class="sub_section_element_selectors"&gt;&lt;STRONG&gt;Could you please confirm with us whether you are executing the code on the same machine using both 2018 Update 5 and 2021.2 versions?&lt;/STRONG&gt;&lt;/P&gt;
&lt;P class="sub_section_element_selectors"&gt;Yes.&lt;/P&gt;</description>
      <pubDate>Tue, 20 Jul 2021 13:37:49 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/MPI-Alltoallw-performs-poorly-with-Intel-MPI-Library-to-Version/m-p/1299999#M8621</guid>
      <dc:creator>SVDB</dc:creator>
      <dc:date>2021-07-20T13:37:49Z</dc:date>
    </item>
    <item>
      <title>Re:MPI_Alltoallw performs poorly with Intel MPI Library to Version 2021.2</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/MPI-Alltoallw-performs-poorly-with-Intel-MPI-Library-to-Version/m-p/1301287#M8632</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;Thanks for sharing the information.&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;We are currently working on your issue.&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;Meanwhile, could you please try the latest Intel oneAPI version 2021.3 and let us know whether you see the same issue?&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;Best Regards,&lt;/P&gt;&lt;P&gt;Shanmukh.SS&lt;/P&gt;&lt;BR /&gt;</description>
      <pubDate>Mon, 26 Jul 2021 10:36:26 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/MPI-Alltoallw-performs-poorly-with-Intel-MPI-Library-to-Version/m-p/1301287#M8632</guid>
      <dc:creator>ShanmukhS_Intel</dc:creator>
      <dc:date>2021-07-26T10:36:26Z</dc:date>
    </item>
    <item>
      <title>Re: Re:MPI_Alltoallw performs poorly with Intel MPI Library to Version 2021.2</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/MPI-Alltoallw-performs-poorly-with-Intel-MPI-Library-to-Version/m-p/1302949#M8648</link>
      <description>&lt;P&gt;Hello Shanmukh,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I tried the same example with oneAPI 2021.3, but there is no significant difference compared with 2021.2. For consistency, I reran the example with the different versions on the same node, and these are the timings (in seconds):&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;TABLE border="1" width="100%"&gt;
&lt;TBODY&gt;
&lt;TR&gt;
&lt;TD width="25%" height="24px"&gt;&amp;nbsp;&lt;/TD&gt;
&lt;TD width="25%" height="24px"&gt;TRANSPOSE&lt;/TD&gt;
&lt;TD width="25%" height="24px"&gt;ALLTOALL&lt;/TD&gt;
&lt;TD width="25%" height="24px"&gt;ALLTOALLW&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD width="25%" height="24px"&gt;Version 2018 Update 5&lt;/TD&gt;
&lt;TD width="25%" height="24px"&gt;1.27&lt;/TD&gt;
&lt;TD width="25%" height="24px"&gt;1.48&lt;/TD&gt;
&lt;TD width="25%" height="24px"&gt;1.28&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD width="25%" height="24px"&gt;Version 2021.2&lt;/TD&gt;
&lt;TD width="25%" height="24px"&gt;1.27&lt;/TD&gt;
&lt;TD width="25%" height="24px"&gt;1.48&lt;/TD&gt;
&lt;TD width="25%" height="24px"&gt;2.10&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD width="25%" height="24px"&gt;Version 2021.3&lt;/TD&gt;
&lt;TD width="25%" height="24px"&gt;1.27&lt;/TD&gt;
&lt;TD width="25%" height="24px"&gt;1.48&lt;/TD&gt;
&lt;TD width="25%" height="24px"&gt;2.09&lt;/TD&gt;
&lt;/TR&gt;
&lt;/TBODY&gt;
&lt;/TABLE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;This is the verbose output when using oneAPI version 2021.3:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;LI-CODE lang="markup"&gt;[0] MPI startup(): Intel(R) MPI Library, Version 2021.3  Build 20210601 (id: 6f90181f1)
[0] MPI startup(): Copyright (C) 2003-2021 Intel Corporation.  All rights reserved.
[0] MPI startup(): library kind: release
[0] MPI startup(): shm segment size (2084 MB per rank) * (1 local ranks) = 2084 MB total
[0] MPI startup(): libfabric version: 1.12.1-impi
[0] MPI startup(): libfabric provider: mlx
[0] MPI startup(): max_ch4_vcis: 1, max_reg_eps 1, enable_sep 0, enable_shared_ctxs 0, do_av_insert 1
[0] MPI startup(): addrnamelen: 1024
[0] MPI startup(): File "/vsc-hard-mounts/leuven-data/337/vsc33716/Software/myapps/genius/skylake/2021a/software/impi/2021.3.0-intel-compilers-2021.3.0/mpi/2021.3.0/etc/tuning_skx_shm-ofi_mlx.dat" not found
[0] MPI startup(): Load tuning file: "/vsc-hard-mounts/leuven-data/337/vsc33716/Software/myapps/genius/skylake/2021a/software/impi/2021.3.0-intel-compilers-2021.3.0/mpi/2021.3.0/etc/tuning_skx_shm-ofi.dat"
[0] MPI startup(): Rank    Pid      Node name  Pin cpu
[0] MPI startup(): 0       21923    r23i13n21  0
[0] MPI startup(): I_MPI_ROOT=/vsc-hard-mounts/leuven-data/337/vsc33716/Software/myapps/genius/skylake/2021a/software/impi/2021.3.0-intel-compilers-2021.3.0/mpi/2021.3.0
[0] MPI startup(): I_MPI_MPIRUN=mpirun
[0] MPI startup(): I_MPI_HYDRA_RMK=pbs
[0] MPI startup(): I_MPI_HYDRA_TOPOLIB=hwloc
[0] MPI startup(): I_MPI_PIN_PROCESSOR_LIST=0
[0] MPI startup(): I_MPI_INTERNAL_MEM_POLICY=default
[0] MPI startup(): I_MPI_DEBUG=12&lt;/LI-CODE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Mon, 02 Aug 2021 09:15:36 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/MPI-Alltoallw-performs-poorly-with-Intel-MPI-Library-to-Version/m-p/1302949#M8648</guid>
      <dc:creator>SVDB</dc:creator>
      <dc:date>2021-08-02T09:15:36Z</dc:date>
    </item>
    <item>
      <title>Re:MPI_Alltoallw performs poorly with Intel MPI Library to Version 2021.2</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/MPI-Alltoallw-performs-poorly-with-Intel-MPI-Library-to-Version/m-p/1303596#M8650</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;Thanks for sharing the required details with us.&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;We reproduced the issue at our end. We are working on your issue internally. We will get back to you soon.&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;Best Regards,&lt;/P&gt;&lt;P&gt;Shanmukh.SS&lt;/P&gt;&lt;BR /&gt;</description>
      <pubDate>Wed, 04 Aug 2021 11:19:21 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/MPI-Alltoallw-performs-poorly-with-Intel-MPI-Library-to-Version/m-p/1303596#M8650</guid>
      <dc:creator>ShanmukhS_Intel</dc:creator>
      <dc:date>2021-08-04T11:19:21Z</dc:date>
    </item>
    <item>
      <title>Re:MPI_Alltoallw performs poorly with Intel MPI Library to Version 2021.2</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/MPI-Alltoallw-performs-poorly-with-Intel-MPI-Library-to-Version/m-p/1311018#M8708</link>
      <description>&lt;P&gt;This is a known issue, and your regression report should help the developers fix it. &lt;/P&gt;&lt;BR /&gt;</description>
      <pubDate>Tue, 31 Aug 2021 04:33:41 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/MPI-Alltoallw-performs-poorly-with-Intel-MPI-Library-to-Version/m-p/1311018#M8708</guid>
      <dc:creator>Jennifer_D_Intel</dc:creator>
      <dc:date>2021-08-31T04:33:41Z</dc:date>
    </item>
    <item>
      <title>Re: MPI_Alltoallw performs poorly with Intel MPI Library to Version 2021.2</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/MPI-Alltoallw-performs-poorly-with-Intel-MPI-Library-to-Version/m-p/1351501#M9099</link>
      <description>&lt;P&gt;Will the 2022.1 version be made available as standalone components? At the moment I cannot see them at &lt;A href="https://www.intel.com/content/www/us/en/developer/articles/tool/oneapi-standalone-components.html" target="_blank"&gt;https://www.intel.com/content/www/us/en/developer/articles/tool/oneapi-standalone-components.html&lt;/A&gt;. The oneAPI Base Toolkit offline installer is not working for me (it is stuck on "Wait while the installer is preparing..."), whereas I usually have no problems with the individual components.&lt;/P&gt;</description>
      <pubDate>Fri, 14 Jan 2022 14:07:40 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/MPI-Alltoallw-performs-poorly-with-Intel-MPI-Library-to-Version/m-p/1351501#M9099</guid>
      <dc:creator>SVDB</dc:creator>
      <dc:date>2022-01-14T14:07:40Z</dc:date>
    </item>
    <item>
      <title>Re:MPI_Alltoallw performs poorly with Intel MPI Library to Version 2021.2</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/MPI-Alltoallw-performs-poorly-with-Intel-MPI-Library-to-Version/m-p/1367322#M9276</link>
      <description>&lt;P&gt;Hi Steven,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Yes, Intel® MPI Library (version 2021.5) is available as a standalone component. It is available at the link you posted earlier and is also included in the Intel® oneAPI HPC Toolkit (version 2022.1). In addition, the required changes for the reported regression of MPI_Alltoallw will not be ready in time for the upcoming Intel® MPI Library release, version 2021.6. Thank you very much for your patience.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Best,&lt;/P&gt;&lt;P&gt;Xiao&lt;/P&gt;&lt;BR /&gt;</description>
      <pubDate>Wed, 09 Mar 2022 22:12:55 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/MPI-Alltoallw-performs-poorly-with-Intel-MPI-Library-to-Version/m-p/1367322#M9276</guid>
      <dc:creator>Xiao_Z_Intel</dc:creator>
      <dc:date>2022-03-09T22:12:55Z</dc:date>
    </item>
    <item>
      <title>Re:MPI_Alltoallw performs poorly with Intel MPI Library to Version 2021.2</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/MPI-Alltoallw-performs-poorly-with-Intel-MPI-Library-to-Version/m-p/1371474#M9348</link>
      <description>&lt;P&gt;Hi Steven,&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;Please refer to the Intel® MPI Library Release Notes for the fix for the reported regression (&lt;A href="https://www.intel.com/content/www/us/en/developer/articles/release-notes/mpi-library-release-notes.html" rel="noopener noreferrer" target="_blank"&gt;https://www.intel.com/content/www/us/en/developer/articles/release-notes/mpi-library-release-notes.html&lt;/A&gt;). I have also addressed your question about the availability of the standalone Intel® MPI Library. We will no longer respond to this thread. If you require additional assistance from Intel, please start a new thread. Any further interaction in this thread will be considered community only.&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;Best,&lt;/P&gt;&lt;P&gt;Xiao&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;BR /&gt;</description>
      <pubDate>Wed, 23 Mar 2022 20:20:12 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/MPI-Alltoallw-performs-poorly-with-Intel-MPI-Library-to-Version/m-p/1371474#M9348</guid>
      <dc:creator>Xiao_Z_Intel</dc:creator>
      <dc:date>2022-03-23T20:20:12Z</dc:date>
    </item>
  </channel>
</rss>

