<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Per-rank timing heterogeneity in MPI_File_Write_at_all in Intel® MPI Library</title>
    <link>https://community.intel.com/t5/Intel-MPI-Library/Per-rank-timing-heterogeneity-in-MPI-File-Write-at-all/m-p/1280208#M8258</link>
    <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;Could you please provide the sample reproducer code requested above?&lt;/P&gt;&lt;P&gt;Thanks &amp;amp; Regards,&lt;/P&gt;&lt;P&gt;Santosh&lt;/P&gt;</description>
    <pubDate>Mon, 10 May 2021 11:36:17 GMT</pubDate>
    <dc:creator>SantoshY_Intel</dc:creator>
    <dc:date>2021-05-10T11:36:17Z</dc:date>
    <item>
      <title>Per-rank timing heterogeneity in MPI_File_Write_at_all</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/Per-rank-timing-heterogeneity-in-MPI-File-Write-at-all/m-p/1275458#M8132</link>
      <description>Hi,
   I ran a timing test with 16 ranks, each writing 500 MB, from a single Skylake node to a Lustre PFS. The test code is simple C (which I can provide), built with Intel 2020.1.217. What I see is:
Rank 1 has diff 1.749272
Rank 8 has diff 1.764557
Rank 11 has diff 1.777109
Rank 4 has diff 1.782356
Rank 6 has diff 1.833384
Rank 15 has diff 1.858618
Rank 0 has diff 3.101054
Rank 10 has diff 4.237715
Rank 12 has diff 4.288582
Rank 7 has diff 4.291451
Rank 3 has diff 4.295812
Rank 14 has diff 4.302241
Rank 5 has diff 4.339086
Rank 13 has diff 5.141735
Rank 9 has diff 5.141858
Rank 2 has diff 5.209390

With another MPI implementation, I see
Rank 0 has diff 7.493546
Rank 10 has diff 7.493548
Rank 13 has diff 7.493549
Rank 14 has diff 7.493545
Rank 15 has diff 7.493545
Rank 1 has diff 7.493545
Rank 2 has diff 7.493544
Rank 3 has diff 7.493544
Rank 4 has diff 7.493544
Rank 5 has diff 7.493546
Rank 6 has diff 7.493545
Rank 7 has diff 7.493546
Rank 8 has diff 7.493552
Rank 9 has diff 7.493545
Rank 11 has diff 7.493548
Rank 12 has diff 7.493545

My question is: why is Intel's timing so heterogeneous? The two implementations are clearly using different algorithms, but Intel appears to be getting better timings somehow (through buffering, scheduling, or something else?).
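
For anyone comparing the two listings above, here is a minimal sketch (not part of the original test code) that summarizes the per-rank numbers. The times are copied from the listings; the helper name `summarize` and the labels are made up for illustration. It assumes the job as a whole is only done when the slowest rank returns, so the maximum is the figure to compare between implementations, and the max-minus-min spread is one simple way to quantify the heterogeneity.

```python
# Per-rank elapsed times in seconds, copied from the two listings in this post.
intel = {
    1: 1.749272, 8: 1.764557, 11: 1.777109, 4: 1.782356,
    6: 1.833384, 15: 1.858618, 0: 3.101054, 10: 4.237715,
    12: 4.288582, 7: 4.291451, 3: 4.295812, 14: 4.302241,
    5: 4.339086, 13: 5.141735, 9: 5.141858, 2: 5.209390,
}
other = {
    0: 7.493546, 10: 7.493548, 13: 7.493549, 14: 7.493545,
    15: 7.493545, 1: 7.493545, 2: 7.493544, 3: 7.493544,
    4: 7.493544, 5: 7.493546, 6: 7.493545, 7: 7.493546,
    8: 7.493552, 9: 7.493545, 11: 7.493548, 12: 7.493545,
}


def summarize(times):
    """Return rank count, fastest, slowest, and spread for one run (hypothetical helper)."""
    values = list(times.values())
    fastest = min(values)
    slowest = max(values)
    return {
        "ranks": len(values),
        "fastest": fastest,
        "slowest": slowest,          # assumed to bound the job's wall time
        "spread": slowest - fastest,  # simple measure of heterogeneity
    }


for label, times in (("Intel MPI", intel), ("other MPI", other)):
    s = summarize(times)
    print(
        f"{label}: {s['ranks']} ranks, fastest {s['fastest']:.6f} s, "
        f"slowest {s['slowest']:.6f} s, spread {s['spread']:.6f} s"
    )
```

On these numbers the slowest Intel rank takes about 5.21 s against about 7.49 s for every rank in the other run, so the heterogeneous run still finishes sooner overall.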

Thanks; Chris</description>
      <pubDate>Wed, 21 Apr 2021 15:48:40 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/Per-rank-timing-heterogeneity-in-MPI-File-Write-at-all/m-p/1275458#M8132</guid>
      <dc:creator>4f0drlp7eyj3</dc:creator>
      <dc:date>2021-04-21T15:48:40Z</dc:date>
    </item>
    <item>
      <title>Re: Per-rank timing heterogeneity in MPI_File_Write_at_all</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/Per-rank-timing-heterogeneity-in-MPI-File-Write-at-all/m-p/1275459#M8133</link>
      <description>&lt;P&gt;Sorry, hate this board. Let's try for better formatting.&lt;/P&gt;
&lt;P&gt;Hi, I ran a timing test with 16 ranks, each writing 500 MB, from a single Skylake node to a Lustre PFS. The test code is simple C (which I can provide), built with Intel 2020.1.217. What I see is:&lt;/P&gt;
&lt;P&gt;Rank 1 has diff 1.749272&lt;BR /&gt;Rank 8 has diff 1.764557&lt;BR /&gt;Rank 11 has diff 1.777109 &lt;BR /&gt;Rank 4 has diff 1.782356 &lt;BR /&gt;Rank 6 has diff 1.833384 &lt;BR /&gt;Rank 15 has diff 1.858618 &lt;BR /&gt;Rank 0 has diff 3.101054 &lt;BR /&gt;Rank 10 has diff 4.237715 &lt;BR /&gt;Rank 12 has diff 4.288582 &lt;BR /&gt;Rank 7 has diff 4.291451 &lt;BR /&gt;Rank 3 has diff 4.295812 &lt;BR /&gt;Rank 14 has diff 4.302241 &lt;BR /&gt;Rank 5 has diff 4.339086 &lt;BR /&gt;Rank 13 has diff 5.141735 &lt;BR /&gt;Rank 9 has diff 5.141858 &lt;BR /&gt;Rank 2 has diff 5.209390&lt;/P&gt;
&lt;P&gt;With another MPI implementation, I see &lt;BR /&gt;Rank 0 has diff 7.493546 &lt;BR /&gt;Rank 10 has diff 7.493548 &lt;BR /&gt;Rank 13 has diff 7.493549 &lt;BR /&gt;Rank 14 has diff 7.493545 &lt;BR /&gt;Rank 15 has diff 7.493545 &lt;BR /&gt;Rank 1 has diff 7.493545 &lt;BR /&gt;Rank 2 has diff 7.493544 &lt;BR /&gt;Rank 3 has diff 7.493544 &lt;BR /&gt;Rank 4 has diff 7.493544 &lt;BR /&gt;Rank 5 has diff 7.493546 &lt;BR /&gt;Rank 6 has diff 7.493545 &lt;BR /&gt;Rank 7 has diff 7.493546 &lt;BR /&gt;Rank 8 has diff 7.493552 &lt;BR /&gt;Rank 9 has diff 7.493545 &lt;BR /&gt;Rank 11 has diff 7.493548 &lt;BR /&gt;Rank 12 has diff 7.493545&lt;/P&gt;
&lt;P&gt;My question is: why is Intel's timing so heterogeneous? The two implementations are clearly using different algorithms, but Intel appears to be getting better timings somehow (through buffering, scheduling, or something else?).&lt;/P&gt;
&lt;P&gt;Thanks; Chris&lt;/P&gt;</description>
      <pubDate>Wed, 21 Apr 2021 15:51:46 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/Per-rank-timing-heterogeneity-in-MPI-File-Write-at-all/m-p/1275459#M8133</guid>
      <dc:creator>4f0drlp7eyj3</dc:creator>
      <dc:date>2021-04-21T15:51:46Z</dc:date>
    </item>
    <item>
      <title>Re: Per-rank timing heterogeneity in MPI_File_Write_at_all</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/Per-rank-timing-heterogeneity-in-MPI-File-Write-at-all/m-p/1275819#M8142</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;
&lt;P&gt;Thanks for reaching out to us.&lt;/P&gt;
&lt;P&gt;&lt;I&gt;&amp;gt;&amp;gt;"&amp;nbsp;I ran a timing test for 16 ranks writing 500 MB each from a single Skylake node to a Lustre PFS. Simple C test code (which I can provide), built with Intel 2020.1.217."&lt;/I&gt;&lt;/P&gt;
&lt;P&gt;Could you please share a sample reproducer?&lt;/P&gt;
&lt;P&gt;Thanks &amp;amp; Regards,&lt;/P&gt;
&lt;P&gt;Santosh&lt;/P&gt;</description>
      <pubDate>Thu, 22 Apr 2021 12:27:33 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/Per-rank-timing-heterogeneity-in-MPI-File-Write-at-all/m-p/1275819#M8142</guid>
      <dc:creator>SantoshY_Intel</dc:creator>
      <dc:date>2021-04-22T12:27:33Z</dc:date>
    </item>
    <item>
      <title>Re: Per-rank timing heterogeneity in MPI_File_Write_at_all</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/Per-rank-timing-heterogeneity-in-MPI-File-Write-at-all/m-p/1280208#M8258</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;Could you please provide the sample reproducer code requested above?&lt;/P&gt;&lt;P&gt;Thanks &amp;amp; Regards,&lt;/P&gt;&lt;P&gt;Santosh&lt;/P&gt;</description>
      <pubDate>Mon, 10 May 2021 11:36:17 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/Per-rank-timing-heterogeneity-in-MPI-File-Write-at-all/m-p/1280208#M8258</guid>
      <dc:creator>SantoshY_Intel</dc:creator>
      <dc:date>2021-05-10T11:36:17Z</dc:date>
    </item>
    <item>
      <title>Re: Per-rank timing heterogeneity in MPI_File_Write_at_all</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/Per-rank-timing-heterogeneity-in-MPI-File-Write-at-all/m-p/1292042#M8482</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;Since we have worked with you internally on this issue, we are closing this thread as requested.&lt;/P&gt;&lt;P&gt;Thanks &amp;amp; Regards,&lt;/P&gt;&lt;P&gt;Santosh&lt;/P&gt;</description>
      <pubDate>Tue, 22 Jun 2021 11:13:53 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/Per-rank-timing-heterogeneity-in-MPI-File-Write-at-all/m-p/1292042#M8482</guid>
      <dc:creator>SantoshY_Intel</dc:creator>
      <dc:date>2021-06-22T11:13:53Z</dc:date>
    </item>
  </channel>
</rss>

