<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Measuring theoretical flops for icelake in Software Tuning, Performance Optimization &amp; Platform Monitoring</title>
    <link>https://community.intel.com/t5/Software-Tuning-Performance/Measuring-theoretical-flops-for-icelake/m-p/1351285#M8006</link>
    <description>&lt;P&gt;The biggest problem with computing "peak" performance for recent processors is knowing what value to use for the frequency.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The nominal frequency on the Xeon Platinum 8358 is 2.6 GHz. &amp;nbsp;When running AVX512 code (required to get 32 FLOPS/cycle/core), the base frequency is 1.9 GHz and the maximum all-core Turbo frequency is 2.9 GHz. &amp;nbsp;The actual frequency seen when running a "peak FLOPS" sort of benchmark will depend on the leakage current of the particular chip and the effectiveness of the cooling system.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;32 FLOPS/cycle/core * 32 cores * 1.9 GHz * 2 sockets = 3891.2 GFLOPS&lt;/P&gt;
&lt;P&gt;32 FLOPS/cycle/core * 32 cores * 2.9 GHz * 2 sockets = 5939.2 GFLOPS&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Based on experience with SKX and CLX processors, I expect you would see average frequencies for the HPL benchmark in the range of 2.2 GHz to 2.6 GHz across an ensemble of systems with this configuration. &amp;nbsp;HPL performance will be in the neighborhood of 90% of peak based on the actual average frequency during the run. &amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The Xeon Gold 6338 will use the same procedure. &amp;nbsp;With SKX and CLX you had to check to see whether the processor had 1 or 2 AVX512 FMA units. &amp;nbsp;It looks like for ICX all of the processors have 2, so 32 FLOPS/cycle/core should work for all models.&lt;/P&gt;</description>
    <pubDate>Thu, 13 Jan 2022 21:12:04 GMT</pubDate>
    <dc:creator>McCalpinJohn</dc:creator>
    <dc:date>2022-01-13T21:12:04Z</dc:date>
    <item>
      <title>Measuring theoretical flops for icelake</title>
      <link>https://community.intel.com/t5/Software-Tuning-Performance/Measuring-theoretical-flops-for-icelake/m-p/1351110#M8004</link>
      <description>&lt;P&gt;Hi,&lt;BR /&gt;I have a few servers, each equipped with dual Ice Lake Xeon Platinum 8358&amp;nbsp;processors.&lt;BR /&gt;I would like to know whether the following is the correct method to calculate theoretical peak double-precision FLOPS (Rpeak):&lt;/P&gt;
&lt;P&gt;=&amp;nbsp; cores/socket * sockets * frequency * ops/cycle * elements/ops * vector registers per core&lt;/P&gt;
&lt;P&gt;= 32 * 2 * 2.6 * 2 * ( 512 register size / 64 bits DP ) * 2&lt;/P&gt;
&lt;P&gt;=&amp;nbsp;32 * 2 * 2.6 * 2 * 8 * 2&lt;/P&gt;
&lt;P&gt;= 2662.4 * 2&lt;BR /&gt;= 5324.8 GFLOPS&lt;/P&gt;
&lt;P&gt;&lt;BR /&gt;Also, will there be any difference apart from frequency and cores/socket if I calculate FLOPS for the Xeon Gold 6338?&lt;/P&gt;</description>
      <pubDate>Thu, 13 Jan 2022 12:11:03 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Tuning-Performance/Measuring-theoretical-flops-for-icelake/m-p/1351110#M8004</guid>
      <dc:creator>psing51</dc:creator>
      <dc:date>2022-01-13T12:11:03Z</dc:date>
    </item>
    <item>
      <title>Re: Measuring theoretical flops for icelake</title>
      <link>https://community.intel.com/t5/Software-Tuning-Performance/Measuring-theoretical-flops-for-icelake/m-p/1351285#M8006</link>
      <description>&lt;P&gt;The biggest problem with computing "peak" performance for recent processors is knowing what value to use for the frequency.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The nominal frequency on the Xeon Platinum 8358 is 2.6 GHz. &amp;nbsp;When running AVX512 code (required to get 32 FLOPS/cycle/core), the base frequency is 1.9 GHz and the maximum all-core Turbo frequency is 2.9 GHz. &amp;nbsp;The actual frequency seen when running a "peak FLOPS" sort of benchmark will depend on the leakage current of the particular chip and the effectiveness of the cooling system.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;32 FLOPS/cycle/core * 32 cores * 1.9 GHz * 2 sockets = 3891.2 GFLOPS&lt;/P&gt;
&lt;P&gt;32 FLOPS/cycle/core * 32 cores * 2.9 GHz * 2 sockets = 5939.2 GFLOPS&lt;/P&gt;
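&lt;P&gt;The two products above can be reproduced with a minimal Python sketch. The function name peak_gflops and the constant FLOPS_PER_CYCLE are illustrative, not from any library, and the breakdown of 32 FLOPS/cycle/core assumes 2 AVX512 FMA units per core, 2 FLOPS per FMA (one multiply, one add), and 8 doubles per 512-bit vector.&lt;/P&gt;

```python
def peak_gflops(flops_per_cycle_per_core, cores_per_socket, ghz, sockets):
    """Peak GFLOPS = FLOPS/cycle/core * cores/socket * GHz * sockets."""
    return flops_per_cycle_per_core * cores_per_socket * ghz * sockets


# Assumed breakdown: 2 FMA units * 2 FLOPS per FMA * 8 doubles per vector = 32
FLOPS_PER_CYCLE = 2 * 2 * (512 // 64)

# Frequencies quoted in the post for the Xeon Platinum 8358 running AVX512 code
for label, ghz in [("AVX512 base", 1.9), ("AVX512 max all-core Turbo", 2.9)]:
    gflops = peak_gflops(FLOPS_PER_CYCLE, 32, ghz, 2)
    print(f"{label}: {gflops:.1f} GFLOPS")
```

&lt;P&gt;Swapping in a different core count or frequency gives the corresponding bounds for another model, provided it also has 2 AVX512 FMA units per core.&lt;/P&gt;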
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Based on experience with SKX and CLX processors, I expect you would see average frequencies for the HPL benchmark in the range of 2.2 GHz to 2.6 GHz across an ensemble of systems with this configuration. &amp;nbsp;HPL performance will be in the neighborhood of 90% of peak based on the actual average frequency during the run. &amp;nbsp;&lt;/P&gt;
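&lt;P&gt;As a rough illustration of that estimate, the sketch below applies an assumed HPL efficiency of exactly 0.90 to the peak at the two ends of the expected frequency range. The helper name hpl_estimate_gflops is hypothetical, and real results will vary with the chip, the cooling, and the HPL configuration.&lt;/P&gt;

```python
FLOPS_PER_CYCLE = 32    # per core, double precision, as stated in the post
CORES_PER_SOCKET = 32   # Xeon Platinum 8358
SOCKETS = 2
HPL_EFFICIENCY = 0.90   # assumption: "neighborhood of 90%" taken as exactly 0.90


def hpl_estimate_gflops(avg_ghz):
    """Estimate HPL GFLOPS from the average core frequency during the run."""
    peak = FLOPS_PER_CYCLE * CORES_PER_SOCKET * SOCKETS * avg_ghz
    return HPL_EFFICIENCY * peak


# Ends of the average-frequency range expected in the post
for ghz in (2.2, 2.6):
    print(f"{ghz} GHz average: about {hpl_estimate_gflops(ghz):.0f} GFLOPS")
```

&lt;P&gt;Under these assumptions the estimate falls between roughly 4055 and 4792 GFLOPS for the dual-socket system.&lt;/P&gt;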
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The Xeon Gold 6338 will use the same procedure. &amp;nbsp;With SKX and CLX you had to check to see whether the processor had 1 or 2 AVX512 FMA units. &amp;nbsp;It looks like for ICX all of the processors have 2, so 32 FLOPS/cycle/core should work for all models.&lt;/P&gt;</description>
      <pubDate>Thu, 13 Jan 2022 21:12:04 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Tuning-Performance/Measuring-theoretical-flops-for-icelake/m-p/1351285#M8006</guid>
      <dc:creator>McCalpinJohn</dc:creator>
      <dc:date>2022-01-13T21:12:04Z</dc:date>
    </item>
  </channel>
</rss>

