<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Stream Benchmark in Software Tuning, Performance Optimization &amp; Platform Monitoring</title>
    <link>https://community.intel.com/t5/Software-Tuning-Performance/Stream-Benchmark/m-p/1730399#M8590</link>
    <description>&lt;P&gt;I am having trouble reaching 60%+ of hardware peak on an Intel Xeon Platinum 8558. Does anyone have an idea how to achieve 80% of 44 instead of the current 20?&lt;/P&gt;</description>
    <pubDate>Tue, 16 Dec 2025 12:18:32 GMT</pubDate>
    <dc:creator>VictorSG</dc:creator>
    <dc:date>2025-12-16T12:18:32Z</dc:date>
    <item>
      <title>Stream Benchmark</title>
      <link>https://community.intel.com/t5/Software-Tuning-Performance/Stream-Benchmark/m-p/1029736#M4245</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;

&lt;P&gt;I ran the STREAM benchmark on the login node of our cluster today. The node has an Intel&lt;SUP&gt;®&lt;/SUP&gt;&amp;nbsp;Xeon&lt;SUP&gt;®&lt;/SUP&gt;&amp;nbsp;Processor E5-4650 with 4 x 8 cores.&lt;/P&gt;

&lt;P&gt;I measured multiple configurations, shown in the two attached pictures. When I set &lt;SPAN class="s1"&gt;STREAM_ARRAY_SIZE&lt;/SPAN&gt; to 40e6 and sweep &lt;SPAN class="s1"&gt;OMP_NUM_THREADS&lt;/SPAN&gt; from 1 to 64, I get very noisy results. I expected this because of NUMA issues. But I measured the same configuration 6 times and got completely different results each time (shown in the second picture).&lt;/P&gt;

&lt;P&gt;Then I adjusted the array size per thread by changing &lt;SPAN class="s1"&gt;STREAM_ARRAY_SIZE&lt;/SPAN&gt; (at compile time) and &lt;SPAN class="s1"&gt;OMP_NUM_THREADS&lt;/SPAN&gt; at the same time. This gives me different results than keeping the array size fixed.&lt;/P&gt;

&lt;P&gt;What is the right way to take measurements when sweeping the number of threads? Why is the difference between measurements of the same configuration so large? Benchmark results should be reproducible, so I guess I am doing something wrong.&lt;/P&gt;

&lt;P&gt;Regards,&lt;BR /&gt;
	- Grischa&lt;/P&gt;</description>
      <pubDate>Wed, 16 Sep 2015 15:54:44 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Tuning-Performance/Stream-Benchmark/m-p/1029736#M4245</guid>
      <dc:creator>Grischa_J_</dc:creator>
      <dc:date>2015-09-16T15:54:44Z</dc:date>
    </item>
    <item>
      <title>You should be reading John</title>
      <link>https://community.intel.com/t5/Software-Tuning-Performance/Stream-Benchmark/m-p/1029737#M4246</link>
      <description>&lt;P&gt;Read John McCalpin's advice first; there is too much useful material to repeat here.&lt;/P&gt;

&lt;P&gt;Non-repeatable results are likely when you don't set affinity, e.g. OMP_PLACES=cores (when not over-subscribing), plus a setting to distribute threads evenly among CPUs (if that is what you want). Expect results to degrade when over-subscribing (HyperThreads don't help bandwidth).&lt;/P&gt;</description>
      <pubDate>Wed, 16 Sep 2015 16:28:00 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Tuning-Performance/Stream-Benchmark/m-p/1029737#M4246</guid>
      <dc:creator>TimP</dc:creator>
      <dc:date>2015-09-16T16:28:00Z</dc:date>
    </item>
    <item>
      <title>Given the capabilities of the</title>
      <link>https://community.intel.com/t5/Software-Tuning-Performance/Stream-Benchmark/m-p/1029738#M4247</link>
      <description>&lt;P&gt;Given the capabilities of the Linux OS and the capabilities of OpenMP, getting repeatable STREAM results requires a bit of extra work.&amp;nbsp;&amp;nbsp; This is frustrating, but once you get used to forcing thread binding the reward of repeatable performance makes it worthwhile.&lt;/P&gt;

&lt;P&gt;I published results on a similar system in January of 2013 -- the STREAM submission contains extra instructions on exactly how to compile and run the job:&amp;nbsp; &lt;A href="http://www.cs.virginia.edu/stream/stream_mail/2013/0000.html" target="_blank"&gt;http://www.cs.virginia.edu/stream/stream_mail/2013/0000.html&lt;/A&gt;&lt;/P&gt;

&lt;P&gt;Repeating the instructions here -- pay extra attention to the lines in &lt;STRONG&gt;bold&lt;/STRONG&gt;.&lt;/P&gt;

&lt;UL&gt;
	&lt;LI&gt;System: Dell PowerEdge 820 --- one of the "large memory" nodes in the TACC Stampede system (c400-101 in the current configuration)&lt;/LI&gt;
	&lt;LI&gt;Processors: 4 Intel Xeon E5-4650 (2.70 GHz)&lt;/LI&gt;
	&lt;LI&gt;Memory: 1024 GB DDR3/1333 (32 DIMMs of 32 GB each)&lt;/LI&gt;
	&lt;LI&gt;O/S: RHEL6.3 (2.6.32-279.el6.x86_64)&lt;/LI&gt;
	&lt;LI&gt;Compiler: Intel icc (ICC) 13.0.1 20121010&lt;/LI&gt;
	&lt;LI&gt;Compile Flags: &lt;STRONG&gt;-xAVX -O3 -ffreestanding -openmp -mcmodel=medium -DVERBOSE -DSTREAM_TYPE=double -DSTREAM_ARRAY_SIZE=30000000000&lt;/STRONG&gt;&lt;/LI&gt;
	&lt;LI&gt;Runtime environment: &lt;STRONG&gt;KMP_AFFINITY=compact, OMP_NUM_THREADS=32&lt;/STRONG&gt;&lt;/LI&gt;
	&lt;LI&gt;Execution: &lt;STRONG&gt;numactl -l ./stream.snb_O3_freestanding_double.30000M&lt;/STRONG&gt;&lt;/LI&gt;
&lt;/UL&gt;
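
&lt;P&gt;For a thread-count sweep, the runtime settings listed above can be applied to every run by a small driver. The following is only a minimal sketch: the function name stream_run, the binary path ./stream, and the list of thread counts are illustrative assumptions, and the sketch only prints the command for each run instead of launching it.&lt;/P&gt;

```python
import os


def stream_run(threads, binary="./stream", hyperthreading=False):
    """Build (env, argv) for one pinned STREAM run.

    binary is a placeholder path (an assumption); the affinity values and
    the numactl option are the ones given in the list above.
    """
    env = dict(os.environ)
    env["OMP_NUM_THREADS"] = str(threads)
    # "compact" with HyperThreading disabled, "scatter" with it enabled.
    env["KMP_AFFINITY"] = "scatter" if hyperthreading else "compact"
    # numactl -l requests allocation on the NUMA node local to each thread.
    argv = ["numactl", "-l", binary]
    return env, argv


# Hypothetical sweep over thread counts; prints one command line per run.
for threads in (1, 2, 4, 8, 16, 32):
    env, argv = stream_run(threads)
    print(f"OMP_NUM_THREADS={env['OMP_NUM_THREADS']} "
          f"KMP_AFFINITY={env['KMP_AFFINITY']} {' '.join(argv)}")
    # To launch for real: subprocess.run(argv, env=env, check=True)
```

&lt;P&gt;Keeping the binding fixed while only the thread count changes is what makes the points of the sweep comparable. Note that with compact placement the small thread counts should fill one socket before using the next, so the early part of the curve reflects a single socket.&lt;/P&gt;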

&lt;P&gt;Comments:&lt;BR /&gt;
	1. This used the new version of stream.c (revision 5.10)&lt;BR /&gt;
	2. Results were nearly identical for all suitably large array sizes (100 million to 30 billion elements per array)&lt;BR /&gt;
	3. The compiler flag -ffreestanding prevents the compiler from replacing the STREAM Copy kernel with a call to a library routine.&lt;BR /&gt;
	4. On the Xeon E5-4650, the use of streaming stores does not change STREAM performance — results were identical when compiled with -opt-streaming-stores never&lt;BR /&gt;
	5. Reported bandwidth was essentially identical when compiled for 32-bit arrays with -DSTREAM_TYPE=float.&lt;BR /&gt;
	6. &lt;STRONG&gt;IMPORTANT: If HyperThreading is enabled, switch the first KMP_AFFINITY value from "compact" to "scatter".&lt;/STRONG&gt;&lt;BR /&gt;
	7. The array size used in these results does not meet the minimum size for the STREAM run rules, so please continue to use at least 40M for this system.&amp;nbsp; I will show below that it does not make any difference on this particular platform, so I did not feel any need to update the published results.&lt;BR /&gt;
	&amp;nbsp;&lt;/P&gt;

&lt;P&gt;Assuming that your system's memory is configured for 1333 MHz operation (which is the fastest available for the memory configuration on my system), you should get results very similar to the ones I published:&lt;/P&gt;

&lt;BLOCKQUOTE&gt;
	&lt;P&gt;-------------------------------------------------------------&lt;BR /&gt;
		Function&amp;nbsp;&amp;nbsp;&amp;nbsp; Best Rate MB/s&amp;nbsp; Avg time&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; Min time&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; Max time&lt;BR /&gt;
		Copy:&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 75539.6505&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 6.3555&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 6.3543&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 6.3566&lt;BR /&gt;
		Scale:&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 75749.1105&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 6.3383&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 6.3367&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 6.3405&lt;BR /&gt;
		Add:&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 83372.5882&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 8.6375&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 8.6359&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 8.6389&lt;BR /&gt;
		Triad:&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 83381.4416&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 8.6372&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 8.6350&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 8.6384&lt;BR /&gt;
		-------------------------------------------------------------&lt;/P&gt;
&lt;/BLOCKQUOTE&gt;
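
&lt;P&gt;As a cross-check, the Best Rate column can be recomputed from the Min time column. STREAM counts 16 bytes per element for Copy and Scale and 24 bytes per element for Add and Triad when STREAM_TYPE is double, and reports bytes divided by the minimum time in units of 10^6 bytes per second. The sketch below assumes the array size from the compile flags (30,000,000,000 elements); the function and variable names are illustrative only. Because the printed times are rounded to four decimals, the recomputed rates agree with the table only to within rounding.&lt;/P&gt;

```python
# Bytes moved per array element by each STREAM kernel for 8-byte doubles:
# Copy and Scale read one array and write one; Add and Triad read two
# arrays and write one.
BYTES_PER_ELEMENT = {"Copy": 16, "Scale": 16, "Add": 24, "Triad": 24}


def stream_rate_mb_s(kernel, array_size, min_time_s):
    """Return the STREAM rate in MB/s, where 1 MB is 10**6 bytes."""
    bytes_moved = BYTES_PER_ELEMENT[kernel] * array_size
    return bytes_moved / min_time_s / 1.0e6


# Array size taken from -DSTREAM_ARRAY_SIZE in the compile flags.
ARRAY_SIZE = 30_000_000_000
# Min times copied from the results table above.
MIN_TIMES = {"Copy": 6.3543, "Scale": 6.3367, "Add": 8.6359, "Triad": 8.6350}

for kernel, min_time in MIN_TIMES.items():
    rate = stream_rate_mb_s(kernel, ARRAY_SIZE, min_time)
    print(f"{kernel:6s} {rate:12.1f} MB/s")
```

&lt;P&gt;The same function applies to any other STREAM output, given the array size used for that run.&lt;/P&gt;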

&lt;P&gt;The array size here (30 million) is a bit smaller than the minimum called for by the STREAM run rules, but later tests showed that there is no significant change for larger sizes, so I did not bother to update the submission.&amp;nbsp;&amp;nbsp; Here is an example of the variation in performance as I change the array size from 1/2 of the minimum required size to 250 times the minimum required size:&lt;/P&gt;

&lt;BLOCKQUOTE&gt;
	&lt;P&gt;Function&amp;nbsp;&amp;nbsp;&amp;nbsp; Best Rate MB/s&amp;nbsp; Avg time&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; Min time&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; Max time&lt;/P&gt;

	&lt;P&gt;20M:Triad:&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 84598.1141&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.0057&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.0057&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.0057&lt;BR /&gt;
		24M:Triad:&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 84932.9972&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.0068&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.0068&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.0068&lt;/P&gt;

	&lt;P&gt;30M:Triad:&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 85216.4027&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.0085&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.0084&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.0085&lt;/P&gt;

	&lt;P&gt;100M:Triad:&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 85607.5042&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.0281&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.0280&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.0282&lt;/P&gt;

	&lt;P&gt;200M:Triad:&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 85708.0912&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.0560&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.0560&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.0561&lt;/P&gt;

	&lt;P&gt;400M:Triad:&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 85795.5676&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.1120&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.1119&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.1120&lt;/P&gt;

	&lt;P&gt;600M:Triad:&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 85756.6469&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.1680&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.1679&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.1682&lt;/P&gt;

	&lt;P&gt;800M:Triad:&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 85774.0929&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.2240&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.2238&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.2240&lt;/P&gt;

	&lt;P&gt;1000M:Triad:&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 85743.3525&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.2800&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.2799&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.2802&lt;/P&gt;

	&lt;P&gt;2000M:Triad:&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 85760.2268&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.5599&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.5597&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.5601&lt;/P&gt;

	&lt;P&gt;4000M:Triad:&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 85771.4984&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 1.1198&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 1.1193&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 1.1229&lt;/P&gt;

	&lt;P&gt;6000M:Triad:&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 85756.7564&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 1.6795&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 1.6792&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 1.6802&lt;/P&gt;

	&lt;P&gt;10000M:Triad:&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 85755.2369&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 2.7992&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 2.7987&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 2.7997&lt;/P&gt;

&lt;/BLOCKQUOTE&gt;</description>
      <pubDate>Wed, 16 Sep 2015 19:18:51 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Tuning-Performance/Stream-Benchmark/m-p/1029738#M4247</guid>
      <dc:creator>McCalpinJohn</dc:creator>
      <dc:date>2015-09-16T19:18:51Z</dc:date>
    </item>
    <item>
      <title>Can someone point me to</title>
      <link>https://community.intel.com/t5/Software-Tuning-Performance/Stream-Benchmark/m-p/1029739#M4248</link>
      <description>&lt;P&gt;Can someone point me to "Stream v0.15.20190814" ?&lt;/P&gt;</description>
      <pubDate>Thu, 03 Oct 2019 16:54:33 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Tuning-Performance/Stream-Benchmark/m-p/1029739#M4248</guid>
      <dc:creator>Mauricio_M_Intel</dc:creator>
      <dc:date>2019-10-03T16:54:33Z</dc:date>
    </item>
    <item>
      <title>Ummm..... Never heard of it..</title>
      <link>https://community.intel.com/t5/Software-Tuning-Performance/Stream-Benchmark/m-p/1029740#M4249</link>
      <description>&lt;P&gt;Ummm..... Never heard of it.....&amp;nbsp;&amp;nbsp; Maybe an internal engineering version?&lt;/P&gt;&lt;P&gt;The official source distribution is still &lt;A href="http://www.cs.virginia.edu/stream/FTP/Code/stream.c" target="_blank"&gt;http://www.cs.virginia.edu/stream/FTP/Code/stream.c&lt;/A&gt; (revision 5.10).&amp;nbsp;&amp;nbsp; The next revision will be nearly identical -- I have added one extra OpenMP pragma to parallelize the results checking in checkSTREAMresults() and have increased the default array size from 10 million elements to 80 million elements.&amp;nbsp; I am still undecided about adding more "#ifdef" blocks to enable new OpenMP features (like processor binding) that only exist in OpenMP version 4 and later compilers/runtimes.&lt;/P&gt;</description>
      <pubDate>Thu, 03 Oct 2019 22:01:14 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Tuning-Performance/Stream-Benchmark/m-p/1029740#M4249</guid>
      <dc:creator>McCalpinJohn</dc:creator>
      <dc:date>2019-10-03T22:01:14Z</dc:date>
    </item>
    <item>
      <title>Re: Stream Benchmark</title>
      <link>https://community.intel.com/t5/Software-Tuning-Performance/Stream-Benchmark/m-p/1730399#M8590</link>
      <description>&lt;P&gt;I am having trouble reaching 60%+ of hardware peak on an Intel Xeon Platinum 8558. Does anyone have an idea how to achieve 80% of 44 instead of the current 20?&lt;/P&gt;</description>
      <pubDate>Tue, 16 Dec 2025 12:18:32 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Tuning-Performance/Stream-Benchmark/m-p/1730399#M8590</guid>
      <dc:creator>VictorSG</dc:creator>
      <dc:date>2025-12-16T12:18:32Z</dc:date>
    </item>
  </channel>
</rss>

