<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic &amp;gt;&amp;gt;&amp;gt;Yeah, many utilities raise in Software Tuning, Performance Optimization &amp; Platform Monitoring</title>
    <link>https://community.intel.com/t5/Software-Tuning-Performance/Measuring-Memory-Bandwidth-of-a-System-MBS/m-p/808356#M827</link>
    <description>&amp;gt;&amp;gt;&amp;gt;Yeah, many utilities raise the priority of the threads doing the memory bw test.
Also, if you are doing a single-threaded memory bw test, it can matter on which cpu you are running.
On some operating systems, cpu 0 for example, can be pretty busy and the mem bw thread won't get as much time as on other cpus.&amp;gt;&amp;gt;&amp;gt;
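
A minimal sketch of the "which cpu" point in the quote above, written in Python and assuming a Linux-style affinity call (os.sched_setaffinity, which is not available on every platform). The helper names pick_cpu and pin_current_process are made up for illustration, and taking the highest-numbered CPU is only one assumed way of staying off cpu 0, not something the quoted post prescribes:

```python
import os


def pick_cpu(allowed):
    """Return the highest-numbered CPU in `allowed`, so CPU 0 is a last resort."""
    return max(allowed)


def pin_current_process():
    """Pin this process to one CPU where the OS supports it; return the CPU or None."""
    if not hasattr(os, "sched_setaffinity"):
        return None  # assumption: platforms without this call are simply left unpinned
    cpu = pick_cpu(os.sched_getaffinity(0))
    os.sched_setaffinity(0, {cpu})  # pid 0 means the calling process
    return cpu


if __name__ == "__main__":
    print("pinned to cpu:", pin_current_process())
```

Calling pin_current_process() once before the timing loop should be enough for a single-threaded test; on a single-CPU machine it falls back to cpu 0.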

If you are on a NUMA system, memory testing gets even more complicated, because NUMA distances are involved in the memory accesses.</description>
    <pubDate>Mon, 31 Dec 2012 20:02:06 GMT</pubDate>
    <dc:creator>Bernard</dc:creator>
    <dc:date>2012-12-31T20:02:06Z</dc:date>
    <item>
      <title>Measuring Memory Bandwidth of a System ( MBS )</title>
      <link>https://community.intel.com/t5/Software-Tuning-Performance/Measuring-Memory-Bandwidth-of-a-System-MBS/m-p/808346#M817</link>
      <description>&lt;P&gt;Measuring the &lt;STRONG&gt;Memory Bandwidth of a System&lt;/STRONG&gt; ( &lt;STRONG&gt;MBS&lt;/STRONG&gt; ) is a tricky task. In my test I wanted to prove that&lt;BR /&gt;&lt;STRONG&gt;MBS&lt;/STRONG&gt; depends on the priority of the application that measures it.&lt;/P&gt;&lt;P&gt;In order to measure &lt;STRONG&gt;MBS&lt;/STRONG&gt; I used a modified Test-Case provided by &lt;STRONG&gt;&lt;SPAN style="text-decoration: underline;"&gt;Patrick Fay (Intel)&lt;/SPAN&gt;&lt;/STRONG&gt; in the thread:&lt;/P&gt;&lt;P&gt; &lt;A href="http://software.intel.com/en-us/forums/showthread.php?t=102690&amp;amp;o=a&amp;amp;s=lr"&gt;http://software.intel.com/en-us/forums/showthread.php?t=102690&amp;amp;o=a&amp;amp;s=lr&lt;/A&gt;&lt;/P&gt;&lt;P&gt;from &lt;STRONG&gt;Post #10&lt;/STRONG&gt;. Please take a look at my data:&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;&lt;SPAN style="text-decoration: underline;"&gt;Process Priority IDLE ( PPI )&lt;/SPAN&gt;&lt;/STRONG&gt;:&lt;/P&gt;&lt;P&gt; Test 01: Memory Bandwidth [ 1647.523 MB/sec 1.609 GB/sec ] Array size: 32 MB&lt;BR /&gt; Test 02: Memory Bandwidth [ 1865.626 MB/sec 1.822 GB/sec ] Array size: 32 MB&lt;BR /&gt; Test 03: Memory Bandwidth [ 1868.982 MB/sec 1.825 GB/sec ] Array size: 32 MB&lt;BR /&gt; Test 04: Memory Bandwidth [ 1868.982 MB/sec 1.825 GB/sec ] Array size: 32 MB&lt;BR /&gt; Test 05: Memory Bandwidth [ 1868.982 MB/sec 1.825 GB/sec ] Array size: 32 MB&lt;BR /&gt; Test 06: Memory Bandwidth [ 1868.982 MB/sec 1.825 GB/sec ] Array size: 32 MB&lt;BR /&gt; Test 07: Memory Bandwidth [ 1875.693 MB/sec 1.832 GB/sec ] Array size: 32 MB&lt;BR /&gt; Test 08: Memory Bandwidth [ 1879.048 MB/sec 1.835 GB/sec ] Array size: 32 MB&lt;BR /&gt; Test 09: Memory Bandwidth [ 1875.693 MB/sec 1.832 GB/sec ] Array size: 32 MB&lt;BR /&gt; Test 10: Memory Bandwidth [ 1879.048 MB/sec 1.835 GB/sec ] Array size: 32 MB&lt;BR /&gt; Test 11: Memory Bandwidth [ 1875.693 MB/sec 1.832 GB/sec ] Array size: 32 MB&lt;BR /&gt; Test 12: Memory 
Bandwidth [ 1879.048 MB/sec 1.835 GB/sec ] Array size: 32 MB&lt;BR /&gt; Test 13: Memory Bandwidth [ 1879.048 MB/sec 1.835 GB/sec ] Array size: 32 MB&lt;BR /&gt; Test 14: Memory Bandwidth [ 1875.693 MB/sec 1.832 GB/sec ] Array size: 32 MB&lt;BR /&gt; Test 15: Memory Bandwidth [ 1879.048 MB/sec 1.835 GB/sec ] Array size: 32 MB&lt;BR /&gt; Test 16: Memory Bandwidth [ 1879.048 MB/sec 1.835 GB/sec ] Array size: 32 MB&lt;/P&gt;&lt;P&gt; &lt;STRONG&gt;Max MBS for PPI&lt;/STRONG&gt;: [ &lt;STRONG&gt;1879.048&lt;/STRONG&gt; MB/sec &lt;STRONG&gt;1.835&lt;/STRONG&gt; GB/sec ]&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;&lt;SPAN style="text-decoration: underline;"&gt;Process Priority NORMAL ( PPN )&lt;/SPAN&gt;&lt;/STRONG&gt;:&lt;/P&gt;&lt;P&gt; Test 01: Memory Bandwidth [ 1858.916 MB/sec 1.815 GB/sec ] Array size: 32 MB&lt;BR /&gt; Test 02: Memory Bandwidth [ 1865.626 MB/sec 1.822 GB/sec ] Array size: 32 MB&lt;BR /&gt; Test 03: Memory Bandwidth [ 1865.626 MB/sec 1.822 GB/sec ] Array size: 32 MB&lt;BR /&gt; Test 04: Memory Bandwidth [ 1865.626 MB/sec 1.822 GB/sec ] Array size: 32 MB&lt;BR /&gt; Test 05: Memory Bandwidth [ 1868.982 MB/sec 1.825 GB/sec ] Array size: 32 MB&lt;BR /&gt; Test 06: Memory Bandwidth [ 1879.048 MB/sec 1.835 GB/sec ] Array size: 32 MB&lt;BR /&gt; Test 07: Memory Bandwidth [ 1879.048 MB/sec 1.835 GB/sec ] Array size: 32 MB&lt;BR /&gt; Test 08: Memory Bandwidth [ 1879.048 MB/sec 1.835 GB/sec ] Array size: 32 MB&lt;BR /&gt; Test 09: Memory Bandwidth [ 1872.337 MB/sec 1.828 GB/sec ] Array size: 32 MB&lt;BR /&gt; Test 10: Memory Bandwidth [ 1879.048 MB/sec 1.835 GB/sec ] Array size: 32 MB&lt;BR /&gt; Test 11: Memory Bandwidth [ 1875.693 MB/sec 1.832 GB/sec ] Array size: 32 MB&lt;BR /&gt; Test 12: Memory Bandwidth [ 1879.048 MB/sec 1.835 GB/sec ] Array size: 32 MB&lt;BR /&gt; Test 13: Memory Bandwidth [ 1879.048 MB/sec 1.835 GB/sec ] Array size: 32 MB&lt;BR /&gt; Test 14: Memory Bandwidth [ 1875.693 MB/sec 1.832 GB/sec ] Array size: 32 MB&lt;BR /&gt; Test 15: Memory 
Bandwidth [ 1879.048 MB/sec 1.835 GB/sec ] Array size: 32 MB&lt;BR /&gt; Test 16: Memory Bandwidth [ 1879.048 MB/sec 1.835 GB/sec ] Array size: 32 MB&lt;/P&gt;&lt;P&gt; &lt;STRONG&gt;Max MBS for PPN&lt;/STRONG&gt;: [ &lt;STRONG&gt;1879.048&lt;/STRONG&gt; MB/sec &lt;STRONG&gt;1.835&lt;/STRONG&gt; GB/sec ]&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;&lt;SPAN style="text-decoration: underline;"&gt;Process Priority HIGH ( PPH )&lt;/SPAN&gt;&lt;/STRONG&gt;:&lt;/P&gt;&lt;P&gt; Test 01: Memory Bandwidth [ 1875.693 MB/sec 1.832 GB/sec ] Array size: 32 MB&lt;BR /&gt; Test 02: Memory Bandwidth [ 1882.404 MB/sec 1.838 GB/sec ] Array size: 32 MB&lt;BR /&gt; Test 03: Memory Bandwidth [ 1882.404 MB/sec 1.838 GB/sec ] Array size: 32 MB&lt;BR /&gt; Test 04: Memory Bandwidth [ 1882.404 MB/sec 1.838 GB/sec ] Array size: 32 MB&lt;BR /&gt; Test 05: Memory Bandwidth [ 1882.404 MB/sec 1.838 GB/sec ] Array size: 32 MB&lt;BR /&gt; Test 06: Memory Bandwidth [ 1882.404 MB/sec 1.838 GB/sec ] Array size: 32 MB&lt;BR /&gt; Test 07: Memory Bandwidth [ 1882.404 MB/sec 1.838 GB/sec ] Array size: 32 MB&lt;BR /&gt; Test 08: Memory Bandwidth [ 1882.404 MB/sec 1.838 GB/sec ] Array size: 32 MB&lt;BR /&gt; Test 09: Memory Bandwidth [ 1879.048 MB/sec 1.835 GB/sec ] Array size: 32 MB&lt;BR /&gt; Test 10: Memory Bandwidth [ 1882.404 MB/sec 1.838 GB/sec ] Array size: 32 MB&lt;BR /&gt; Test 11: Memory Bandwidth [ 1882.404 MB/sec 1.838 GB/sec ] Array size: 32 MB&lt;BR /&gt; Test 12: Memory Bandwidth [ 1885.759 MB/sec 1.842 GB/sec ] Array size: 32 MB&lt;BR /&gt; Test 13: Memory Bandwidth [ 1882.404 MB/sec 1.838 GB/sec ] Array size: 32 MB&lt;BR /&gt; Test 14: Memory Bandwidth [ 1882.404 MB/sec 1.838 GB/sec ] Array size: 32 MB&lt;BR /&gt; Test 15: Memory Bandwidth [ 1882.404 MB/sec 1.838 GB/sec ] Array size: 32 MB&lt;BR /&gt; Test 16: Memory Bandwidth [ 1882.404 MB/sec 1.838 GB/sec ] Array size: 32 MB&lt;/P&gt;&lt;P&gt; &lt;STRONG&gt;Max MBS for PPH&lt;/STRONG&gt;: [ &lt;STRONG&gt;1885.759&lt;/STRONG&gt; MB/sec 
&lt;STRONG&gt;1.842&lt;/STRONG&gt; GB/sec ]&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;&lt;SPAN style="text-decoration: underline;"&gt;Process Priority REALTIME ( PPR )&lt;/SPAN&gt;&lt;/STRONG&gt;:&lt;/P&gt;&lt;P&gt; Test 01: Memory Bandwidth [ 1889.115 MB/sec 1.845 GB/sec ] Array size: 32 MB&lt;BR /&gt; Test 02: Memory Bandwidth [ 1889.115 MB/sec 1.845 GB/sec ] Array size: 32 MB&lt;BR /&gt; Test 03: Memory Bandwidth [ 1885.759 MB/sec 1.842 GB/sec ] Array size: 32 MB&lt;BR /&gt; Test 04: Memory Bandwidth [ 1889.115 MB/sec 1.845 GB/sec ] Array size: 32 MB&lt;BR /&gt; Test 05: Memory Bandwidth [ 1889.115 MB/sec 1.845 GB/sec ] Array size: 32 MB&lt;BR /&gt; Test 06: Memory Bandwidth [ 1889.115 MB/sec 1.845 GB/sec ] Array size: 32 MB&lt;BR /&gt; Test 07: Memory Bandwidth [ 1885.759 MB/sec 1.842 GB/sec ] Array size: 32 MB&lt;BR /&gt; Test 08: Memory Bandwidth [ 1889.115 MB/sec 1.845 GB/sec ] Array size: 32 MB&lt;BR /&gt; Test 09: Memory Bandwidth [ 1889.115 MB/sec 1.845 GB/sec ] Array size: 32 MB&lt;BR /&gt; Test 10: Memory Bandwidth [ 1889.115 MB/sec 1.845 GB/sec ] Array size: 32 MB&lt;BR /&gt; Test 11: Memory Bandwidth [ 1889.115 MB/sec 1.845 GB/sec ] Array size: 32 MB&lt;BR /&gt; Test 12: Memory Bandwidth [ 1889.115 MB/sec 1.845 GB/sec ] Array size: 32 MB&lt;BR /&gt; Test 13: Memory Bandwidth [ 1885.759 MB/sec 1.842 GB/sec ] Array size: 32 MB&lt;BR /&gt; Test 14: Memory Bandwidth [ 1889.115 MB/sec 1.845 GB/sec ] Array size: 32 MB&lt;BR /&gt; Test 15: Memory Bandwidth [ 1889.115 MB/sec 1.845 GB/sec ] Array size: 32 MB&lt;BR /&gt; Test 16: Memory Bandwidth [ 1889.115 MB/sec 1.845 GB/sec ] Array size: 32 MB&lt;/P&gt;&lt;P&gt; &lt;STRONG&gt;Max MBS for PPR&lt;/STRONG&gt;: [ &lt;STRONG&gt;1889.115&lt;/STRONG&gt; MB/sec &lt;STRONG&gt;1.845&lt;/STRONG&gt; GB/sec ]&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;&lt;SPAN style="text-decoration: underline;"&gt;Summary&lt;/SPAN&gt;&lt;/STRONG&gt;:&lt;/P&gt;&lt;P&gt; Max &lt;STRONG&gt;MBS&lt;/STRONG&gt; for &lt;STRONG&gt;PPR&lt;/STRONG&gt;: [ 1889.115 
MB/sec 1.845 GB/sec ] ( 100.00% )&lt;BR /&gt; Max &lt;STRONG&gt;MBS&lt;/STRONG&gt; for &lt;STRONG&gt;PPH&lt;/STRONG&gt;: [ 1885.759 MB/sec 1.842 GB/sec ] ( 99.82% )&lt;BR /&gt; Max &lt;STRONG&gt;MBS&lt;/STRONG&gt; for &lt;STRONG&gt;PPN&lt;/STRONG&gt;: [ 1879.048 MB/sec 1.835 GB/sec ] ( 99.47% )&lt;BR /&gt; Max &lt;STRONG&gt;MBS&lt;/STRONG&gt; for &lt;STRONG&gt;PPI&lt;/STRONG&gt;: [ 1879.048 MB/sec 1.835 GB/sec ] ( 99.47% )&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;&lt;SPAN style="text-decoration: underline;"&gt;Final MBS value&lt;/SPAN&gt;&lt;/STRONG&gt;:&lt;/P&gt;&lt;P&gt; &lt;STRONG&gt;MBS&lt;/STRONG&gt;: &lt;STRONG&gt;1889.115&lt;/STRONG&gt; MB/sec ( &lt;STRONG&gt;1.845&lt;/STRONG&gt; GB/sec ) when the process priority was &lt;STRONG&gt;Realtime&lt;/STRONG&gt;.&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 16 Feb 2012 19:30:06 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Tuning-Performance/Measuring-Memory-Bandwidth-of-a-System-MBS/m-p/808346#M817</guid>
      <dc:creator>SergeyKostrov</dc:creator>
      <dc:date>2012-02-16T19:30:06Z</dc:date>
    </item>
    <item>
      <title>Measuring Memory Bandwidth of a System ( MBS )</title>
      <link>https://community.intel.com/t5/Software-Tuning-Performance/Measuring-Memory-Bandwidth-of-a-System-MBS/m-p/808347#M818</link>
      <description>Hello Sergey,&lt;BR /&gt;Yeah, many utilities raise the priority of the threads doing the memory bw test.&lt;BR /&gt;Also, if you are doing a single-threaded memory bw test, it can matter on which cpu you are running.&lt;BR /&gt;On some operating systems, cpu 0 for example, can be pretty busy and the mem bw thread won't get as much time as on other cpus.&lt;BR /&gt;For some of my utilities I also report the user cpu time for the thread and compare that to the elapsed time.&lt;BR /&gt;Usually this will account for the difference you see above without having to mess with priorities.&lt;BR /&gt;Although you are only seeing a difference of ~0.54% max PPR compared to PPI, I have seen more variation (up to a few percent IIRC).&lt;BR /&gt;Pat</description>
      <pubDate>Thu, 16 Feb 2012 20:02:24 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Tuning-Performance/Measuring-Memory-Bandwidth-of-a-System-MBS/m-p/808347#M818</guid>
      <dc:creator>Patrick_F_Intel1</dc:creator>
      <dc:date>2012-02-16T20:02:24Z</dc:date>
    </item>
    <item>
      <title>Measuring Memory Bandwidth of a System ( MBS )</title>
      <link>https://community.intel.com/t5/Software-Tuning-Performance/Measuring-Memory-Bandwidth-of-a-System-MBS/m-p/808348#M819</link>
      <description>&lt;DIV id="tiny_quote"&gt;&lt;DIV style="margin-left: 2px; margin-right: 2px;"&gt;Hi Patrick,&lt;BR /&gt;&lt;BR /&gt;Quoting &lt;A jquery1329434850125="58" rel="/en-us/services/profile/quick_profile.php?is_paid=&amp;amp;user_id=335837" href="https://community.intel.com/en-us/profile/335837/" class="basic"&gt;Patrick Fay (Intel)&lt;/A&gt;&lt;/DIV&gt;&lt;DIV style="background-color: #e5e5e5; margin-left: 2px; margin-right: 2px; border: 1px inset; padding: 5px;"&gt;...&lt;BR /&gt;&lt;EM&gt;Yeah, many &lt;SPAN style="text-decoration: underline;"&gt;utilities raise the priority of the threads doing the memory bw test&lt;/SPAN&gt;.&lt;BR /&gt;&lt;BR /&gt;&lt;/EM&gt; [&lt;STRONG&gt;SergeyK&lt;/STRONG&gt;] Thanks for the information. I didn't know this.&lt;BR /&gt;&lt;BR /&gt;&lt;EM&gt;Also, if you are doing a &lt;SPAN style="text-decoration: underline;"&gt;single-threaded memory&lt;/SPAN&gt; bw test, it can matter on which cpu you are running.&lt;BR /&gt;On some operating systems, cpu 0 for example, can be pretty busy and the mem bw thread won't get as much time as on other cpus.&lt;BR /&gt;&lt;BR /&gt;&lt;/EM&gt; [&lt;STRONG&gt;SergeyK&lt;/STRONG&gt;] So far I've done tests on a single-CPU computer with an application that has just one thread.&lt;BR /&gt;&lt;BR /&gt;&lt;EM&gt;...&lt;/EM&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;P&gt;&lt;BR /&gt;I've prepared some more data. Please take a look when you have time. I appreciate your comments.&lt;BR /&gt;&lt;BR /&gt;Best regards,&lt;BR /&gt;Sergey&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 16 Feb 2012 23:58:11 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Tuning-Performance/Measuring-Memory-Bandwidth-of-a-System-MBS/m-p/808348#M819</guid>
      <dc:creator>SergeyKostrov</dc:creator>
      <dc:date>2012-02-16T23:58:11Z</dc:date>
    </item>
    <item>
      <title>Measuring Memory Bandwidth of a System ( MBS )</title>
      <link>https://community.intel.com/t5/Software-Tuning-Performance/Measuring-Memory-Bandwidth-of-a-System-MBS/m-p/808349#M820</link>
      <description>&lt;P&gt;I understand that many "variables" affect the &lt;STRONG&gt;MBS&lt;/STRONG&gt; value, and the size of the test &lt;STRONG&gt;Array&lt;/STRONG&gt; is&lt;BR /&gt;one of them.&lt;BR /&gt;&lt;BR /&gt;Please take a look at how the &lt;STRONG&gt;MBS&lt;/STRONG&gt; value changes when the size of the test &lt;STRONG&gt;Array&lt;/STRONG&gt; &lt;SPAN style="text-decoration: underline;"&gt;increases&lt;/SPAN&gt;:&lt;BR /&gt;&lt;STRONG&gt;&lt;BR /&gt;&lt;BR /&gt;Process Priority REALTIME ( PPR ) - Array size  &lt;SPAN style="text-decoration: underline;"&gt;32 MB&lt;/SPAN&gt;&lt;/STRONG&gt;:&lt;BR /&gt; ...&lt;BR /&gt; Test 02: Memory Bandwidth [ 1889.115 MB/sec 1.845 GB/sec ] Array size: 32 MB&lt;BR /&gt; ...&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Process Priority REALTIME ( PPR ) - Array size  &lt;SPAN style="text-decoration: underline;"&gt;64 MB&lt;/SPAN&gt;&lt;/STRONG&gt;:&lt;BR /&gt; ...&lt;BR /&gt; Test 02: Memory Bandwidth [ 1892.470 MB/sec 1.848 GB/sec ] Array size: 64 MB&lt;BR /&gt; ...&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;Process Priority REALTIME ( PPR ) - Array size  &lt;SPAN style="text-decoration: underline;"&gt;128 MB&lt;/SPAN&gt;&lt;/STRONG&gt;:&lt;/P&gt;&lt;P&gt; Test 01: Memory Bandwidth [ 1892.470 MB/sec 1.848 GB/sec ] Array size: 128 MB&lt;BR /&gt; Test 02: Memory Bandwidth [ 1892.470 MB/sec 1.848 GB/sec ] Array size: 128 MB&lt;BR /&gt; Test 03: Memory Bandwidth [ 1892.470 MB/sec 1.848 GB/sec ] Array size: 128 MB&lt;BR /&gt; Test 04: Memory Bandwidth [ 1892.470 MB/sec 1.848 GB/sec ] Array size: 128 MB&lt;BR /&gt; ...&lt;BR /&gt; Test 16: Memory Bandwidth [ 1892.470 MB/sec 1.848 GB/sec ] Array size: 128 MB&lt;BR /&gt;&lt;BR /&gt;&lt;STRONG&gt;Process Priority REALTIME ( PPR ) - Array size  &lt;SPAN style="text-decoration: underline;"&gt;256 MB&lt;/SPAN&gt;&lt;/STRONG&gt;:&lt;/P&gt;&lt;P&gt; Test 01: Memory Bandwidth [ 1905.892 MB/sec 1.861 GB/sec ] Array size: 256 MB&lt;BR /&gt; Test 02: Memory Bandwidth [ 1905.892 MB/sec 1.861 GB/sec ] Array 
size: 256 MB&lt;BR /&gt; Test 03: Memory Bandwidth [ 1905.892 MB/sec 1.861 GB/sec ] Array size: 256 MB&lt;BR /&gt; Test 04: Memory Bandwidth [ 1905.892 MB/sec 1.861 GB/sec ] Array size: 256 MB&lt;BR /&gt; ...&lt;BR /&gt; Test 16: Memory Bandwidth [ 1905.892 MB/sec 1.861 GB/sec ] Array size: 256 MB&lt;BR /&gt;&lt;BR /&gt;&lt;STRONG&gt;Process Priority REALTIME ( PPR ) - Array size &lt;SPAN style="text-decoration: underline;"&gt;512 MB&lt;/SPAN&gt;:&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt; Test 01: Memory Bandwidth [ 1932.735 MB/sec 1.887 GB/sec ] Array size: 512 MB&lt;BR /&gt; Test 02: Memory Bandwidth [ 1932.735 MB/sec 1.887 GB/sec ] Array size: 512 MB&lt;BR /&gt; Test 03: Memory Bandwidth [ 1932.735 MB/sec 1.887 GB/sec ] Array size: 512 MB&lt;BR /&gt; Test 04: Memory Bandwidth [ 1932.735 MB/sec 1.887 GB/sec ] Array size: 512 MB&lt;BR /&gt; ...&lt;BR /&gt; Test 16: Memory Bandwidth [ 1932.735 MB/sec 1.887 GB/sec ] Array size: 512 MB&lt;BR /&gt;&lt;BR /&gt;&lt;STRONG&gt;Process Priority REALTIME ( PPR ) - Array size &lt;SPAN style="text-decoration: underline;"&gt;1024 MB ( 1GB )&lt;/SPAN&gt;:&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt; Test 01: Memory Bandwidth [ 3.821 MB/sec 0.004 GB/sec ] Array size: 1024 MB&lt;BR /&gt; Test 02: Memory Bandwidth [ 5.187 MB/sec 0.005 GB/sec ] Array size: 1024 MB&lt;BR /&gt; Test 03: Memory Bandwidth [ 4.793 MB/sec 0.005 GB/sec ] Array size: 1024 MB&lt;BR /&gt; Test 04: Memory Bandwidth [ 8.948 MB/sec 0.009 GB/sec ] Array size: 1024 MB&lt;BR /&gt; Test 05: Memory Bandwidth [ 9.419 MB/sec 0.009 GB/sec ] Array size: 1024 MB&lt;BR /&gt; Test 06: Memory Bandwidth [ 5.711 MB/sec 0.006 GB/sec ] Array size: 1024 MB&lt;BR /&gt; Test 07: Memory Bandwidth [ 6.279 MB/sec 0.006 GB/sec ] Array size: 1024 MB&lt;BR /&gt; Test 08: Memory Bandwidth [ 5.289 MB/sec 0.005 GB/sec ] Array size: 1024 MB&lt;BR /&gt; Test 09: Memory Bandwidth [ 5.238 MB/sec 0.005 GB/sec ] Array size: 1024 MB&lt;BR /&gt; Test 10: Memory Bandwidth [ 5.423 MB/sec 0.005 GB/sec ] Array size: 
1024 MB&lt;BR /&gt; Test 11: Memory Bandwidth [ 7.206 MB/sec 0.007 GB/sec ] Array size: 1024 MB&lt;BR /&gt; Test 12: Memory Bandwidth [ 7.838 MB/sec 0.008 GB/sec ] Array size: 1024 MB&lt;BR /&gt; Test 13: Memory Bandwidth [ 7.018 MB/sec 0.007 GB/sec ] Array size: 1024 MB&lt;BR /&gt; Test 14: Memory Bandwidth [ 6.628 MB/sec 0.006 GB/sec ] Array size: 1024 MB&lt;BR /&gt; Test 15: Memory Bandwidth [ 6.839 MB/sec 0.007 GB/sec ] Array size: 1024 MB&lt;BR /&gt; Test 16: Memory Bandwidth [ 6.628 MB/sec 0.006 GB/sec ] Array size: 1024 MB&lt;/P&gt;&lt;P&gt; &lt;STRONG&gt;Note&lt;/STRONG&gt;: Performance is significantly affected. A &lt;STRONG&gt;Virtual Memory&lt;/STRONG&gt; ( &lt;STRONG&gt;VM&lt;/STRONG&gt; ) file was used and&lt;BR /&gt; the &lt;STRONG&gt;VM&lt;/STRONG&gt; manager is preempted most of the time.&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;&lt;SPAN style="text-decoration: underline;"&gt;Summary&lt;/SPAN&gt;&lt;/STRONG&gt;:&lt;/P&gt;&lt;P&gt; Max &lt;STRONG&gt;MBS&lt;/STRONG&gt; for &lt;STRONG&gt;PPR&lt;/STRONG&gt;: &lt;SPAN style="text-decoration: underline;"&gt;1932.735&lt;/SPAN&gt; MB/sec ( &lt;SPAN style="text-decoration: underline;"&gt;1.887&lt;/SPAN&gt; GB/sec ) - &lt;STRONG&gt;Array size of &lt;SPAN style="text-decoration: underline;"&gt;512&lt;/SPAN&gt; MB&lt;/STRONG&gt; ( 100.00 % )&lt;BR /&gt; Max &lt;STRONG&gt;MBS&lt;/STRONG&gt; for &lt;STRONG&gt;PPR&lt;/STRONG&gt;: 1905.892 MB/sec ( 1.861 GB/sec ) - &lt;STRONG&gt;Array size of 256 MB&lt;/STRONG&gt; ( 98.61 % )&lt;BR /&gt; Max &lt;STRONG&gt;MBS&lt;/STRONG&gt; for &lt;STRONG&gt;PPR&lt;/STRONG&gt;: 1892.470 MB/sec ( 1.848 GB/sec ) - &lt;STRONG&gt;Array size of 128 MB&lt;/STRONG&gt; ( 97.92 % )&lt;/P&gt;&lt;P&gt; These &lt;STRONG&gt;MBS&lt;/STRONG&gt; values are absolutely reproducible on my system for both &lt;STRONG&gt;Debug&lt;/STRONG&gt; and &lt;STRONG&gt;Release&lt;/STRONG&gt; configurations.&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;&lt;SPAN style="text-decoration: underline;"&gt;Final MBS 
value&lt;/SPAN&gt;&lt;/STRONG&gt;:&lt;/P&gt;&lt;P&gt; &lt;STRONG&gt;1932.735&lt;/STRONG&gt; MB/sec ( &lt;STRONG&gt;1.887&lt;/STRONG&gt; GB/sec ) for &lt;STRONG&gt;Array size of 512 MB&lt;/STRONG&gt; ( the process priority was &lt;STRONG&gt;Realtime&lt;/STRONG&gt; )&lt;/P&gt;</description>
      <pubDate>Thu, 16 Feb 2012 23:58:44 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Tuning-Performance/Measuring-Memory-Bandwidth-of-a-System-MBS/m-p/808349#M820</guid>
      <dc:creator>SergeyKostrov</dc:creator>
      <dc:date>2012-02-16T23:58:44Z</dc:date>
    </item>
    <item>
      <title>Measuring Memory Bandwidth of a System ( MBS )</title>
      <link>https://community.intel.com/t5/Software-Tuning-Performance/Measuring-Memory-Bandwidth-of-a-System-MBS/m-p/808350#M821</link>
      <description>&lt;P&gt;The Test-Case where the size of the &lt;STRONG&gt;Array&lt;/STRONG&gt; is &lt;STRONG&gt;1024&lt;/STRONG&gt; MB ( &lt;STRONG&gt;1&lt;/STRONG&gt; GB ) is the most inaccurate because there are &lt;STRONG&gt;I/O&lt;/STRONG&gt;&lt;BR /&gt;operations with a drive and the &lt;STRONG&gt;Virtual Memory&lt;/STRONG&gt; manager is preempted most of the time. There is some&lt;BR /&gt;improvement when the process priority is changed to &lt;STRONG&gt;NORMAL&lt;/STRONG&gt; from &lt;STRONG&gt;REALTIME&lt;/STRONG&gt;:&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;STRONG&gt;Process Priority REALTIME ( PPR ) - Array size 1024 MB ( 1GB )&lt;/STRONG&gt;:&lt;/P&gt;&lt;P&gt; Test 01: Memory Bandwidth [ 3.821 MB/sec 0.004 GB/sec ] Array size: 1024 MB&lt;BR /&gt; Test 02: Memory Bandwidth [ 5.187 MB/sec 0.005 GB/sec ] Array size: 1024 MB&lt;BR /&gt; Test 03: Memory Bandwidth [ 4.793 MB/sec 0.005 GB/sec ] Array size: 1024 MB&lt;BR /&gt; Test 04: Memory Bandwidth [ 8.948 MB/sec 0.009 GB/sec ] Array size: 1024 MB&lt;BR /&gt; Test 05: Memory Bandwidth [ &lt;SPAN style="text-decoration: underline;"&gt;9.419&lt;/SPAN&gt; MB/sec &lt;SPAN style="text-decoration: underline;"&gt;0.009&lt;/SPAN&gt; GB/sec ] Array size: 1024 MB&lt;BR /&gt; Test 06: Memory Bandwidth [ 5.711 MB/sec 0.006 GB/sec ] Array size: 1024 MB&lt;BR /&gt; Test 07: Memory Bandwidth [ 6.279 MB/sec 0.006 GB/sec ] Array size: 1024 MB&lt;BR /&gt; Test 08: Memory Bandwidth [ 5.289 MB/sec 0.005 GB/sec ] Array size: 1024 MB&lt;BR /&gt; ...&lt;/P&gt;&lt;P&gt; &lt;STRONG&gt;Max MBS for PPR&lt;/STRONG&gt;: [ 9.419 MB/sec 0.009 GB/sec ] - Array size: 1024 MB&lt;/P&gt;&lt;P&gt; &lt;STRONG&gt;Note&lt;/STRONG&gt;: Performance is significantly affected. 
&lt;STRONG&gt;MBS&lt;/STRONG&gt; value is inaccurate.&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;&lt;BR /&gt;&lt;BR /&gt;Process Priority NORMAL ( PPN ) - Array size 1024 MB ( 1GB )&lt;/STRONG&gt;:&lt;/P&gt;&lt;P&gt; Test 01: Memory Bandwidth [ &lt;SPAN style="text-decoration: underline;"&gt;46.684&lt;/SPAN&gt; MB/sec &lt;SPAN style="text-decoration: underline;"&gt;0.046&lt;/SPAN&gt; GB/sec ] Array size: 1024 MB&lt;BR /&gt; Test 02: Memory Bandwidth [ 27.532 MB/sec 0.027 GB/sec ] Array size: 1024 MB&lt;BR /&gt; Test 03: Memory Bandwidth [ 31.581 MB/sec 0.031 GB/sec ] Array size: 1024 MB&lt;BR /&gt; Test 04: Memory Bandwidth [ 28.256 MB/sec 0.028 GB/sec ] Array size: 1024 MB&lt;BR /&gt; Test 05: Memory Bandwidth [ 30.678 MB/sec 0.030 GB/sec ] Array size: 1024 MB&lt;BR /&gt; Test 06: Memory Bandwidth [ 26.189 MB/sec 0.026 GB/sec ] Array size: 1024 MB&lt;BR /&gt; Test 07: Memory Bandwidth [ 27.532 MB/sec 0.027 GB/sec ] Array size: 1024 MB&lt;BR /&gt; Test 08: Memory Bandwidth [ 24.971 MB/sec 0.024 GB/sec ] Array size: 1024 MB&lt;BR /&gt; ...&lt;/P&gt;&lt;P&gt; &lt;STRONG&gt;Max MBS for PPN&lt;/STRONG&gt;: [ 46.684 MB/sec 0.046 GB/sec ] - Array size: 1024 MB&lt;/P&gt;&lt;P&gt; &lt;STRONG&gt;Note&lt;/STRONG&gt;: Performance is less affected. &lt;STRONG&gt;MBS&lt;/STRONG&gt; value is inaccurate.&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;STRONG&gt;&lt;SPAN style="text-decoration: underline;"&gt;Conclusion&lt;/SPAN&gt;&lt;/STRONG&gt;: Based on my tests, a &lt;STRONG&gt;512&lt;/STRONG&gt; MB size for the test &lt;STRONG&gt;Array&lt;/STRONG&gt; gives the most accurate values for &lt;STRONG&gt;MBS&lt;/STRONG&gt;.&lt;/P&gt;</description>
      <pubDate>Fri, 17 Feb 2012 05:28:42 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Tuning-Performance/Measuring-Memory-Bandwidth-of-a-System-MBS/m-p/808350#M821</guid>
      <dc:creator>SergeyKostrov</dc:creator>
      <dc:date>2012-02-17T05:28:42Z</dc:date>
    </item>
    <item>
      <title>Measuring Memory Bandwidth of a System ( MBS )</title>
      <link>https://community.intel.com/t5/Software-Tuning-Performance/Measuring-Memory-Bandwidth-of-a-System-MBS/m-p/808351#M822</link>
      <description>Another "variable" that affects the &lt;STRONG&gt;MBS&lt;/STRONG&gt; value is the &lt;STRONG&gt;C/C++ compiler&lt;/STRONG&gt;. In this test the application was compiled&lt;BR /&gt;by the &lt;STRONG&gt;MinGW C/C++ compiler&lt;/STRONG&gt;:&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;STRONG&gt;Process Priority REALTIME ( PPR ) - Array size 512 MB&lt;/STRONG&gt;:&lt;P&gt;&lt;BR /&gt; Test 01: Memory Bandwidth [ 1986.422 MB/sec 1.940 GB/sec ] Array size: 512 MB&lt;BR /&gt; Test 02: Memory Bandwidth [ 1986.422 MB/sec 1.940 GB/sec ] Array size: 512 MB&lt;BR /&gt; Test 03: Memory Bandwidth [ 1986.422 MB/sec 1.940 GB/sec ] Array size: 512 MB&lt;BR /&gt; Test 04: Memory Bandwidth [ 1986.422 MB/sec 1.940 GB/sec ] Array size: 512 MB&lt;BR /&gt; Test 05: Memory Bandwidth [ 1986.422 MB/sec 1.940 GB/sec ] Array size: 512 MB&lt;BR /&gt; Test 06: Memory Bandwidth [ 1986.422 MB/sec 1.940 GB/sec ] Array size: 512 MB&lt;BR /&gt; Test 07: Memory Bandwidth [ 1986.422 MB/sec 1.940 GB/sec ] Array size: 512 MB&lt;BR /&gt; Test 08: Memory Bandwidth [ 1986.422 MB/sec 1.940 GB/sec ] Array size: 512 MB&lt;BR /&gt; Test 09: Memory Bandwidth [ 1986.422 MB/sec 1.940 GB/sec ] Array size: 512 MB&lt;BR /&gt; Test 10: Memory Bandwidth [ 1986.422 MB/sec 1.940 GB/sec ] Array size: 512 MB&lt;BR /&gt; Test 11: Memory Bandwidth [ 1986.422 MB/sec 1.940 GB/sec ] Array size: 512 MB&lt;BR /&gt; Test 12: Memory Bandwidth [ 1986.422 MB/sec 1.940 GB/sec ] Array size: 512 MB&lt;BR /&gt; Test 13: Memory Bandwidth [ 1986.422 MB/sec 1.940 GB/sec ] Array size: 512 MB&lt;BR /&gt; Test 14: Memory Bandwidth [ 1986.422 MB/sec 1.940 GB/sec ] Array size: 512 MB&lt;BR /&gt; Test 15: Memory Bandwidth [ 1986.422 MB/sec 1.940 GB/sec ] Array size: 512 MB&lt;BR /&gt; Test 16: Memory Bandwidth [ 1986.422 MB/sec 1.940 GB/sec ] Array size: 512 MB&lt;/P&gt;&lt;P&gt; &lt;STRONG&gt;Max MBS for PPR&lt;/STRONG&gt;: [ &lt;SPAN style="text-decoration: underline;"&gt;1986.422&lt;/SPAN&gt; MB/sec &lt;SPAN style="text-decoration: 
underline;"&gt;1.940&lt;/SPAN&gt; GB/sec ] - Array size: 512 MB&lt;BR /&gt;&lt;BR /&gt;&lt;STRONG&gt;&lt;SPAN style="text-decoration: underline;"&gt;Best MBS value&lt;/SPAN&gt;&lt;/STRONG&gt;:&lt;/P&gt;&lt;P&gt; &lt;STRONG&gt;1986.422&lt;/STRONG&gt; MB/sec ( &lt;STRONG&gt;1.940&lt;/STRONG&gt; GB/sec ) for &lt;STRONG&gt;Array size of 512 MB&lt;/STRONG&gt; ( the process priority was &lt;STRONG&gt;Realtime&lt;/STRONG&gt; )&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;&lt;SPAN style="text-decoration: underline;"&gt;Summary&lt;/SPAN&gt;&lt;/STRONG&gt;:&lt;/P&gt;&lt;P&gt; &lt;STRONG&gt;MBS&lt;/STRONG&gt;: &lt;SPAN style="text-decoration: underline;"&gt;1986.422&lt;/SPAN&gt; MB/sec ( &lt;SPAN style="text-decoration: underline;"&gt;1.940&lt;/SPAN&gt; GB/sec ) - Array size of 512 MB ( &lt;STRONG&gt;100.00&lt;/STRONG&gt; % ) - &lt;STRONG&gt;MinGW&lt;/STRONG&gt; - &lt;STRONG&gt;&lt;SPAN style="text-decoration: underline;"&gt;Best Value&lt;/SPAN&gt;&lt;/STRONG&gt;&lt;BR /&gt; &lt;STRONG&gt;MBS&lt;/STRONG&gt;: 1932.735 MB/sec ( 1.887 GB/sec ) - Array size of 512 MB (  &lt;STRONG&gt;97.30&lt;/STRONG&gt; % ) - &lt;STRONG&gt;MSVC&lt;/STRONG&gt;&lt;/P&gt;</description>
      <pubDate>Fri, 17 Feb 2012 05:47:11 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Tuning-Performance/Measuring-Memory-Bandwidth-of-a-System-MBS/m-p/808351#M822</guid>
      <dc:creator>SergeyKostrov</dc:creator>
      <dc:date>2012-02-17T05:47:11Z</dc:date>
    </item>
    <item>
      <title>Measuring Memory Bandwidth of a System ( MBS )</title>
      <link>https://community.intel.com/t5/Software-Tuning-Performance/Measuring-Memory-Bandwidth-of-a-System-MBS/m-p/808352#M823</link>
      <description>&lt;P&gt;Hey Sergey,&lt;BR /&gt;Here is the approach I've adopted after a dozen years of measuring memory bw.&lt;BR /&gt;For my own memory/latency utility, I use assembly code.&lt;BR /&gt;This avoids the variation in performance due to different compilers.&lt;BR /&gt;My utility has to work on all variations of linux, windows, android, etc.&lt;BR /&gt;This approach means I have 4 asm files (32 &amp;amp; 64 bit windows, 32 &amp;amp; 64 bit linux) but I rarely have to change them.&lt;BR /&gt;In general I'm more interested in the relative performance of systems than I am in the absolute best performance.&lt;BR /&gt;That is, internally folks use my utility to compare box1 to box2, and, if the bw is off by much, then they have to start digging to see where the difference is. Or some folks run my utility before each benchmark measurement they take to see if someone has changed the box (changed the DIMMs, or bios settings); usually it is a shared box in a lab.&lt;BR /&gt;My approach usually gets within a few percent of the absolute best performance.&lt;BR /&gt;This works ok for our platform debug, sanity check purposes.&lt;BR /&gt;For external (outside of Intel) purposes, we usually use the stream benchmark. See &lt;A href="http://www.cs.virginia.edu/stream"&gt;http://www.cs.virginia.edu/stream&lt;/A&gt;. &lt;BR /&gt;I don't use stream much but it is pretty much the industry standard for mem bw numbers.&lt;BR /&gt;Roman has a paper on using PCM to dissect stream at &lt;A href="http://software.intel.com/en-us/blogs/2010/11/23/dissecting-stream-benchmark-with-intel-performance-counter-monitor/" target="_blank"&gt;http://software.intel.com/en-us/blogs/2010/11/23/dissecting-stream-benchmark-with-intel-performance-counter-monitor/&lt;/A&gt;&lt;BR /&gt;&lt;BR /&gt;My 'simpler' mem bw tests (a read test, a write test, a latency test) also allow me to test various memory bw related perfmon events. 
&lt;BR /&gt;The read test lets me check demand read miss events.&lt;BR /&gt;The write test lets me check writeback events.&lt;BR /&gt;The latency test can be used to check the latency events.&lt;BR /&gt;Pat&lt;/P&gt;</description>
      <pubDate>Fri, 17 Feb 2012 14:05:37 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Tuning-Performance/Measuring-Memory-Bandwidth-of-a-System-MBS/m-p/808352#M823</guid>
      <dc:creator>Patrick_F_Intel1</dc:creator>
      <dc:date>2012-02-17T14:05:37Z</dc:date>
    </item>
    <item>
      <title>Measuring Memory Bandwidth of a System ( MBS )</title>
      <link>https://community.intel.com/t5/Software-Tuning-Performance/Measuring-Memory-Bandwidth-of-a-System-MBS/m-p/808353#M824</link>
      <description>Hi Patrick,&lt;BR /&gt;&lt;BR /&gt;Thank you and I really appreciate your comments!&lt;BR /&gt;&lt;BR /&gt;Best regards,&lt;BR /&gt;Sergey</description>
      <pubDate>Sat, 18 Feb 2012 00:18:33 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Tuning-Performance/Measuring-Memory-Bandwidth-of-a-System-MBS/m-p/808353#M824</guid>
      <dc:creator>SergeyKostrov</dc:creator>
      <dc:date>2012-02-18T00:18:33Z</dc:date>
    </item>
    <item>
      <title>Measuring Memory Bandwidth of a System ( MBS )</title>
      <link>https://community.intel.com/t5/Software-Tuning-Performance/Measuring-Memory-Bandwidth-of-a-System-MBS/m-p/808354#M825</link>
      <description>Sergey,&lt;BR /&gt;&lt;BR /&gt;If you want to measure bandwidth with more accuracy, you also need to:&lt;BR /&gt;&lt;BR /&gt;- Increase process working set size to be larger than the sum of program+dll size and src/dst/buf size&lt;BR /&gt;- Lock the thread affinity so that thread doesn't get swapped to another core (avoid core 0 and logical cores)&lt;BR /&gt;- Lock pages in memory to disable swapping (this needs a privilege enabled for the admin account)&lt;BR /&gt;- Touch all the pages in src/dst/buf after locking them to eliminate page faults before you start&lt;BR /&gt;&lt;BR /&gt;To maximize the bandwidth, you may consider using a blocking+streaming approach:&lt;BR /&gt;&lt;BR /&gt;1. Read 2KB of data from memory into a buffer (64/128 bytes per loop iteration, movaps, use prefetchnta*)&lt;BR /&gt;2. Stream those 2KB of data to memory (64/128 bytes per loop iteration, movntps)&lt;BR /&gt;3. Repeat until you copy all data&lt;BR /&gt;&lt;BR /&gt;* - Prefetch data using prefetchnta (optimal prefetch distance is best determined by trial and error)&lt;BR /&gt;&lt;BR /&gt;Hope this helps.&lt;BR /&gt;</description>
      <pubDate>Sat, 18 Feb 2012 01:28:07 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Tuning-Performance/Measuring-Memory-Bandwidth-of-a-System-MBS/m-p/808354#M825</guid>
      <dc:creator>levicki</dc:creator>
      <dc:date>2012-02-18T01:28:07Z</dc:date>
    </item>
    <item>
      <title>Measuring Memory Bandwidth of a System ( MBS )</title>
      <link>https://community.intel.com/t5/Software-Tuning-Performance/Measuring-Memory-Bandwidth-of-a-System-MBS/m-p/808355#M826</link>
      <description>&lt;DIV id="tiny_quote"&gt;&lt;DIV style="margin-left: 2px; margin-right: 2px;"&gt;Quoting &lt;A jquery1329588869703="58" rel="/en-us/services/profile/quick_profile.php?is_paid=&amp;amp;user_id=61352" href="https://community.intel.com/en-us/profile/61352/" class="basic"&gt;Igor Levicki&lt;/A&gt;&lt;/DIV&gt;&lt;DIV style="background-color: #e5e5e5; margin-left: 2px; margin-right: 2px; border: 1px inset; padding: 5px;"&gt;&lt;I&gt;...&lt;BR /&gt;- Increase process working set size to be larger than the sum of program+dll size and src/dst/buf size&lt;BR /&gt;- Lock the thread affinity so that thread doesn't get swapped to another core (avoid core 0 and logical cores)&lt;BR /&gt;...&lt;BR /&gt;&lt;/I&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;P&gt;&lt;BR /&gt;These two notes are the most applicable in my case. Thanks, Igor.&lt;/P&gt;</description>
      <pubDate>Sat, 18 Feb 2012 18:16:22 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Tuning-Performance/Measuring-Memory-Bandwidth-of-a-System-MBS/m-p/808355#M826</guid>
      <dc:creator>SergeyKostrov</dc:creator>
      <dc:date>2012-02-18T18:16:22Z</dc:date>
    </item>
    <item>
      <title>&gt;&gt;&gt;Yeah, many utilities raise</title>
      <link>https://community.intel.com/t5/Software-Tuning-Performance/Measuring-Memory-Bandwidth-of-a-System-MBS/m-p/808356#M827</link>
      <description>&amp;gt;&amp;gt;&amp;gt;Yeah, many utilities raise the priority of the threads doing the memory bw test.
Also, if you are doing a single-threaded memory bw test, it can matter on which cpu you are running.
On some operating systems, cpu 0 for example, can be pretty busy and the mem bw thread won't get as much time as on other cpus.&amp;gt;&amp;gt;&amp;gt;

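To illustrate the point about cpu 0, here is a minimal Python sketch of pinning the test to another cpu before measuring. It assumes Linux, where os.sched_getaffinity and os.sched_setaffinity are available; pick_cpu and pin_current_process are made-up helper names, not part of any tool mentioned in this thread.

```python
import os


def pick_cpu(allowed):
    """Pick a cpu to pin to, preferring anything other than cpu 0."""
    candidates = sorted(cpu for cpu in allowed if cpu != 0)
    return candidates[0] if candidates else 0


def pin_current_process():
    """Pin the calling process to one cpu; return that cpu, or None if unsupported."""
    if not hasattr(os, "sched_setaffinity"):
        return None  # assumption: Linux-only API, absent on Windows and macOS
    cpu = pick_cpu(os.sched_getaffinity(0))
    os.sched_setaffinity(0, {cpu})
    return cpu
```

Calling pin_current_process() once at startup, before the timed loop, keeps the scheduler from moving the test between cpus. On Windows the equivalent would go through the Win32 affinity calls instead.
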
If you are on a NUMA system, memory testing gets even more complicated, because the NUMA distance between the cpu running the test and the node that holds the memory affects every access.</description>
      <pubDate>Mon, 31 Dec 2012 20:02:06 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Tuning-Performance/Measuring-Memory-Bandwidth-of-a-System-MBS/m-p/808356#M827</guid>
      <dc:creator>Bernard</dc:creator>
      <dc:date>2012-12-31T20:02:06Z</dc:date>
    </item>
    <item>
      <title>Hi Igor, This is a short</title>
      <link>https://community.intel.com/t5/Software-Tuning-Performance/Measuring-Memory-Bandwidth-of-a-System-MBS/m-p/808357#M828</link>
      <description>Hi Igor, This is a short follow up...

&amp;gt;&amp;gt;...
&amp;gt;&amp;gt;To maximize the bandwidth, you may consider using a blocking+streaming approach:
&amp;gt;&amp;gt;
&amp;gt;&amp;gt;1. Read 2KB of data from memory into a buffer (64/128 bytes per loop iteration, movaps, use prefetchnta*)
&amp;gt;&amp;gt;2. Stream those 2KB of data to memory (64/128 bytes per loop iteration, movntps)

I actually used 4KB steps when prefetching data in each for-loop iteration.

&amp;gt;&amp;gt;3. Repeat until you copy all data
&amp;gt;&amp;gt;
&amp;gt;&amp;gt;* - &lt;STRONG&gt;Prefetch data using prefetchnta&lt;/STRONG&gt; (optimal prefetch distance is best determined by trial and error)

I recently ran a couple of tests with a function that uses streaming SSE instructions to copy memory blocks, and in the best case the performance improvement was &lt;STRONG&gt;~9%&lt;/STRONG&gt; (!). I consider that a very good result: the function significantly outperformed the standard CRT function &lt;STRONG&gt;memcpy&lt;/STRONG&gt;.

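For anyone who wants to try the blocked-copy idea and the MB/sec arithmetic without writing SSE code, here is a minimal Python sketch. It is not the function I tested: Python has no access to movntps or prefetchnta, so this only shows the 4KB block loop and the timing. The names blocked_copy, bandwidth_mb_per_sec and measure are made up for illustration, and 1 MB is assumed to be 2**20 bytes.

```python
import time

BLOCK_SIZE = 4 * 1024          # 4KB per step, as described above
ARRAY_SIZE = 32 * 1024 * 1024  # 32 MB, the array size from the first post


def blocked_copy(src, dst, block_size=BLOCK_SIZE):
    """Copy src into dst one block at a time through memoryview slices."""
    src_view = memoryview(src)
    dst_view = memoryview(dst)
    for offset in range(0, len(src), block_size):
        end = min(offset + block_size, len(src))
        dst_view[offset:end] = src_view[offset:end]


def bandwidth_mb_per_sec(num_bytes, seconds):
    """Convert bytes copied and elapsed seconds into MB/sec (assumes 1 MB = 2**20 bytes)."""
    return num_bytes / (1024 * 1024) / seconds


def measure(copy_func, size=ARRAY_SIZE, repeats=5):
    """Return the best MB/sec over several timed runs of copy_func(src, dst)."""
    src = bytearray(b"\xa5" * size)
    dst = bytearray(size)  # zero-filled, so the pages are touched before timing
    best = 0.0
    for _ in range(repeats):
        start = time.perf_counter()
        copy_func(src, dst)
        elapsed = max(time.perf_counter() - start, 1e-9)  # guard against a zero reading
        best = max(best, bandwidth_mb_per_sec(size, elapsed))
    assert dst == src  # sanity check that the copy really happened
    return best
```

Calling measure(blocked_copy) returns the best MB/sec figure over five runs. Interpreter overhead dominates here, so the numbers are not comparable to the native results in this thread; the sketch is only meant to show the structure of the loop and the calculation.
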
Thanks for your notes.</description>
      <pubDate>Sat, 19 Jan 2013 04:01:18 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Tuning-Performance/Measuring-Memory-Bandwidth-of-a-System-MBS/m-p/808357#M828</guid>
      <dc:creator>SergeyKostrov</dc:creator>
      <dc:date>2013-01-19T04:01:18Z</dc:date>
    </item>
  </channel>
</rss>

