<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Intel Memory Latency Checker gives vastly different memory latency figures for different frequen in Software Tuning, Performance Optimization &amp; Platform Monitoring</title>
    <link>https://community.intel.com/t5/Software-Tuning-Performance/Intel-Memory-Latency-Checker-gives-vastly-different-memory/m-p/1366566#M8025</link>
    <description>&lt;P&gt;Most of the latency in a memory access is in the frequency domains of the core (including private caches) and "uncore" (including shared cache and on-chip ring or mesh).&lt;/P&gt;
&lt;P&gt;An old discussion that is still mostly relevant is at&amp;nbsp;&lt;A href="https://sites.utexas.edu/jdm4372/2011/03/10/memory-latency-components/" target="_blank"&gt;https://sites.utexas.edu/jdm4372/2011/03/10/memory-latency-components/&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;For the server processors it is possible to control the "core" and "uncore" frequency independently, with no restrictions on the ranges of either. With lots of measurements it is sometimes possible to generate an equation that accurately captures the latency in terms of core clocks, uncore clocks, and DRAM clocks.&lt;/P&gt;
&lt;P&gt;For client processors my impression is that the uncore frequency is less easily controlled. According to volume 4 of the Intel Software Developer's Manual, in recent generations of Core processors, MSR 0x394 enables/disables the uncore fixed function (cycle) counter, while MSR 0x395 contains the 44-bit uncore cycle count.&lt;/P&gt;
&lt;P&gt;In some processors (e.g., Sandy Bridge EP, Haswell EP), I observed that with the default settings the uncore frequency would match the highest core frequency. If that is the case on your system, then the numbers are not unreasonable. If we assume that there is one part of the latency that is a fixed number of ns at both frequencies and another part of the latency that is a fixed number of core cycles, then your observations correspond to a fixed latency of 23 ns plus a variable latency of 126 core cycles.&lt;/P&gt;
&lt;P&gt;It is also possible that at the lower frequency the latency increases enough that the DRAM idle page timer activates and closes the DRAM page before the next access. &amp;nbsp;This will increase the DRAM part of the latency by another 12 ns or so, and may partially account for the large difference you are seeing....&lt;/P&gt;</description>
    <pubDate>Mon, 07 Mar 2022 23:56:09 GMT</pubDate>
    <dc:creator>McCalpinJohn</dc:creator>
    <dc:date>2022-03-07T23:56:09Z</dc:date>
    <item>
      <title>Intel Memory Latency Checker gives vastly different memory latency figures for different frequencies</title>
      <link>https://community.intel.com/t5/Software-Tuning-Performance/Intel-Memory-Latency-Checker-gives-vastly-different-memory/m-p/1366270#M8021</link>
      <description>&lt;P&gt;I am using Intel Memory Latency Checker v3.9a to get the idle memory latency for my system. But I have noticed that the tool gives vastly different values for different core clock frequencies. For example, it shows 86 ns idle latency at 2 GHz vs 54.5 ns at 4 GHz. Shouldn't the memory latency remain more or less the same at different CPU frequencies? Am I misunderstanding something?&lt;/P&gt;</description>
      <pubDate>Mon, 07 Mar 2022 07:29:46 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Tuning-Performance/Intel-Memory-Latency-Checker-gives-vastly-different-memory/m-p/1366270#M8021</guid>
      <dc:creator>futurewassomewhere</dc:creator>
      <dc:date>2022-03-07T07:29:46Z</dc:date>
    </item>
    <item>
      <title>Re: Intel Memory Latency Checker gives vastly different memory latency figures for different frequen</title>
      <link>https://community.intel.com/t5/Software-Tuning-Performance/Intel-Memory-Latency-Checker-gives-vastly-different-memory/m-p/1366566#M8025</link>
      <description>&lt;P&gt;Most of the latency in a memory access is in the frequency domains of the core (including private caches) and "uncore" (including shared cache and on-chip ring or mesh).&lt;/P&gt;
&lt;P&gt;An old discussion that is still mostly relevant is at&amp;nbsp;&lt;A href="https://sites.utexas.edu/jdm4372/2011/03/10/memory-latency-components/" target="_blank"&gt;https://sites.utexas.edu/jdm4372/2011/03/10/memory-latency-components/&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;For the server processors it is possible to control the "core" and "uncore" frequency independently, with no restrictions on the ranges of either. With lots of measurements it is sometimes possible to generate an equation that accurately captures the latency in terms of core clocks, uncore clocks, and DRAM clocks.&lt;/P&gt;
&lt;P&gt;For client processors my impression is that the uncore frequency is less easily controlled. According to volume 4 of the Intel Software Developer's Manual, in recent generations of Core processors, MSR 0x394 enables/disables the uncore fixed function (cycle) counter, while MSR 0x395 contains the 44-bit uncore cycle count.&lt;/P&gt;
&lt;P&gt;In some processors (e.g., Sandy Bridge EP, Haswell EP), I observed that with the default settings the uncore frequency would match the highest core frequency. If that is the case on your system, then the numbers are not unreasonable. If we assume that there is one part of the latency that is a fixed number of ns at both frequencies and another part of the latency that is a fixed number of core cycles, then your observations correspond to a fixed latency of 23 ns plus a variable latency of 126 core cycles.&lt;/P&gt;
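&lt;P&gt;The 23 ns and 126 cycle figures come from solving two equations in two unknowns. A minimal Python sketch of that arithmetic, assuming the simple model latency = fixed_ns + cycles / f_GHz; the function name split_latency is only illustrative and is not part of any tool:&lt;/P&gt;

```python
def split_latency(f1_ghz, t1_ns, f2_ghz, t2_ns):
    """Solve t = fixed_ns + cycles / f_ghz from two (frequency, latency) points."""
    # Subtracting the two equations eliminates the fixed part:
    #   t1 - t2 = cycles * (1/f1 - 1/f2)
    cycles = (t1_ns - t2_ns) / (1.0 / f1_ghz - 1.0 / f2_ghz)
    # Substitute back into the first equation to recover the fixed part.
    fixed_ns = t1_ns - cycles / f1_ghz
    return fixed_ns, cycles


# Measurements reported in the question: 86 ns at 2 GHz, 54.5 ns at 4 GHz.
fixed_ns, cycles = split_latency(2.0, 86.0, 4.0, 54.5)
print(f"fixed: {fixed_ns:.1f} ns, variable: {cycles:.0f} core cycles")
```

&lt;P&gt;With only two measurements the fit is exact, so it cannot test whether the model itself is right; a measurement at a third frequency would be needed for that.&lt;/P&gt;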
&lt;P&gt;It is also possible that at the lower frequency the latency increases enough that the DRAM idle page timer activates and closes the DRAM page before the next access. &amp;nbsp;This will increase the DRAM part of the latency by another 12 ns or so, and may partially account for the large difference you are seeing....&lt;/P&gt;</description>
      <pubDate>Mon, 07 Mar 2022 23:56:09 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Tuning-Performance/Intel-Memory-Latency-Checker-gives-vastly-different-memory/m-p/1366566#M8025</guid>
      <dc:creator>McCalpinJohn</dc:creator>
      <dc:date>2022-03-07T23:56:09Z</dc:date>
    </item>
  </channel>
</rss>

