<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Thanks for your reply. in Software Tuning, Performance Optimization &amp; Platform Monitoring</title>
    <link>https://community.intel.com/t5/Software-Tuning-Performance/How-to-measure-remote-read-or-write-request-of-a-core-using-PMU/m-p/1099831#M5830</link>
    <description>&lt;P&gt;Thanks for your reply.&lt;/P&gt;

&lt;P&gt;Let's put aside the problem of 'remote DRAM write requests of each core'. Can I use the PMU (perhaps the Cbox) to count 'remote DRAM read requests of each core'?&lt;/P&gt;

&lt;P&gt;My previous projects are built on the PMU, but I am not familiar with core-local events and do not know how to use the MSR interface in my existing project. Do you have any material on this that would help me understand and use it quickly?&lt;/P&gt;</description>
    <pubDate>Wed, 22 Feb 2017 01:01:00 GMT</pubDate>
    <dc:creator>Duan_Z_</dc:creator>
    <dc:date>2017-02-22T01:01:00Z</dc:date>
    <item>
      <title>How to measure remote read or write requests of a core using the PMU?</title>
      <link>https://community.intel.com/t5/Software-Tuning-Performance/How-to-measure-remote-read-or-write-request-of-a-core-using-PMU/m-p/1099829#M5828</link>
      <description>&lt;P&gt;I want to measure the number of remote DRAM read or write requests issued by each core.&lt;/P&gt;

&lt;P&gt;I tried to use the Cbox for this, but I could not find a suitable base event.&lt;/P&gt;

&lt;P&gt;However, I found that pmu-tools (https://github.com/andikleen/pmu-tools) has the event "OFFCORE_RESPONSE.ALL_READS.LLC_MISS.REMOTE_DRAM", which can count this per core.&lt;/P&gt;

&lt;P&gt;I then looked up the definition of this event: the event code is "0xB7, 0xBB" and the umask is "0x01". But it does not say which box to use.&lt;/P&gt;

&lt;P&gt;So I checked the Intel® Xeon® Processor E5 and E7 v3 Family Uncore Performance Monitoring Reference Manual, but could not find any event related to it.&lt;/P&gt;

&lt;P&gt;I want to know how to implement this event using the PMU to measure the remote read/write count of each core.&lt;/P&gt;</description>
      <pubDate>Tue, 21 Feb 2017 01:54:39 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Tuning-Performance/How-to-measure-remote-read-or-write-request-of-a-core-using-PMU/m-p/1099829#M5828</guid>
      <dc:creator>Duan_Z_</dc:creator>
      <dc:date>2017-02-21T01:54:39Z</dc:date>
    </item>
    <item>
      <title>Hi,</title>
      <link>https://community.intel.com/t5/Software-Tuning-Performance/How-to-measure-remote-read-or-write-request-of-a-core-using-PMU/m-p/1099830#M5829</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;

&lt;P&gt;The OFFCORE_RESPONSE events are core-local events, so they are not tied to any uncore 'box'. They are counted in the per-core general-purpose counters (MSR_PMC0-7 with config registers MSR_PERFEVTSEL0-7), which you can program through the MSR interface.&lt;BR /&gt;
	The two event codes "0xB7, 0xBB" each have a companion register that holds the filter bitmask (many tools use human-readable names that specify a bitmask): event 0xB7 uses register 0x1A6 and event 0xBB uses register 0x1A7. Which bits can be set and what they mean is architecture-specific and can be found in the SDM or in the matrix_bit_definitions files at &lt;A href="https://download.01.org/perfmon"&gt;https://download.01.org/perfmon&lt;/A&gt;. The matrix_bit_definitions files use the same names as pmu-tools, so it should be easy to retrieve the bitmask that pmu-tools uses.&lt;BR /&gt;
	&lt;BR /&gt;
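	As a minimal sketch of the register values described above (not a definitive implementation): the helper names below are hypothetical, the flag bits follow the architectural IA32_PERFEVTSELx layout in the SDM, 0x186 is assumed as the address of the first config register, and the filter bitmask is left as a caller-supplied value because it is architecture-specific.&lt;BR /&gt;

```python
# Flag bits of IA32_PERFEVTSELx (architectural layout from the Intel SDM).
USR_FLAG = 0x010000     # bit 16: count while in user mode
OS_FLAG = 0x020000      # bit 17: count while in kernel mode
ENABLE_FLAG = 0x400000  # bit 22: enable the counter

# Event code to companion filter register, as stated in the reply above.
OFFCORE_RSP_MSR = {0xB7: 0x1A6, 0xBB: 0x1A7}

# Assumption: address of the config register for counter 0; the config
# registers of the following counters are at consecutive addresses.
PERFEVTSEL0 = 0x186


def perfevtsel_value(event, umask, user=True, kernel=True):
    """Pack event code (bits 0-7) and umask (bits 8-15) with the enable flags."""
    if event not in range(0x100) or umask not in range(0x100):
        raise ValueError("event and umask must each fit in 8 bits")
    value = event | (umask * 0x100) | ENABLE_FLAG
    if user:
        value |= USR_FLAG
    if kernel:
        value |= OS_FLAG
    return value


def plan_offcore_event(event, umask, filter_mask, counter=0):
    """Return the (msr_address, value) writes for one OFFCORE_RESPONSE event.

    filter_mask is the architecture-specific bitmask looked up in the SDM or
    the matrix_bit_definitions files; this sketch does not derive it.
    """
    if event not in OFFCORE_RSP_MSR:
        raise ValueError("OFFCORE_RESPONSE uses event code 0xB7 or 0xBB")
    return [
        (OFFCORE_RSP_MSR[event], filter_mask),                    # filter first
        (PERFEVTSEL0 + counter, perfevtsel_value(event, umask)),  # then enable
    ]


if __name__ == "__main__":
    # 0x0 is a placeholder, NOT a real filter bitmask.
    for address, value in plan_offcore_event(0xB7, 0x01, filter_mask=0x0):
        print(hex(address), hex(value))
```

	The resulting (register, value) pairs would then be written on the target core, for example through the Linux msr driver (/dev/cpu/N/msr). That step is left out here because it needs root privileges and the correct bitmask for your processor.&lt;BR /&gt;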
	You will probably have a problem with 'remote DRAM write requests of each core'. The measurement facility sits between a core's private L2 and the 'ring' interconnect of the Uncore (L3, DRAM, QPI, ...). From that location, cache lines evicted from L3 to DRAM cannot be attributed to a single CPU core. The 'COREWB' (writeback) bit in 0x1A6 or 0x1A7 counts dirty cache lines evicted from L2 to L3, not from L3 to DRAM, so you cannot use it to measure memory bandwidth in general. (This paragraph is loosely cited from an email of Dr. Bandwidth to the PAPI mailing list.)&lt;/P&gt;</description>
      <pubDate>Tue, 21 Feb 2017 17:05:59 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Tuning-Performance/How-to-measure-remote-read-or-write-request-of-a-core-using-PMU/m-p/1099830#M5829</guid>
      <dc:creator>Thomas_G_4</dc:creator>
      <dc:date>2017-02-21T17:05:59Z</dc:date>
    </item>
    <item>
      <title>Thanks for your reply.</title>
      <link>https://community.intel.com/t5/Software-Tuning-Performance/How-to-measure-remote-read-or-write-request-of-a-core-using-PMU/m-p/1099831#M5830</link>
      <description>&lt;P&gt;Thanks for your reply.&lt;/P&gt;

&lt;P&gt;Let's put aside the problem of 'remote DRAM write requests of each core'. Can I use the PMU (perhaps the Cbox) to count 'remote DRAM read requests of each core'?&lt;/P&gt;

&lt;P&gt;My previous projects are built on the PMU, but I am not familiar with core-local events and do not know how to use the MSR interface in my existing project. Do you have any material on this that would help me understand and use it quickly?&lt;/P&gt;</description>
      <pubDate>Wed, 22 Feb 2017 01:01:00 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Tuning-Performance/How-to-measure-remote-read-or-write-request-of-a-core-using-PMU/m-p/1099831#M5830</guid>
      <dc:creator>Duan_Z_</dc:creator>
      <dc:date>2017-02-22T01:01:00Z</dc:date>
    </item>
    <item>
      <title>Thanks a lot,</title>
      <link>https://community.intel.com/t5/Software-Tuning-Performance/How-to-measure-remote-read-or-write-request-of-a-core-using-PMU/m-p/1099832#M5831</link>
      <description>&lt;P&gt;Thanks a lot,&lt;/P&gt;

&lt;P&gt;I already use the memory controllers (on the remote socket) to simulate something for the case where the local socket has just one core active (the others are shut down).&lt;/P&gt;

&lt;P&gt;But when this simulator is extended to the multi-core case, the iMC cannot tell which core a given read or write came from.&lt;/P&gt;

&lt;P&gt;I'm sorry that I did not explain the MSR / core-local issue clearly. I have already set aside and implemented a complete set of MSR interfaces.&lt;/P&gt;

&lt;P&gt;But I don't have a reference sheet for core-local (or MSR) events (I must admit I know only a little about them).&lt;/P&gt;

&lt;P&gt;Can you point me to a reference sheet for MSR or core-local events?&lt;/P&gt;

&lt;P&gt;Thanks for your help.&lt;/P&gt;</description>
      <pubDate>Wed, 22 Feb 2017 09:34:03 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Tuning-Performance/How-to-measure-remote-read-or-write-request-of-a-core-using-PMU/m-p/1099832#M5831</guid>
      <dc:creator>Duan_Z_</dc:creator>
      <dc:date>2017-02-22T09:34:03Z</dc:date>
    </item>
  </channel>
</rss>

