<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Yes, in ring 0 you can read in Software Tuning, Performance Optimization &amp; Platform Monitoring</title>
    <link>https://community.intel.com/t5/Software-Tuning-Performance/Reading-programmable-and-fixed-function-performance-counters/m-p/1124608#M6247</link>
    <description>&lt;P&gt;Yes, in ring 0 you can read the core performance counters either using the RDPMC instruction or the RDMSR instruction.&amp;nbsp;&amp;nbsp; I don't know if there is a performance difference between the two approaches.&lt;/P&gt;</description>
    <pubDate>Thu, 05 Jan 2017 17:18:40 GMT</pubDate>
    <dc:creator>McCalpinJohn</dc:creator>
    <dc:date>2017-01-05T17:18:40Z</dc:date>
    <item>
      <title>Reading programmable and fixed-function performance counters</title>
      <link>https://community.intel.com/t5/Software-Tuning-Performance/Reading-programmable-and-fixed-function-performance-counters/m-p/1124601#M6240</link>
      <description>&lt;P&gt;I'm working on a Unix-like x86 operating system and I need to measure the performance of some benchmarks in the system. It turns out that so far the OS does not have any tool to access hardware counters, so the only option I have is to access them directly.&amp;nbsp;&lt;SPAN style="font-size: 1em; line-height: 1.5;"&gt;On Linux I have already used tools like PAPI and perf, but I don't know how they work internally and I am pretty lost in this regard.&lt;/SPAN&gt;&lt;/P&gt;

&lt;P&gt;&lt;SPAN style="font-size: 1em; line-height: 1.5;"&gt;Please correct me if I am wrong:&lt;/SPAN&gt;&lt;/P&gt;

&lt;P&gt;•&lt;SPAN style="font-size: 1em; line-height: 1.5;"&gt;&amp;nbsp;&lt;/SPAN&gt;For fixed-function counters, the rdpmc instruction alone is enough: I just need to set bit 30 of ECX and add the counter number. Here comes my first question: where in the Intel Developer's Manual&lt;SPAN style="font-size: 1em; line-height: 1.5;"&gt; can I find which counter number to use in ECX? I've seen something similar in this post (&lt;/SPAN&gt;&lt;A href="How to read performance counters by rdpmc instruction?" style="font-size: 1em; line-height: 1.5;"&gt;How to read performance counters by rdpmc instruction?&lt;/A&gt;&lt;SPAN style="font-size: 1em; line-height: 1.5;"&gt;) but I would like to access all the counters (0-7, since I have a Sandy Bridge, i7 2600) and know what each one counts.&lt;/SPAN&gt;&lt;/P&gt;

&lt;P&gt;•&amp;nbsp;For programmable counters, as far as I know, the events are divided into architectural and non-architectural: the latter are specific to one microarchitecture, while the former can be supported by several microarchitectures as long as support is reported via CPUID.&lt;/P&gt;

&lt;P&gt;So I need to configure IA32_PERFEVTSELx and set the Event Select and Unit Mask fields.&amp;nbsp;&lt;SPAN style="font-size: 1em; line-height: 1.5;"&gt;Volume 3, Table 18-1, has a list of predefined events that I can use. I believe that for my processor I can also use the events in Table 19-3.&lt;/SPAN&gt;&lt;/P&gt;
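
&lt;P&gt;As a rough sketch of that configuration step (an illustration only, not taken from the manual): the helper below packs the Event Select and Unit Mask fields into an IA32_PERFEVTSELx value, using the architectural LLC Misses event (Event Select 2EH, Unit Mask 41H) as the example. The name perfevtsel and its parameters are hypothetical, and the USR (bit 16), OS (bit 17) and EN (bit 22) flag positions are assumed from the architectural layout, so they should be checked against the SDM for the target processor.&lt;/P&gt;

&lt;PRE&gt;
```python
# Flag bits of the architectural IA32_PERFEVTSELx layout (assumed here;
# verify against the SDM for the target processor).
USR_FLAG = 0x010000  # bit 16: count while running in user mode
OS_FLAG = 0x020000   # bit 17: count while running in kernel mode
EN_FLAG = 0x400000   # bit 22: enable the counter


def perfevtsel(event_select, umask, count_user=True, count_kernel=True, enable=True):
    """Pack Event Select (bits 0-7) and Unit Mask (bits 8-15) plus flag bits."""
    if event_select not in range(256) or umask not in range(256):
        raise ValueError("event_select and umask must each fit in 8 bits")
    # Multiplying by 0x100 moves the unit mask up into bits 8-15.
    value = event_select | (umask * 0x100)
    if count_user:
        value |= USR_FLAG
    if count_kernel:
        value |= OS_FLAG
    if enable:
        value |= EN_FLAG
    return value


# Architectural LLC Misses event: Event Select 2EH, Unit Mask 41H.
llc_misses = perfevtsel(0x2E, 0x41)
print(hex(llc_misses))
```
&lt;/PRE&gt;

&lt;P&gt;The sketch only computes the value; on real hardware it would be written to the chosen IA32_PERFEVTSELx MSR with WRMSR from ring 0.&lt;/P&gt;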

&lt;P&gt;If I use the 'LLC Misses' event for instance, UMask = 41H / Event Select = 2EH. What counter number in ECX should I use in the rdpmc instruction?&lt;/P&gt;</description>
      <pubDate>Tue, 03 Jan 2017 02:12:05 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Tuning-Performance/Reading-programmable-and-fixed-function-performance-counters/m-p/1124601#M6240</guid>
      <dc:creator>Davidson_F_</dc:creator>
      <dc:date>2017-01-03T02:12:05Z</dc:date>
    </item>
    <item>
      <title>If you are only interested in</title>
      <link>https://community.intel.com/t5/Software-Tuning-Performance/Reading-programmable-and-fixed-function-performance-counters/m-p/1124602#M6241</link>
      <description>&lt;P&gt;If you are only interested in counting (in contrast to sampling), PCM might be an alternative for you: &lt;A href="https://software.intel.com/en-us/articles/intel-performance-counter-monitor"&gt;https://software.intel.com/en-us/articles/intel-performance-counter-monitor&lt;/A&gt;&lt;/P&gt;

&lt;P&gt;Since PCM uses an MSR device driver, it might not be too difficult to port it to your OS.&lt;/P&gt;

&lt;P&gt;But even if you don't want to attempt a port, the code might still be useful as reference. For example, the PMUs are configured here: &lt;A href="https://github.com/opcm/pcm/blob/master/cpucounters.cpp#L1747"&gt;https://github.com/opcm/pcm/blob/master/cpucounters.cpp#L1747&lt;/A&gt;&lt;/P&gt;

</description>
      <pubDate>Tue, 03 Jan 2017 15:34:14 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Tuning-Performance/Reading-programmable-and-fixed-function-performance-counters/m-p/1124602#M6241</guid>
      <dc:creator>Thomas_W_Intel</dc:creator>
      <dc:date>2017-01-03T15:34:14Z</dc:date>
    </item>
    <item>
      <title>If you are running in kernel</title>
      <link>https://community.intel.com/t5/Software-Tuning-Performance/Reading-programmable-and-fixed-function-performance-counters/m-p/1124603#M6242</link>
      <description>&lt;P&gt;If you are running in kernel space (e.g., in a loadable kernel module), then you just need to go through the list of MSRs described in Chapter 18 of Volume 3 of the Intel Architectures SW Developer's Manual to enable and program the counters.&amp;nbsp;&amp;nbsp; I recommend starting with Section 18.2 on Architectural Performance Monitoring as an overview before you get into the processor-specific details in Section 18.9.&lt;/P&gt;

&lt;P&gt;If you are running in user space, you need an interface to the kernel that enables you to read/write these MSRs.&amp;nbsp; On Linux systems this is the /dev/cpu/*/msr device driver interface, but the functionality can also be provided by other kernel functions.&amp;nbsp;&amp;nbsp;&amp;nbsp; If you have such an interface, it should be relatively easy to port PCM or Likwid (https://github.com/RRZE-HPC/likwid).&amp;nbsp;&amp;nbsp; If you are running in user mode and don't have an interface to the MSRs, then you will probably need to build your own kernel module (assuming that is supported).&lt;/P&gt;</description>
      <pubDate>Tue, 03 Jan 2017 19:55:07 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Tuning-Performance/Reading-programmable-and-fixed-function-performance-counters/m-p/1124603#M6242</guid>
      <dc:creator>McCalpinJohn</dc:creator>
      <dc:date>2017-01-03T19:55:07Z</dc:date>
    </item>
    <item>
      <title>Thank you Thomas and John,</title>
      <link>https://community.intel.com/t5/Software-Tuning-Performance/Reading-programmable-and-fixed-function-performance-counters/m-p/1124604#M6243</link>
      <description>&lt;P&gt;Thank you Thomas and John,&lt;/P&gt;

&lt;P&gt;Well, I have no interface in the kernel to access the MSRs. Although it is possible to write one, it would take me some time, since besides porting the driver I would also need to port the tool, as you mentioned.&lt;/P&gt;

&lt;P&gt;I have easy access to kernel space and, if needed, I can also write a small interface for user space. My biggest problem is my basic understanding of the counters.&lt;/P&gt;

&lt;P&gt;What I want to do is something like this: &lt;A href="http://stackoverflow.com/questions/22421227/how-many-cache-misses-will-we-have-for-this-simple-program#answer-22421432" target="_blank"&gt;http://stackoverflow.com/questions/22421227/how-many-cache-misses-will-we-have-for-this-simple-program#answer-22421432&lt;/A&gt;&lt;/P&gt;

&lt;P&gt;What I cannot understand is this: when I set up an event, which counter number do I pass to rdpmc to retrieve its value?&lt;/P&gt;

&lt;P&gt;I am looking for this information in Chapters 18 and 19 in the Intel Manual and I cannot find it.&lt;/P&gt;</description>
      <pubDate>Tue, 03 Jan 2017 20:41:26 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Tuning-Performance/Reading-programmable-and-fixed-function-performance-counters/m-p/1124604#M6243</guid>
      <dc:creator>Davidson_F_</dc:creator>
      <dc:date>2017-01-03T20:41:26Z</dc:date>
    </item>
    <item>
      <title>For fixed counters, you can</title>
      <link>https://community.intel.com/t5/Software-Tuning-Performance/Reading-programmable-and-fixed-function-performance-counters/m-p/1124605#M6244</link>
      <description>For fixed counters, you can browse my driver source code, which handles architectures from "old" Core up to recent i7:

&lt;A href="https://github.com/cyring/CoreFreq/blob/master/corefreqk.c" target="_blank"&gt;https://github.com/cyring/CoreFreq/blob/master/corefreqk.c&lt;/A&gt;</description>
      <pubDate>Tue, 03 Jan 2017 22:05:18 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Tuning-Performance/Reading-programmable-and-fixed-function-performance-counters/m-p/1124605#M6244</guid>
      <dc:creator>CyrIng</dc:creator>
      <dc:date>2017-01-03T22:05:18Z</dc:date>
    </item>
    <item>
      <title>There is a 1:1 correspondence</title>
      <link>https://community.intel.com/t5/Software-Tuning-Performance/Reading-programmable-and-fixed-function-performance-counters/m-p/1124606#M6245</link>
      <description>&lt;P&gt;There is a 1:1 correspondence between the IA32_PERFEVTSEL&amp;lt;N&amp;gt; MSRs that control the programmable counters and the IA32_PMC&amp;lt;N&amp;gt; MSRs that contain the counts.&amp;nbsp; The &amp;lt;N&amp;gt; here is the counter number (0-1, 0-3 or 0-7, depending on the processor and whether HyperThreading is enabled), and this same &amp;lt;N&amp;gt; is the value placed in the ECX register prior to executing the RDPMC instruction.&lt;/P&gt;

&lt;BLOCKQUOTE&gt;
	&lt;P&gt;Counter, Control MSR, Count MSR&lt;/P&gt;

	&lt;P&gt;0, 0x186, 0xC1&lt;/P&gt;

	&lt;P&gt;1, 0x187, 0xC2&lt;/P&gt;

	&lt;P&gt;2, 0x188, 0xC3&lt;/P&gt;

	&lt;P&gt;etc....&lt;/P&gt;
&lt;/BLOCKQUOTE&gt;
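
&lt;P&gt;The mapping above can be sketched as a small helper. This is only an illustration: the function names are hypothetical, the default counter counts are assumptions (CPUID leaf 0AH reports the real numbers for a given processor), and the fixed-function selector simply sets bit 30 of ECX as discussed in this thread.&lt;/P&gt;

&lt;PRE&gt;
```python
# Base MSR addresses taken from the table above.
PERFEVTSEL_BASE = 0x186        # IA32_PERFEVTSEL0 (control MSR for counter 0)
PMC_BASE = 0xC1                # IA32_PMC0 (count MSR for counter 0)
RDPMC_FIXED_FLAG = 0x40000000  # bit 30 of ECX selects the fixed-function counters


def programmable_counter(n, num_counters=8):
    """Return the control MSR, count MSR and RDPMC ECX value for counter n.

    num_counters is an assumption; it may be 2, 4 or 8 depending on the CPU.
    """
    if n not in range(num_counters):
        raise ValueError(f"programmable counter {n} is out of range")
    return {
        "control_msr": PERFEVTSEL_BASE + n,
        "count_msr": PMC_BASE + n,
        "rdpmc_ecx": n,
    }


def fixed_counter_ecx(n, num_fixed=3):
    """Return the RDPMC ECX value for fixed-function counter n (bit 30 set).

    num_fixed is an assumption; check CPUID leaf 0AH on the target CPU.
    """
    if n not in range(num_fixed):
        raise ValueError(f"fixed-function counter {n} is out of range")
    return RDPMC_FIXED_FLAG | n


# Reproduce the first rows of the table, then one fixed-function selector.
for n in range(3):
    c = programmable_counter(n)
    print(n, hex(c["control_msr"]), hex(c["count_msr"]), c["rdpmc_ecx"])
print(hex(fixed_counter_ecx(0)))
```
&lt;/PRE&gt;

&lt;P&gt;The sketch only computes addresses and selector values; on real hardware they would be used with WRMSR/RDMSR in ring 0, or loaded into ECX before RDPMC.&lt;/P&gt;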

&lt;P&gt;The trick with setting bit 30 of ECX to access the fixed function counters is probably confusing if you learn about that before you learn the "normal" way of accessing the counters (with ECX set to the programmable counter number).&lt;/P&gt;</description>
      <pubDate>Wed, 04 Jan 2017 14:42:36 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Tuning-Performance/Reading-programmable-and-fixed-function-performance-counters/m-p/1124606#M6245</guid>
      <dc:creator>McCalpinJohn</dc:creator>
      <dc:date>2017-01-04T14:42:36Z</dc:date>
    </item>
    <item>
      <title>Quote:CyrIng wrote:</title>
      <link>https://community.intel.com/t5/Software-Tuning-Performance/Reading-programmable-and-fixed-function-performance-counters/m-p/1124607#M6246</link>
      <description>&lt;P&gt;&lt;/P&gt;&lt;BLOCKQUOTE&gt;CyrIng wrote:&lt;BR /&gt;&lt;P&gt;&lt;/P&gt;

&lt;P&gt;For fixed counters, you can browse my driver source code which handles architectures from "old" Core up to recent i7&lt;BR /&gt;
	&lt;A href="https://github.com/cyring/CoreFreq/blob/master/corefreqk.c" target="_blank"&gt;https://github.com/cyring/CoreFreq/blob/master/corefreqk.c&lt;/A&gt;&lt;/P&gt;

&lt;P&gt;&lt;/P&gt;&lt;/BLOCKQUOTE&gt;&lt;P&gt;&lt;/P&gt;

&lt;P&gt;That's a nice project with well-written, understandable code. I will certainly check it out later.&lt;/P&gt;

&lt;P&gt;&lt;/P&gt;&lt;BLOCKQUOTE&gt;Mccalpin, John wrote:&lt;BR /&gt;&lt;P&gt;&lt;/P&gt;

&lt;P&gt;There is a 1:1 correspondence between the IA32_PERFEVTSEL&amp;lt;N&amp;gt; MSRs that control the programmable counters and the IA32_PMC&amp;lt;N&amp;gt; MSRs that contain the counts. &amp;nbsp;The &amp;lt;N&amp;gt; here is the counter number (0-1, 0-3 or 0-7, depending on the processor and whether HyperThreading is enabled), and this same &amp;lt;N&amp;gt; is the value placed in the ECX register prior to executing the RDPMC instruction.&lt;/P&gt;

&lt;P&gt;Counter, Control MSR, Count MSR&lt;BR /&gt;
	0, 0x186, 0xC1&lt;BR /&gt;
	1, 0x187, 0xC2&lt;BR /&gt;
	2, 0x188, 0xC3&lt;BR /&gt;
	etc....&lt;BR /&gt;
	The trick with setting bit 30 of ECX to access the fixed function counters is probably confusing if you learn about that before you learn the "normal" way of accessing the counters (with ECX set to the programmable counter number).&lt;/P&gt;

&lt;P&gt;&lt;/P&gt;&lt;/BLOCKQUOTE&gt;&lt;P&gt;&lt;/P&gt;

&lt;P&gt;That's exactly what I was looking for. My first mistake was certainly learning about the fixed-function counters before the 'normal' way.&amp;nbsp;&lt;SPAN style="font-size: 1em; line-height: 1.5;"&gt;Now things make sense. This also means that I can read/write values using the MSRs directly instead of using rdpmc, since I am in ring 0.&lt;/SPAN&gt;&lt;/P&gt;

&lt;P&gt;Thank you all!&lt;/P&gt;</description>
      <pubDate>Wed, 04 Jan 2017 23:13:11 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Tuning-Performance/Reading-programmable-and-fixed-function-performance-counters/m-p/1124607#M6246</guid>
      <dc:creator>Davidson_F_</dc:creator>
      <dc:date>2017-01-04T23:13:11Z</dc:date>
    </item>
    <item>
      <title>Yes, in ring 0 you can read</title>
      <link>https://community.intel.com/t5/Software-Tuning-Performance/Reading-programmable-and-fixed-function-performance-counters/m-p/1124608#M6247</link>
      <description>&lt;P&gt;Yes, in ring 0 you can read the core performance counters either using the RDPMC instruction or the RDMSR instruction.&amp;nbsp;&amp;nbsp; I don't know if there is a performance difference between the two approaches.&lt;/P&gt;</description>
      <pubDate>Thu, 05 Jan 2017 17:18:40 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Tuning-Performance/Reading-programmable-and-fixed-function-performance-counters/m-p/1124608#M6247</guid>
      <dc:creator>McCalpinJohn</dc:creator>
      <dc:date>2017-01-05T17:18:40Z</dc:date>
    </item>
    <item>
      <title>You might also like to take a</title>
      <link>https://community.intel.com/t5/Software-Tuning-Performance/Reading-programmable-and-fixed-function-performance-counters/m-p/1124609#M6248</link>
      <description>&lt;P&gt;You might also like to take a look at &lt;A href="https://github.com/obilaniu/libpfc"&gt;https://github.com/obilaniu/libpfc&lt;/A&gt; which is a small kernel module and userspace library for &lt;EM&gt;just&lt;/EM&gt; doing what you want: reading performance counters.&lt;/P&gt;

&lt;P&gt;It works out of the box, but it would be a great target for porting to another architecture, since most of the details are about handling the x86-specific stuff to enable/read the counters.&lt;/P&gt;</description>
      <pubDate>Tue, 31 Jan 2017 22:12:16 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Tuning-Performance/Reading-programmable-and-fixed-function-performance-counters/m-p/1124609#M6248</guid>
      <dc:creator>Travis_D_</dc:creator>
      <dc:date>2017-01-31T22:12:16Z</dc:date>
    </item>
    <item>
      <title>Quote:Travis D. wrote:</title>
      <link>https://community.intel.com/t5/Software-Tuning-Performance/Reading-programmable-and-fixed-function-performance-counters/m-p/1124610#M6249</link>
      <description>&lt;P&gt;&lt;/P&gt;&lt;BLOCKQUOTE&gt;Travis D. wrote:&lt;BR /&gt;&lt;P&gt;&lt;/P&gt;

&lt;P&gt;You might also like to take a look at &lt;A href="https://github.com/obilaniu/libpfc" rel="nofollow"&gt;https://github.com/obilaniu/libpfc&lt;/A&gt; which is a small kernel module and userspace library for &lt;EM&gt;just&lt;/EM&gt; doing what you want: reading performance counters.&lt;/P&gt;

&lt;P&gt;It works out of the box, but it would be a great target for porting to another architecture, since most of the details are about handling the x86-specific stuff to enable/read the counters.&lt;/P&gt;

&lt;P&gt;&lt;/P&gt;&lt;/BLOCKQUOTE&gt;&lt;P&gt;&lt;/P&gt;

&lt;P&gt;Hello Travis D,&lt;/P&gt;

&lt;P&gt;I already knew this repository; in fact, it's very interesting and simple to understand.&amp;nbsp;&lt;SPAN style="font-size: 1em; line-height: 1.5;"&gt;I believe it would not be difficult to port it to other systems, and it would be interesting if the author made it more robust, since it assumes a lot of things. Even so, it's a great way to understand PMCs.&lt;/SPAN&gt;&lt;/P&gt;

&lt;P&gt;Currently I am comparing this kernel module with my implementation, since I am having issues with the 'LLC Misses' event: I always get the value 0, no matter what benchmark I use, whereas the other events give correct values.&lt;/P&gt;</description>
      <pubDate>Wed, 01 Feb 2017 00:13:21 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Tuning-Performance/Reading-programmable-and-fixed-function-performance-counters/m-p/1124610#M6249</guid>
      <dc:creator>Davidson_F_</dc:creator>
      <dc:date>2017-02-01T00:13:21Z</dc:date>
    </item>
    <item>
      <title>[quote=Davidson F.]</title>
      <link>https://community.intel.com/t5/Software-Tuning-Performance/Reading-programmable-and-fixed-function-performance-counters/m-p/1124611#M6250</link>
      <description>&lt;P&gt;&lt;/P&gt;&lt;BLOCKQUOTE&gt;Davidson F. wrote:&lt;BR /&gt;&lt;P&gt;&lt;/P&gt;

&lt;P&gt;Hello Travis D,&lt;/P&gt;

&lt;P&gt;I already knew this repository, in fact it's very interesting and simple to understand.&amp;nbsp;I believe it's not difficult to port it to other systems and It would be interesting if the author made it more robust since it's assuming a lot of things, even so, it's a great way to understand PMC's.&lt;/P&gt;

&lt;P&gt;Currently I am comparing this kernel module with my implementation since I am having some issues regards the event 'LLC Misses', that I always get the value 0, no matter what benchmark I use,&amp;nbsp;in the other events I get correct values.&lt;/P&gt;

&lt;P&gt;&lt;/P&gt;&lt;/BLOCKQUOTE&gt;&lt;P&gt;&lt;/P&gt;

&lt;P&gt;If you have any changes that can make the library more robust, I'm sure the author would be happy to pull them. It is, after all, free and open source software.&lt;/P&gt;</description>
      <pubDate>Wed, 10 May 2017 23:15:00 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Tuning-Performance/Reading-programmable-and-fixed-function-performance-counters/m-p/1124611#M6250</guid>
      <dc:creator>Travis_D_</dc:creator>
      <dc:date>2017-05-10T23:15:00Z</dc:date>
    </item>
    <item>
      <title>Just to update, I have not</title>
      <link>https://community.intel.com/t5/Software-Tuning-Performance/Reading-programmable-and-fixed-function-performance-counters/m-p/1124612#M6251</link>
      <description>&lt;P&gt;Just to update: I have not managed to find out why LLC Misses does not work on my processor. It works fine on other processors, so I just used another one.&lt;/P&gt;

&lt;P&gt;&lt;SPAN style="font-size: 1em; line-height: 1.5;"&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;BLOCKQUOTE&gt;Travis D. wrote:&lt;BR /&gt;&lt;P&gt;&lt;/P&gt;

&lt;P&gt;&lt;SPAN style="font-size: 1em; line-height: 1.5;"&gt;If you have any changes that can make the library more robust, I'm sure the author would be happy to pull them. It is, after all, free and open source software.&lt;/SPAN&gt;&lt;/P&gt;

&lt;P&gt;&lt;/P&gt;&lt;/BLOCKQUOTE&gt;&lt;P&gt;&lt;/P&gt;

&lt;P&gt;Regarding libpfc, I have not made any changes to the library yet, but when I do I will certainly send a PR to the author =).&lt;/P&gt;</description>
      <pubDate>Thu, 11 May 2017 00:41:34 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Tuning-Performance/Reading-programmable-and-fixed-function-performance-counters/m-p/1124612#M6251</guid>
      <dc:creator>Davidson_F_</dc:creator>
      <dc:date>2017-05-11T00:41:34Z</dc:date>
    </item>
  </channel>
</rss>

