<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Yup -- the word &quot;Xeon&quot; is in Software Tuning, Performance Optimization &amp; Platform Monitoring</title>
    <link>https://community.intel.com/t5/Software-Tuning-Performance/RDPMC-Fast-Mode/m-p/1024028#M4120</link>
    <description>&lt;P&gt;Yup -- the word "Xeon" is somewhat overloaded in the documentation. In this case the giveaway is the reference to "the first 18 performance counters", since none of the more recent processors have that many.&lt;/P&gt;</description>
    <pubDate>Wed, 02 Mar 2016 16:02:20 GMT</pubDate>
    <dc:creator>McCalpinJohn</dc:creator>
    <dc:date>2016-03-02T16:02:20Z</dc:date>
    <item>
      <title>RDPMC Fast Mode</title>
      <link>https://community.intel.com/t5/Software-Tuning-Performance/RDPMC-Fast-Mode/m-p/1024025#M4117</link>
      <description>&lt;P&gt;Hi all,&lt;BR /&gt;
	&lt;SPAN style="font-size: 1em; line-height: 1.5;"&gt;I am currently writing a C++ class which measures performance using the RDPMC instruction.&lt;/SPAN&gt;&lt;/P&gt;

&lt;P&gt;&lt;SPAN style="font-size: 1em; line-height: 1.5;"&gt;Everything works as expected, but I noticed in the manual that some processors support a "fast" mode of the RDPMC instruction (reading only the lower 32 bits of the counter). When I try it on mine (i.e. setting ECX[31]), the code produces a segmentation fault.&lt;/SPAN&gt;&lt;/P&gt;

&lt;P&gt;&lt;SPAN style="font-size: 1em; line-height: 1.5;"&gt;This mode is documented for processors with 40-bit counters, whereas the counters on my machine are 48-bit. The model name of my processor is "Intel(R) Xeon(R) CPU W3580".&lt;/SPAN&gt;&lt;/P&gt;

&lt;P&gt;&lt;SPAN style="font-size: 1em; line-height: 1.5;"&gt;I was wondering whether there is an equivalent of this "fast" mode for other processors and, if not, whether it is possible to reduce the number of cycles this instruction takes (currently ~30 cycles).&lt;/SPAN&gt;&lt;/P&gt;

&lt;P&gt;Thanks,&lt;BR /&gt;
	&lt;SPAN style="font-size: 1em; line-height: 1.5;"&gt;Georgi&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 23 Jul 2015 20:03:22 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Tuning-Performance/RDPMC-Fast-Mode/m-p/1024025#M4117</guid>
      <dc:creator>Georgi_G_</dc:creator>
      <dc:date>2015-07-23T20:03:22Z</dc:date>
    </item>
    <item>
      <title>The "fast mode" for the RDPMC</title>
      <link>https://community.intel.com/t5/Software-Tuning-Performance/RDPMC-Fast-Mode/m-p/1024026#M4118</link>
      <description>&lt;P&gt;The "fast mode" for the RDPMC instruction is only relevant to some old processors -- Pentium 4 era, if I recall correctly.&lt;/P&gt;

&lt;P&gt;The ~30 cycles required for the RDPMC instruction appear to be a minimum imposed by the microcode, which executes about 35 uops for each RDPMC instruction.&lt;/P&gt;
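
&lt;P&gt;For a sense of scale, a cycle count converts to wall-clock time as in the rough sketch below. The function name, the read count, and the 3 GHz clock are illustrative assumptions, not measurements.&lt;/P&gt;

```python
def rdpmc_read_time_ns(num_reads, cycles_per_read, core_ghz):
    """Estimate the elapsed time, in nanoseconds, of back-to-back RDPMC reads.

    All three inputs are assumptions supplied by the caller; nothing is measured.
    """
    total_cycles = num_reads * cycles_per_read
    # A core running at core_ghz GHz completes core_ghz cycles per nanosecond,
    # so dividing the cycle count by core_ghz yields nanoseconds.
    return total_cycles / core_ghz


if __name__ == "__main__":
    # One read at ~30 cycles on a hypothetical 3 GHz core: 10 ns
    print(rdpmc_read_time_ns(1, 30, 3.0))
```

&lt;P&gt;Because the cost is per read, the total grows linearly with the number of counters sampled.&lt;/P&gt;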

&lt;P&gt;Because the RDPMC instruction is not ordered with respect to surrounding instructions, it is not clear that making it execute in fewer cycles would result in more accurate measurements.&lt;/P&gt;

&lt;P&gt;On the other hand, I typically read 11 performance counters at a time (8 programmable plus 3 fixed-function), which adds up to a non-trivial elapsed time -- about 128 ns at 3 GHz -- so I would certainly not mind if the instruction were faster (or if there were an option to read multiple counters with a single instruction).&lt;/P&gt;</description>
      <pubDate>Thu, 23 Jul 2015 20:54:19 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Tuning-Performance/RDPMC-Fast-Mode/m-p/1024026#M4118</guid>
      <dc:creator>McCalpinJohn</dc:creator>
      <dc:date>2015-07-23T20:54:19Z</dc:date>
    </item>
    <item>
      <title>Hi John --</title>
      <link>https://community.intel.com/t5/Software-Tuning-Performance/RDPMC-Fast-Mode/m-p/1024027#M4119</link>
      <description>&lt;P&gt;Hi John --&lt;/P&gt;

&lt;P&gt;Thanks for your warning. Like Georgi, I was tricked by the Intel Reference Manual into thinking that this mode still existed for current Xeon processors:&lt;/P&gt;

&lt;BLOCKQUOTE&gt;
	&lt;P&gt;The Pentium 4 and Intel Xeon processors also support “fast” (32-bit) and “slow” (40-bit) reads on the first 18 performance counters. Selected this option using ECX[31]. If bit 31 is set, RDPMC reads only the low 32 bits of the selected performance counter. If bit 31 is clear, all 40 bits are read. A 32-bit result is returned in EAX and EDX is set to 0. A 32-bit read executes faster on Pentium 4 processors and Intel Xeon processors than a full 40-bit read.&lt;/P&gt;
&lt;/BLOCKQUOTE&gt;
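
&lt;P&gt;Read literally, the quoted passage describes the behaviour sketched below. This is only a minimal software model, assuming 40-bit counters as in the passage; model_rdpmc is a hypothetical helper that never executes the instruction, and it says nothing about later processors.&lt;/P&gt;

```python
def model_rdpmc(ecx, counter_value):
    """Software model of the quoted Pentium 4 era read; returns (eax, edx).

    Hypothetical helper: it only mimics the manual's description and never
    executes RDPMC.
    """
    fast = (ecx // 2**31) % 2 == 1   # ECX[31] set selects the "fast" read
    value = counter_value % 2**40    # counters are 40 bits wide in this model
    eax = value % 2**32              # the low 32 bits always land in EAX
    # A fast read zeroes EDX; a slow read returns the upper 8 bits there.
    edx = 0 if fast else value // 2**32
    return eax, edx


if __name__ == "__main__":
    counter = 0x12_3456_789A             # a made-up 40-bit counter value
    print(model_rdpmc(0, counter))       # slow (40-bit) read
    print(model_rdpmc(2**31, counter))   # fast (32-bit) read
```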

&lt;P&gt;Perhaps someone from Intel could see about making this passage in the manual less of a trap for the unwary?&lt;/P&gt;

</description>
      <pubDate>Wed, 02 Mar 2016 06:26:41 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Tuning-Performance/RDPMC-Fast-Mode/m-p/1024027#M4119</guid>
      <dc:creator>Nathan_K_3</dc:creator>
      <dc:date>2016-03-02T06:26:41Z</dc:date>
    </item>
    <item>
      <title>Yup -- the word "Xeon" is</title>
      <link>https://community.intel.com/t5/Software-Tuning-Performance/RDPMC-Fast-Mode/m-p/1024028#M4120</link>
      <description>&lt;P&gt;Yup -- the word "Xeon" is somewhat overloaded in the documentation. In this case the giveaway is the reference to "the first 18 performance counters", since none of the more recent processors have that many.&lt;/P&gt;</description>
      <pubDate>Wed, 02 Mar 2016 16:02:20 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Tuning-Performance/RDPMC-Fast-Mode/m-p/1024028#M4120</guid>
      <dc:creator>McCalpinJohn</dc:creator>
      <dc:date>2016-03-02T16:02:20Z</dc:date>
    </item>
  </channel>
</rss>

