<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Why does _spin_lock has such high CPI in VTune report? in Analyzers</title>
    <link>https://community.intel.com/t5/Analyzers/Why-does-spin-lock-has-such-high-CPI-in-VTune-report/m-p/985606#M10372</link>
    <description>&lt;DIV&gt;Hi Tim,&lt;/DIV&gt;
&lt;DIV&gt;&lt;/DIV&gt;
&lt;DIV&gt;Yeah, it seems a single locked instruction in _spin_lock can cost 70 clock cycles. Averaged over the other instructions in _spin_lock, a CPI of 29 is a reasonable result.&lt;/DIV&gt;
&lt;DIV&gt;&lt;/DIV&gt;
&lt;DIV&gt;Thanks,&lt;/DIV&gt;
&lt;DIV&gt;&lt;/DIV&gt;
&lt;DIV&gt;Liang&lt;/DIV&gt;
&lt;DIV&gt;&lt;/DIV&gt;&lt;P&gt;Message Edited by mfcking@yahoo.com on &lt;SPAN class="date_text"&gt;07-13-2005&lt;/SPAN&gt; &lt;SPAN class="time_text"&gt;10:44 AM&lt;/SPAN&gt;&lt;/P&gt;</description>
    <pubDate>Wed, 13 Jul 2005 04:29:13 GMT</pubDate>
    <dc:creator>mfcking</dc:creator>
    <dc:date>2005-07-13T04:29:13Z</dc:date>
    <item>
      <title>Why does _spin_lock has such high CPI in VTune report?</title>
      <link>https://community.intel.com/t5/Analyzers/Why-does-spin-lock-has-such-high-CPI-in-VTune-report/m-p/985600#M10366</link>
      <description>&lt;DIV&gt;Hello,&lt;/DIV&gt;
&lt;DIV&gt;&lt;/DIV&gt;
&lt;DIV&gt;I used VTune 3.0 to sample the spin lock activities invoked by the e1000 Gigabit driver and the Linux kernel 2.6.12. I found that the CPI of _spin_lock is almost 27, even though _spin_lock has a 100% L2 cache hit rate.&lt;/DIV&gt;
&lt;DIV&gt;&lt;/DIV&gt;
&lt;DIV&gt;I checked the assembly code of _spin_lock in Linux, and it uses the LOCK prefix. According to the IA-32 optimization manual, &lt;FONT size="2"&gt;the LOCK prefix does not lock the FSB once the referenced data is found in the L2 cache of the local CPU. However, it goes on to say that locked instructions are inherently slow, whether the data to be locked is found in the L2 cache or not. &lt;/FONT&gt;&lt;/DIV&gt;
&lt;DIV&gt;&lt;/DIV&gt;
&lt;DIV&gt;I still do not understand what makes the CPI of _spin_lock so high.&lt;/DIV&gt;
&lt;DIV&gt;&lt;/DIV&gt;
&lt;DIV&gt;Thanks a lot,&lt;/DIV&gt;
&lt;DIV&gt;&lt;/DIV&gt;
&lt;DIV&gt;L.Y.&lt;/DIV&gt;
&lt;DIV&gt;&lt;/DIV&gt;
&lt;DIV&gt;&lt;/DIV&gt;
&lt;DIV&gt;&lt;SPAN&gt;&lt;FONT size="1"&gt;_spin_lock code in Linux:&lt;/FONT&gt;&lt;/SPAN&gt;&lt;/DIV&gt;
&lt;DIV&gt;&lt;/DIV&gt;
&lt;DIV&gt;&lt;FONT size="1"&gt;1: lock; decb slp # atomically decrement&lt;/FONT&gt;&lt;/DIV&gt;
&lt;DIV&gt;&lt;FONT size="1"&gt; jns 3f # if the sign bit is clear, jump forward to 3&lt;/FONT&gt;&lt;/DIV&gt;
&lt;DIV&gt;&lt;FONT size="1"&gt;2: cmpb $0,slp # spin: compare to 0&lt;/FONT&gt;&lt;/DIV&gt;
&lt;DIV&gt;&lt;FONT size="1"&gt; pause # spin: wait&lt;/FONT&gt;&lt;/DIV&gt;
&lt;DIV&gt;&lt;FONT size="1"&gt; jle 2b # spin: go back to 2 if &amp;lt;= 0 (locked)&lt;/FONT&gt;&lt;/DIV&gt;
&lt;DIV&gt;&lt;FONT size="1"&gt; jmp 1b # unlocked; go back to 1 to try to lock again&lt;/FONT&gt;&lt;/DIV&gt;
&lt;DIV&gt;&lt;FONT size="1"&gt;3: # we have acquired the lock&lt;/FONT&gt;&lt;/DIV&gt;
&lt;P&gt;&lt;SPAN class="time_text"&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;Message Edited by mfcking@yahoo.com on &lt;SPAN class="date_text"&gt;07-11-2005&lt;/SPAN&gt; &lt;SPAN class="time_text"&gt;03:09 PM&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 12 Jul 2005 01:39:44 GMT</pubDate>
      <guid>https://community.intel.com/t5/Analyzers/Why-does-spin-lock-has-such-high-CPI-in-VTune-report/m-p/985600#M10366</guid>
      <dc:creator>mfcking</dc:creator>
      <dc:date>2005-07-12T01:39:44Z</dc:date>
    </item>
    <item>
      <title>Re: Why does _spin_lock has such high CPI in VTune report?</title>
      <link>https://community.intel.com/t5/Analyzers/Why-does-spin-lock-has-such-high-CPI-in-VTune-report/m-p/985601#M10367</link>
      <description>&lt;DIV&gt;Just curious here, L.Y. Do you have calibration enabled or disabled in your sampling session? If you aren't sure, it's hard to guess because calibration is off by default for some events, and on by default for others. &lt;/DIV&gt;
&lt;DIV&gt;&lt;/DIV&gt;
&lt;DIV&gt;If it is disabled, enable it and report back here what you see and what difference, if any, it makes.&lt;/DIV&gt;
&lt;DIV&gt;&lt;/DIV&gt;
&lt;DIV&gt;cheers&lt;/DIV&gt;
&lt;DIV&gt;&lt;/DIV&gt;
&lt;DIV&gt;jdg&lt;/DIV&gt;
&lt;DIV&gt;&lt;/DIV&gt;
&lt;DIV&gt;For more:&lt;/DIV&gt;
&lt;DIV&gt;&lt;/DIV&gt;
&lt;DIV&gt;$ man sampling&lt;/DIV&gt;
&lt;DIV&gt;&lt;/DIV&gt;
&lt;DIV&gt;But in case this rings a bell, use "-cal yes" to turn calibration on and "-cal no" to turn it off on the command line.&lt;/DIV&gt;
&lt;DIV&gt;&lt;/DIV&gt;</description>
      <pubDate>Tue, 12 Jul 2005 21:50:58 GMT</pubDate>
      <guid>https://community.intel.com/t5/Analyzers/Why-does-spin-lock-has-such-high-CPI-in-VTune-report/m-p/985601#M10367</guid>
      <dc:creator>jeffrey-gallagher</dc:creator>
      <dc:date>2005-07-12T21:50:58Z</dc:date>
    </item>
    <item>
      <title>Re: Why does _spin_lock has such high CPI in VTune report?</title>
      <link>https://community.intel.com/t5/Analyzers/Why-does-spin-lock-has-such-high-CPI-in-VTune-report/m-p/985602#M10368</link>
      <description>&lt;DIV&gt;One more interesting question is whether you are running it on a multi-CPU machine. What about HT?&lt;/DIV&gt;
&lt;DIV&gt;&lt;/DIV&gt;
&lt;DIV&gt;If there is some form of parallelism, two threads accessing the same variable, or even different variables on the same cache line, can cause a large number of L2 cache misses.&lt;/DIV&gt;
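&lt;DIV&gt;&lt;/DIV&gt;
&lt;DIV&gt;To make the "same cache line" case concrete, here is a minimal Python sketch that checks whether two byte addresses land on the same line. The 64-byte line size, the helper names, and the example addresses are all assumptions for illustration only; the real line size depends on the processor.&lt;/DIV&gt;
&lt;PRE&gt;
```python
CACHE_LINE_BYTES = 64  # assumption: line size varies by CPU, check the processor documentation


def cache_line_index(address, line_bytes=CACHE_LINE_BYTES):
    """Return the index of the cache line that holds the given byte address."""
    return address // line_bytes


def share_cache_line(addr_a, addr_b, line_bytes=CACHE_LINE_BYTES):
    """Return True when two byte addresses fall on the same cache line."""
    return cache_line_index(addr_a, line_bytes) == cache_line_index(addr_b, line_bytes)


# Hypothetical addresses: a lock byte and a counter placed 8 bytes after it.
lock_addr = 0x1000
counter_addr = 0x1008
padded_counter_addr = 0x1040  # the same counter moved 64 bytes away from the lock

print(share_cache_line(lock_addr, counter_addr))         # True
print(share_cache_line(lock_addr, padded_counter_addr))  # False
```
&lt;/PRE&gt;
&lt;DIV&gt;If a lock and an unrelated, frequently written variable share a line like this, each CPU's writes can invalidate the other CPU's copy even though the two threads never touch the same variable.&lt;/DIV&gt;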
&lt;DIV&gt;&lt;/DIV&gt;
&lt;DIV&gt;Boaz.&lt;/DIV&gt;</description>
      <pubDate>Tue, 12 Jul 2005 22:37:35 GMT</pubDate>
      <guid>https://community.intel.com/t5/Analyzers/Why-does-spin-lock-has-such-high-CPI-in-VTune-report/m-p/985602#M10368</guid>
      <dc:creator>Boaz_T_Intel</dc:creator>
      <dc:date>2005-07-12T22:37:35Z</dc:date>
    </item>
    <item>
      <title>Re: Why does _spin_lock has such high CPI in VTune report?</title>
      <link>https://community.intel.com/t5/Analyzers/Why-does-spin-lock-has-such-high-CPI-in-VTune-report/m-p/985603#M10369</link>
      <description>&lt;DIV&gt;&lt;/DIV&gt;Yes, I ran my tests on an SMP machine (2 Xeons) with HT disabled.&lt;P&gt;Message Edited by mfcking@yahoo.com on &lt;SPAN class="date_text"&gt;07-12-2005&lt;/SPAN&gt; &lt;SPAN class="time_text"&gt;09:19 AM&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 12 Jul 2005 23:19:26 GMT</pubDate>
      <guid>https://community.intel.com/t5/Analyzers/Why-does-spin-lock-has-such-high-CPI-in-VTune-report/m-p/985603#M10369</guid>
      <dc:creator>mfcking</dc:creator>
      <dc:date>2005-07-12T23:19:26Z</dc:date>
    </item>
    <item>
      <title>Re: Why does _spin_lock has such high CPI in VTune report?</title>
      <link>https://community.intel.com/t5/Analyzers/Why-does-spin-lock-has-such-high-CPI-in-VTune-report/m-p/985604#M10370</link>
      <description>&lt;P&gt;Hi JDG, &lt;/P&gt;
&lt;P&gt;I enabled calibration for all the events and the result is even higher: the CPI is now 29, compared with 27 without calibration. &lt;/P&gt;
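&lt;TABLE&gt;
&lt;TR&gt;&lt;TH&gt;Function&lt;/TH&gt;&lt;TH&gt;Clockticks per Instructions Retired (CPI) (261)&lt;/TH&gt;&lt;TH&gt;2nd-Level Cache Load Hit Rate (261)&lt;/TH&gt;&lt;/TR&gt;
&lt;TR&gt;&lt;TD&gt;"_spin_lock"&lt;/TD&gt;&lt;TD&gt;"29.153"&lt;/TD&gt;&lt;TD&gt;"100.000"&lt;/TD&gt;&lt;/TR&gt;
&lt;/TABLE&gt;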
&lt;P&gt;Thanks, &lt;/P&gt;
&lt;P&gt;L.Y.&lt;/P&gt;
&lt;P&gt;&lt;/P&gt;&lt;P&gt;Message Edited by mfcking@yahoo.com on &lt;SPAN class="date_text"&gt;07-12-2005&lt;/SPAN&gt; &lt;SPAN class="time_text"&gt;01:01 PM&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 13 Jul 2005 02:36:46 GMT</pubDate>
      <guid>https://community.intel.com/t5/Analyzers/Why-does-spin-lock-has-such-high-CPI-in-VTune-report/m-p/985604#M10370</guid>
      <dc:creator>mfcking</dc:creator>
      <dc:date>2005-07-13T02:36:46Z</dc:date>
    </item>
    <item>
      <title>Re: Why does _spin_lock has such high CPI in VTune report?</title>
      <link>https://community.intel.com/t5/Analyzers/Why-does-spin-lock-has-such-high-CPI-in-VTune-report/m-p/985605#M10371</link>
      <description>I'm trying to understand whether you think that high CPI in a spin lock loop is good or bad.  The usual goal would be to have the spin lock spend time as efficiently (issuing as few instructions) as possible, which clearly means a high CPI.  This would be particularly true if the spin lock loop could be competing for resources with another thread, which would be enabled to do useful work with a lower CPI than if it were competing against the spin lock.</description>
      <pubDate>Wed, 13 Jul 2005 03:24:58 GMT</pubDate>
      <guid>https://community.intel.com/t5/Analyzers/Why-does-spin-lock-has-such-high-CPI-in-VTune-report/m-p/985605#M10371</guid>
      <dc:creator>TimP</dc:creator>
      <dc:date>2005-07-13T03:24:58Z</dc:date>
    </item>
    <item>
      <title>Re: Why does _spin_lock has such high CPI in VTune report?</title>
      <link>https://community.intel.com/t5/Analyzers/Why-does-spin-lock-has-such-high-CPI-in-VTune-report/m-p/985606#M10372</link>
      <description>&lt;DIV&gt;Hi Tim,&lt;/DIV&gt;
&lt;DIV&gt;&lt;/DIV&gt;
&lt;DIV&gt;Yeah, it seems a single locked instruction in _spin_lock can cost 70 clock cycles. Averaged over the other instructions in _spin_lock, a CPI of 29 is a reasonable result.&lt;/DIV&gt;
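&lt;DIV&gt;&lt;/DIV&gt;
&lt;DIV&gt;As a rough check of that arithmetic, here is a minimal Python sketch. The 70-cycle figure is the one quoted above; the one-cycle cost for every other instruction is purely an assumption for illustration, and average_cpi is a hypothetical helper, not anything VTune provides.&lt;/DIV&gt;
&lt;PRE&gt;
```python
def average_cpi(cycle_costs):
    """Return clockticks per instruction retired for a list of per-instruction cycle costs."""
    return sum(cycle_costs) / len(cycle_costs)


LOCK_DECB_CYCLES = 70  # figure quoted in this thread for the locked decrement
CHEAP_CYCLES = 1       # assumption: each other instruction costs about one cycle

# Uncontended path only: lock; decb followed by jns.
fast_path = [LOCK_DECB_CYCLES, CHEAP_CYCLES]

# Same path with one more cheap instruction attributed to _spin_lock (hypothetical).
fast_path_plus_one = [LOCK_DECB_CYCLES, CHEAP_CYCLES, CHEAP_CYCLES]

print(average_cpi(fast_path))           # 35.5
print(average_cpi(fast_path_plus_one))  # 24.0
```
&lt;/PRE&gt;
&lt;DIV&gt;Under these assumed costs the measured 29 falls between the two averages, so it looks plausible for a function dominated by one expensive locked instruction.&lt;/DIV&gt;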
&lt;DIV&gt;&lt;/DIV&gt;
&lt;DIV&gt;Thanks,&lt;/DIV&gt;
&lt;DIV&gt;&lt;/DIV&gt;
&lt;DIV&gt;Liang&lt;/DIV&gt;
&lt;DIV&gt;&lt;/DIV&gt;&lt;P&gt;Message Edited by mfcking@yahoo.com on &lt;SPAN class="date_text"&gt;07-13-2005&lt;/SPAN&gt; &lt;SPAN class="time_text"&gt;10:44 AM&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 13 Jul 2005 04:29:13 GMT</pubDate>
      <guid>https://community.intel.com/t5/Analyzers/Why-does-spin-lock-has-such-high-CPI-in-VTune-report/m-p/985606#M10372</guid>
      <dc:creator>mfcking</dc:creator>
      <dc:date>2005-07-13T04:29:13Z</dc:date>
    </item>
  </channel>
</rss>

