<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic What parameters did you pass in Software Tuning, Performance Optimization &amp; Platform Monitoring</title>
    <link>https://community.intel.com/t5/Software-Tuning-Performance/why-simplest-code-can-cause-high-cache-references-mem-stores/m-p/1130717#M6403</link>
    <description>&lt;P&gt;What parameters did you pass to "perf stat"?&lt;/P&gt;&lt;P&gt;The cache-related events could be caused by code executed before and/or after the infinite loop that *appears* in the source code. This may include the C/C++ runtime initialization code.&lt;/P&gt;&lt;P&gt;You can remove the infinite loop and see how the event counts change. It may be useful to inspect the generated assembly code either way.&lt;/P&gt;&lt;P&gt;Note that the standard deviation is so high that the event counts are mostly useless. This is probably because the events are counted over a (short) period of time with highly nondeterministic dynamic behavior.&lt;/P&gt;</description>
    <pubDate>Mon, 22 Apr 2019 14:50:44 GMT</pubDate>
    <dc:creator>HadiBrais</dc:creator>
    <dc:date>2019-04-22T14:50:44Z</dc:date>
    <item>
      <title>Why can the simplest code cause high cache-references and mem-stores counts under perf stat?</title>
      <link>https://community.intel.com/t5/Software-Tuning-Performance/why-simplest-code-can-cause-high-cache-references-mem-stores/m-p/1130716#M6402</link>
      <description>&lt;P&gt;Hi everyone,&lt;/P&gt;&lt;P&gt;I wrote a very simple test program containing an &lt;STRONG&gt;infinite loop&lt;/STRONG&gt;, then ran it pinned to a specific core on a Linux server with a Xeon 6261 CPU.&lt;/P&gt;&lt;P&gt;Test code:&lt;/P&gt;&lt;P&gt;&lt;EM&gt;int main(int ac, char **av)&lt;BR /&gt;{&lt;/EM&gt;&lt;/P&gt;&lt;P&gt;&lt;EM&gt;loop:&lt;BR /&gt;&amp;nbsp; &amp;nbsp; goto loop;&lt;/EM&gt;&lt;/P&gt;&lt;P&gt;&lt;EM&gt;&amp;nbsp; &amp;nbsp; return 0;&lt;BR /&gt;}&lt;/EM&gt;&lt;/P&gt;&lt;P&gt;Then I used &lt;STRONG&gt;perf stat&lt;/STRONG&gt; to observe its performance, and the results surprised me:&lt;/P&gt;&lt;P&gt;&amp;nbsp; &amp;nbsp; 15.006456113 &amp;nbsp; &amp;nbsp; 3,180,914,057 &amp;nbsp; &amp;nbsp; cycles &amp;nbsp; &amp;nbsp; (39.99%)&lt;BR /&gt;&amp;nbsp; &amp;nbsp; 15.006456113 &amp;nbsp; &amp;nbsp; 3,171,163,384 &amp;nbsp; &amp;nbsp; instructions &amp;nbsp; &amp;nbsp; # 1.00 insn per cycle &amp;nbsp; &amp;nbsp; (49.99%)&lt;BR /&gt;&amp;nbsp; &amp;nbsp; 15.006456113 &amp;nbsp; &amp;nbsp; 7,911 &amp;nbsp; &amp;nbsp; cache-misses &amp;nbsp; &amp;nbsp; # 23.217 % of all cache refs &amp;nbsp; &amp;nbsp; (49.99%)&lt;BR /&gt;&amp;nbsp; &amp;nbsp; 15.006456113 &amp;nbsp; &amp;nbsp; 33,128 &amp;nbsp; &amp;nbsp; cache-references &amp;nbsp; &amp;nbsp; (50.00%)&lt;BR /&gt;&amp;nbsp; &amp;nbsp; 15.006456113 &amp;nbsp; &amp;nbsp; 1,411 &amp;nbsp; &amp;nbsp; LLC-load-misses &amp;nbsp; &amp;nbsp; # 14.92% of all LL-cache hits &amp;nbsp; &amp;nbsp; (50.01%)&lt;BR /&gt;&amp;nbsp; &amp;nbsp; 15.006456113 &amp;nbsp; &amp;nbsp; 9,730 &amp;nbsp; &amp;nbsp; LLC-loads &amp;nbsp; &amp;nbsp; (50.01%)&lt;BR /&gt;&amp;nbsp; &amp;nbsp; 15.006456113 &amp;nbsp; &amp;nbsp; 225 &amp;nbsp; &amp;nbsp; LLC-store-misses &amp;nbsp; &amp;nbsp; (20.00%)&lt;BR /&gt;&amp;nbsp; &amp;nbsp; 15.006456113 &amp;nbsp; &amp;nbsp; 1,573 &amp;nbsp; &amp;nbsp; LLC-store &amp;nbsp; &amp;nbsp; (20.00%)&lt;BR /&gt;&amp;nbsp; &amp;nbsp; 15.006456113 &amp;nbsp; &amp;nbsp; 0 &amp;nbsp; &amp;nbsp; mem-loads &amp;nbsp; &amp;nbsp; (29.99%)&lt;BR /&gt;&amp;nbsp; &amp;nbsp; 15.006456113 &amp;nbsp; &amp;nbsp; 678,645 &amp;nbsp; &amp;nbsp; mem-stores &amp;nbsp; &amp;nbsp; (39.99%)&lt;/P&gt;&lt;P&gt;I am very confused by the large values for the cache, LLC, and mem events, especially the &lt;STRONG&gt;678,645&lt;/STRONG&gt; &lt;EM&gt;mem-stores&lt;/EM&gt;.&lt;/P&gt;&lt;P&gt;Thanks and best regards.&lt;/P&gt;</description>
      <pubDate>Fri, 19 Apr 2019 08:02:38 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Tuning-Performance/why-simplest-code-can-cause-high-cache-references-mem-stores/m-p/1130716#M6402</guid>
      <dc:creator>liang__zhang</dc:creator>
      <dc:date>2019-04-19T08:02:38Z</dc:date>
    </item>
    <item>
      <title>What parameters did you pass</title>
      <link>https://community.intel.com/t5/Software-Tuning-Performance/why-simplest-code-can-cause-high-cache-references-mem-stores/m-p/1130717#M6403</link>
      <description>&lt;P&gt;What parameters did you pass to "perf stat"?&lt;/P&gt;&lt;P&gt;The cache-related events could be caused by code executed before and/or after the infinite loop that *appears* in the source code. This may include the C/C++ runtime initialization code.&lt;/P&gt;&lt;P&gt;You can remove the infinite loop and see how the event counts change. It may be useful to inspect the generated assembly code either way.&lt;/P&gt;&lt;P&gt;Note that the standard deviation is so high that the event counts are mostly useless. This is probably because the events are counted over a (short) period of time with highly nondeterministic dynamic behavior.&lt;/P&gt;</description>
      <pubDate>Mon, 22 Apr 2019 14:50:44 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Tuning-Performance/why-simplest-code-can-cause-high-cache-references-mem-stores/m-p/1130717#M6403</guid>
      <dc:creator>HadiBrais</dc:creator>
      <dc:date>2019-04-22T14:50:44Z</dc:date>
    </item>
    <item>
      <title>It is very important to be</title>
      <link>https://community.intel.com/t5/Software-Tuning-Performance/why-simplest-code-can-cause-high-cache-references-mem-stores/m-p/1130718#M6404</link>
      <description>&lt;P&gt;It is very important to be precise when asking questions in these forums....&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;I don't know what a "Xeon 6261 CPU" is.... Can you get the correct model number from the output of either "lscpu" or "cat /proc/cpuinfo"?&lt;/LI&gt;&lt;LI&gt;The specific command used to launch "perf stat" is critical -- please include both the command used to invoke "perf stat" and the full output.&lt;UL&gt;&lt;LI&gt;It is always a good idea to avoid counter multiplexing when you are getting started with performance counters. You can usually collect 4 core performance counters without multiplexing.&lt;/LI&gt;&lt;/UL&gt;&lt;/LI&gt;&lt;LI&gt;Some features of "perf stat" vary from one OS release to another. The output of "uname -a" is usually sufficient to identify the release.&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;It is also important to understand that your operating system is doing a lot of "stuff" in the background. Some of that "stuff" leaks into the counts for the user program. In your output, cache-misses, cache-references, LLC-load-misses, LLC-loads, LLC-store-misses, and LLC-store are all extremely small -- even the largest is roughly 100,000x smaller than the cycle or instruction counts. Operating system interference at this level is (for practical purposes) unavoidable and negligible. The value for mem-stores is rather high, but without precise details of what you are counting and how you are counting it, trying to understand it would just be a waste of time....&lt;/P&gt;</description>
      <pubDate>Mon, 22 Apr 2019 21:46:06 GMT</pubDate>
      <guid>https://community.intel.com/t5/Software-Tuning-Performance/why-simplest-code-can-cause-high-cache-references-mem-stores/m-p/1130718#M6404</guid>
      <dc:creator>McCalpinJohn</dc:creator>
      <dc:date>2019-04-22T21:46:06Z</dc:date>
    </item>
  </channel>
</rss>

