<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic No they should not overlap. in Analyzers</title>
    <link>https://community.intel.com/t5/Analyzers/Frontend-Bound-and-Branch-Mispredicts-will-overlap/m-p/1163633#M17803</link>
    <description>&lt;P&gt;No, they should not overlap. Frontend Bound + Bad Speculation + Backend Bound + Retiring will still add up to 100%. Some of the stalled pipeline slots will simply be classified under Frontend Bound instead of Bad Speculation, but the "Branch Resteers" node under Frontend Bound will hint that the underlying cause is branch mispredicts.&lt;/P&gt;</description>
    <pubDate>Tue, 03 Apr 2018 07:13:37 GMT</pubDate>
    <dc:creator>Dmitry_R_Intel1</dc:creator>
    <dc:date>2018-04-03T07:13:37Z</dc:date>
    <item>
      <title>Frontend Bound and Branch Mispredicts will overlap?</title>
      <link>https://community.intel.com/t5/Analyzers/Frontend-Bound-and-Branch-Mispredicts-will-overlap/m-p/1163630#M17800</link>
      <description>&lt;P&gt;Hi all,&lt;/P&gt;

&lt;P&gt;If a program has many branches and branch prediction is very poor, will it show a large percentage in both Frontend Bound and Branch Mispredicts?&lt;/P&gt;

&lt;P&gt;I ask because Frontend Bound counts slots in which no uops were delivered to the RAT, while Bad Speculation counts recovery cycles. A program with many mispredicted branches will leave many slots unused and will also accumulate many recovery cycles. Will the two overlap?&lt;/P&gt;

&lt;P&gt;Thank you:)&lt;/P&gt;</description>
      <pubDate>Mon, 02 Apr 2018 07:33:55 GMT</pubDate>
      <guid>https://community.intel.com/t5/Analyzers/Frontend-Bound-and-Branch-Mispredicts-will-overlap/m-p/1163630#M17800</guid>
      <dc:creator>Zhiwei_C_</dc:creator>
      <dc:date>2018-04-02T07:33:55Z</dc:date>
    </item>
    <item>
      <title>Hi,</title>
      <link>https://community.intel.com/t5/Analyzers/Frontend-Bound-and-Branch-Mispredicts-will-overlap/m-p/1163631#M17801</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;

&lt;P&gt;Yes, the Frontend Bound metric can be quite high for code with many mispredicted branches.&lt;/P&gt;

&lt;P&gt;Specifically, VTune has a "Branch Resteers" metric under Frontend Bound to account for this. Let me copy and paste the metric description:&lt;/P&gt;

&lt;P&gt;"... Branch Resteers estimates the Frontend delay in fetching operations from corrected path, following all sorts of mispredicted branches. For example, branchy code with lots of mispredictions might get categorized under Branch Resteers...."&lt;/P&gt;</description>
      <pubDate>Mon, 02 Apr 2018 10:46:44 GMT</pubDate>
      <guid>https://community.intel.com/t5/Analyzers/Frontend-Bound-and-Branch-Mispredicts-will-overlap/m-p/1163631#M17801</guid>
      <dc:creator>Dmitry_R_Intel1</dc:creator>
      <dc:date>2018-04-02T10:46:44Z</dc:date>
    </item>
    <item>
      <title>Quote:Dmitry Ryabtsev (Intel)</title>
      <link>https://community.intel.com/t5/Analyzers/Frontend-Bound-and-Branch-Mispredicts-will-overlap/m-p/1163632#M17802</link>
      <description>&lt;P&gt;&lt;/P&gt;&lt;BLOCKQUOTE&gt;Dmitry Ryabtsev (Intel) wrote:&lt;BR /&gt;&lt;P&gt;&lt;/P&gt;

&lt;P&gt;Hi,&lt;/P&gt;

&lt;P&gt;Yes, the Frontend Bound metric can be quite high for code with many mispredicted branches.&lt;/P&gt;

&lt;P&gt;Specifically, VTune has a "Branch Resteers" metric under Frontend Bound to account for this. Let me copy and paste the metric description:&lt;/P&gt;

&lt;P&gt;"... Branch Resteers estimates the Frontend delay in fetching operations from corrected path, following all sorts of mispredicted branches. For example, branchy code with lots of mispredictions might get categorized under Branch Resteers...."&lt;/P&gt;

&lt;P&gt;&lt;/P&gt;&lt;/BLOCKQUOTE&gt;&lt;P&gt;&lt;/P&gt;

&lt;P&gt;As far as I know, Frontend Bound + Bad Speculation + Backend Bound + Retiring = 100%. If Frontend Bound and Bad Speculation overlap, won't the other percentages be squeezed too small to be reliable?&lt;/P&gt;

&lt;P&gt;Frontend Bound = IDQ_UOPS_NOT_DELIVERED.CORE / (4 * CPU_CLK_UNHALTED.THREAD)&lt;/P&gt;

&lt;P&gt;Bad Speculation = (UOPS_ISSUED.ANY - UOPS_RETIRED.RETIRE_SLOTS + 4 * INT_MISC.RECOVERY_CYCLES) / SLOTS&lt;/P&gt;
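
&lt;P&gt;To make the arithmetic concrete, here is a minimal Python sketch of the two formulas above. It assumes SLOTS = 4 * CPU_CLK_UNHALTED.THREAD (a 4-wide pipeline, matching the factor of 4 in both formulas) and takes the Retiring and Backend Bound definitions from the Top-Down paper rather than from this thread. The function name and the counter values are made up for illustration.&lt;/P&gt;

```python
def top_level_breakdown(clk, not_delivered, issued, retire_slots,
                        recovery_cycles, width=4):
    """Return the four top-level Top-Down fractions from raw counter values.

    clk             : CPU_CLK_UNHALTED.THREAD
    not_delivered   : IDQ_UOPS_NOT_DELIVERED.CORE
    issued          : UOPS_ISSUED.ANY
    retire_slots    : UOPS_RETIRED.RETIRE_SLOTS
    recovery_cycles : INT_MISC.RECOVERY_CYCLES
    width           : issue slots per cycle (assumed to be 4)
    """
    slots = width * clk
    frontend_bound = not_delivered / slots
    bad_speculation = (issued - retire_slots + width * recovery_cycles) / slots
    # Assumption taken from the Top-Down paper, not from this thread.
    retiring = retire_slots / slots
    # Backend Bound is the remainder, so the four values sum to 1 by construction.
    backend_bound = 1.0 - (frontend_bound + bad_speculation + retiring)
    return {
        "Frontend Bound": frontend_bound,
        "Bad Speculation": bad_speculation,
        "Retiring": retiring,
        "Backend Bound": backend_bound,
    }


# Hypothetical counter values, not measurements.
result = top_level_breakdown(clk=1000, not_delivered=800, issued=2500,
                             retire_slots=2000, recovery_cycles=100)
for name, value in result.items():
    print(f"{name}: {value:.1%}")
```

&lt;P&gt;Because Backend Bound is computed as the remainder in this formulation, the four values always add up to 100%, whether or not the underlying events overlap. The sum alone therefore cannot settle the question.&lt;/P&gt;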

&lt;P&gt;Does IDQ_UOPS_NOT_DELIVERED.CORE keep counting during the cycles counted by INT_MISC.RECOVERY_CYCLES? If so, the two metrics must overlap.&lt;/P&gt;</description>
      <pubDate>Mon, 02 Apr 2018 11:33:25 GMT</pubDate>
      <guid>https://community.intel.com/t5/Analyzers/Frontend-Bound-and-Branch-Mispredicts-will-overlap/m-p/1163632#M17802</guid>
      <dc:creator>Zhiwei_C_</dc:creator>
      <dc:date>2018-04-02T11:33:25Z</dc:date>
    </item>
    <item>
      <title>No they should not overlap.</title>
      <link>https://community.intel.com/t5/Analyzers/Frontend-Bound-and-Branch-Mispredicts-will-overlap/m-p/1163633#M17803</link>
      <description>&lt;P&gt;No, they should not overlap. Frontend Bound + Bad Speculation + Backend Bound + Retiring will still add up to 100%. Some of the stalled pipeline slots will simply be classified under Frontend Bound instead of Bad Speculation, but the "Branch Resteers" node under Frontend Bound will hint that the underlying cause is branch mispredicts.&lt;/P&gt;</description>
      <pubDate>Tue, 03 Apr 2018 07:13:37 GMT</pubDate>
      <guid>https://community.intel.com/t5/Analyzers/Frontend-Bound-and-Branch-Mispredicts-will-overlap/m-p/1163633#M17803</guid>
      <dc:creator>Dmitry_R_Intel1</dc:creator>
      <dc:date>2018-04-03T07:13:37Z</dc:date>
    </item>
    <item>
      <title>Quote:Dmitry Ryabtsev (Intel)</title>
      <link>https://community.intel.com/t5/Analyzers/Frontend-Bound-and-Branch-Mispredicts-will-overlap/m-p/1163634#M17804</link>
      <description>&lt;P&gt;&lt;/P&gt;&lt;BLOCKQUOTE&gt;Dmitry Ryabtsev (Intel) wrote:&lt;BR /&gt;&lt;P&gt;&lt;/P&gt;

&lt;P&gt;No, they should not overlap. Frontend Bound + Bad Speculation + Backend Bound + Retiring will still add up to 100%. Some of the stalled pipeline slots will simply be classified under Frontend Bound instead of Bad Speculation, but the "Branch Resteers" node under Frontend Bound will hint that the underlying cause is branch mispredicts.&lt;/P&gt;

&lt;P&gt;&lt;/P&gt;&lt;/BLOCKQUOTE&gt;&lt;P&gt;&lt;/P&gt;

&lt;P&gt;OK. I can understand your statement that "some of the stalled pipeline slots will be classified as due to Frontend instead of Bad Speculation": recovery must cause many cycles in which the Frontend issues 0 uops to the Backend, and that is reflected in "Branch Resteers".&lt;/P&gt;

&lt;P&gt;But "Bad Speculation" = &lt;STRONG&gt;(UOPS_ISSUED.ANY - UOPS_RETIRED.RETIRE_SLOTS + 4 * INT_MISC.RECOVERY_CYCLES) / SLOTS&lt;/STRONG&gt;.&lt;/P&gt;

&lt;P&gt;The description of INT_MISC.RECOVERY_CYCLES is "Core cycles the Resource allocator was stalled due to &lt;STRONG&gt;recovery from an earlier branch misprediction or machine clear event&lt;/STRONG&gt;".&lt;/P&gt;

&lt;P&gt;I take the "Resource allocator stalled cycles" to mean the cycles from the flush of the mispredicted branch instruction until instructions are taken into the RS again, so they must include the cycles the Frontend spends resteering. I am not sure I understand this correctly.&lt;/P&gt;

&lt;P&gt;I tested the "int_misc_recovery_cycles" and "int_misc_clear_resteer_cycles" counters in several cases and found that "int_misc_recovery_cycles" is always larger than "int_misc_clear_resteer_cycles". So I think "int_misc_recovery_cycles" overlaps with the Frontend stall cycles, which means "Bad Speculation" overlaps with "Frontend Bound". Rather than "stalled pipeline slots will be classified as due to Frontend instead of Bad Speculation", those slots will be classified under both Frontend Bound and Bad Speculation.&lt;/P&gt;
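
&lt;P&gt;One way to probe this from measurements is sketched below in Python. It is only an illustration under stated assumptions: SLOTS = 4 * CPU_CLK_UNHALTED.THREAD, Retiring = UOPS_RETIRED.RETIRE_SLOTS / SLOTS (from the Top-Down paper), and every term counts real issue slots. The function name, result keys and counter values are hypothetical. If Frontend Bound + Bad Speculation + Retiring exceeds 100%, some slots must have been counted more than once. If it does not, the result is inconclusive.&lt;/P&gt;

```python
def double_counting_bounds(clk, not_delivered, issued, retire_slots,
                           recovery_cycles, width=4):
    """Bound the fraction of slots that may be counted by more than one metric.

    Arguments are raw counter values: CPU_CLK_UNHALTED.THREAD,
    IDQ_UOPS_NOT_DELIVERED.CORE, UOPS_ISSUED.ANY,
    UOPS_RETIRED.RETIRE_SLOTS and INT_MISC.RECOVERY_CYCLES.
    """
    slots = width * clk
    frontend = not_delivered / slots
    bad_spec = (issued - retire_slots + width * recovery_cycles) / slots
    retiring = retire_slots / slots
    measured = frontend + bad_spec + retiring
    # Anything above 100% can only come from slots counted more than once.
    at_least = max(0.0, measured - 1.0)
    # Slots shared by Frontend Bound and the recovery term cannot exceed
    # the smaller of the two counts.
    at_most = min(not_delivered, width * recovery_cycles) / slots
    return {"measured_sum": measured, "at_least": at_least, "at_most": at_most}


# Hypothetical counter values, not measurements.
bounds = double_counting_bounds(clk=1000, not_delivered=2400, issued=1500,
                                retire_slots=1000, recovery_cycles=300)
for name, value in bounds.items():
    print(f"{name}: {value:.1%}")
```

&lt;P&gt;With these bounds, a measured sum at or below 100% neither confirms nor rules out an overlap. It only means any overlap is hidden inside the Backend Bound remainder.&lt;/P&gt;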

&lt;P&gt;I don't know if I made any mistakes.&lt;/P&gt;</description>
      <pubDate>Wed, 04 Apr 2018 01:58:22 GMT</pubDate>
      <guid>https://community.intel.com/t5/Analyzers/Frontend-Bound-and-Branch-Mispredicts-will-overlap/m-p/1163634#M17804</guid>
      <dc:creator>Zhiwei_C_</dc:creator>
      <dc:date>2018-04-04T01:58:22Z</dc:date>
    </item>
    <item>
      <title>The  top-level 'Front-End</title>
      <link>https://community.intel.com/t5/Analyzers/Frontend-Bound-and-Branch-Mispredicts-will-overlap/m-p/1163635#M17805</link>
      <description>&lt;P&gt;The top-level 'Front-End Bound' node is based on the IDQ_UOPS_NOT_DELIVERED.CORE event, which is described as follows: "Uops not delivered to Resource Allocation Table (RAT) per thread &lt;STRONG&gt;when backend of the machine is not stalled&lt;/STRONG&gt;". I think the "when backend of the machine is not stalled" condition is what keeps these metrics from overlapping.&lt;/P&gt;</description>
      <pubDate>Wed, 04 Apr 2018 06:29:50 GMT</pubDate>
      <guid>https://community.intel.com/t5/Analyzers/Frontend-Bound-and-Branch-Mispredicts-will-overlap/m-p/1163635#M17805</guid>
      <dc:creator>Dmitry_R_Intel1</dc:creator>
      <dc:date>2018-04-04T06:29:50Z</dc:date>
    </item>
    <item>
      <title>Quote:Dmitry Ryabtsev (Intel)</title>
      <link>https://community.intel.com/t5/Analyzers/Frontend-Bound-and-Branch-Mispredicts-will-overlap/m-p/1163636#M17806</link>
      <description>&lt;P&gt;&lt;/P&gt;&lt;BLOCKQUOTE&gt;Dmitry Ryabtsev (Intel) wrote:&lt;BR /&gt;&lt;P&gt;&lt;/P&gt;

&lt;P&gt;The top-level 'Front-End Bound' node is based on the IDQ_UOPS_NOT_DELIVERED.CORE event, which is described as follows: "Uops not delivered to Resource Allocation Table (RAT) per thread &lt;STRONG&gt;when backend of the machine is not stalled&lt;/STRONG&gt;". I think the "when backend of the machine is not stalled" condition is what keeps these metrics from overlapping.&lt;/P&gt;

&lt;P&gt;&lt;/P&gt;&lt;/BLOCKQUOTE&gt;&lt;P&gt;&lt;/P&gt;

&lt;P&gt;I know that. The description of IDQ_UOPS_NOT_DELIVERED.CORE is: "Count &lt;STRONG&gt;issue pipeline slots&lt;/STRONG&gt; where no uop was delivered from the front end to the back end when there is no back-end stall."&lt;/P&gt;

&lt;P&gt;Now the question is whether the "back-end stall" includes recovery cycles.&lt;/P&gt;

&lt;P&gt;In the "64-ia-32-architectures-optimization-manual", the Front-end Bottleneck is described as follows: "Front-end bottleneck occurs when front-end of the machine is not delivering uops to the back-end and the back-end is not stalled. Cycles where the &lt;STRONG&gt;back-end is not ready to accept micro-ops from the frontend&lt;/STRONG&gt; should not be counted as front-end bottlenecks even though such back-end bottlenecks will cause allocation unit stalls, eventually forcing the front-end to wait until the back-end is ready to receive more uops."&lt;/P&gt;

&lt;P&gt;And the paper "A Top-Down Method for Performance Analysis and Counters Architecture" states: "A backend-stall is a backpressure mechanism the Backend asserts upon resource unavailability (e.g. lack of load buffer entries)."&lt;/P&gt;

&lt;P&gt;So I think that while recovery is in progress, the backend can surely still accept uops from the frontend. That is, IDQ_UOPS_NOT_DELIVERED.CORE will continue to count.&lt;/P&gt;</description>
      <pubDate>Wed, 04 Apr 2018 07:35:47 GMT</pubDate>
      <guid>https://community.intel.com/t5/Analyzers/Frontend-Bound-and-Branch-Mispredicts-will-overlap/m-p/1163636#M17806</guid>
      <dc:creator>Zhiwei_C_</dc:creator>
      <dc:date>2018-04-04T07:35:47Z</dc:date>
    </item>
    <item>
      <title>Well I agree that this is</title>
      <link>https://community.intel.com/t5/Analyzers/Frontend-Bound-and-Branch-Mispredicts-will-overlap/m-p/1163637#M17807</link>
      <description>&lt;P&gt;Well, I agree that this is ambiguous and that the event documentation is not sufficient to give a definite answer. We probably need someone who knows the PMU internals well (I'll try to reach such people, but it may take time).&lt;/P&gt;

&lt;P&gt;Still, I think it is quite probable that IDQ_UOPS_NOT_DELIVERED.CORE is not incremented during recovery, and the current Top-Down formulas seem to assume this.&lt;/P&gt;</description>
      <pubDate>Wed, 04 Apr 2018 08:43:15 GMT</pubDate>
      <guid>https://community.intel.com/t5/Analyzers/Frontend-Bound-and-Branch-Mispredicts-will-overlap/m-p/1163637#M17807</guid>
      <dc:creator>Dmitry_R_Intel1</dc:creator>
      <dc:date>2018-04-04T08:43:15Z</dc:date>
    </item>
    <item>
      <title>Quote:Dmitry Ryabtsev (Intel)</title>
      <link>https://community.intel.com/t5/Analyzers/Frontend-Bound-and-Branch-Mispredicts-will-overlap/m-p/1163638#M17808</link>
      <description>&lt;P&gt;&lt;/P&gt;&lt;BLOCKQUOTE&gt;Dmitry Ryabtsev (Intel) wrote:&lt;BR /&gt;&lt;P&gt;&lt;/P&gt;

&lt;P&gt;Well, I agree that this is ambiguous and that the event documentation is not sufficient to give a definite answer. We probably need someone who knows the PMU internals well (I'll try to reach such people, but it may take time).&lt;/P&gt;

&lt;P&gt;Still, I think it is quite probable that IDQ_UOPS_NOT_DELIVERED.CORE is not incremented during recovery, and the current Top-Down formulas seem to assume this.&lt;/P&gt;

&lt;P&gt;&lt;/P&gt;&lt;/BLOCKQUOTE&gt;&lt;P&gt;&lt;/P&gt;

&lt;P&gt;OK, I'll wait for your reply.&lt;/P&gt;

&lt;P&gt;Thank you for your patient replies :)&lt;/P&gt;</description>
      <pubDate>Wed, 04 Apr 2018 10:03:30 GMT</pubDate>
      <guid>https://community.intel.com/t5/Analyzers/Frontend-Bound-and-Branch-Mispredicts-will-overlap/m-p/1163638#M17808</guid>
      <dc:creator>Zhiwei_C_</dc:creator>
      <dc:date>2018-04-04T10:03:30Z</dc:date>
    </item>
  </channel>
</rss>

