<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic The conditional mask is by EU in OpenCL* for CPU</title>
    <link>https://community.intel.com/t5/OpenCL-for-CPU/Same-instruction-on-all-8-EU/m-p/1108179#M5242</link>
    <description>&lt;P&gt;The conditional mask is per EU thread, not per EU. Each thread can have 1-32 SIMD lanes.&lt;/P&gt;

&lt;P&gt;This is finer granularity than per EU, so what matters for branching is whether the SIMD lanes within one thread take the same path, not whether every thread on an EU or sub-slice does. Each EU typically runs 7 threads. The 2 FPUs per EU could in theory be saturated by only 2 threads, but in practice running 7 gives a better chance of keeping them busy.&lt;/P&gt;

&lt;P&gt;&lt;SPAN class="fontstyle0"&gt;For more info, please see section 5.3.5 "SIMD Code Generation for SPMD Programming Models" in the Gen9 compute architecture documentation:&amp;nbsp;&lt;/SPAN&gt;&lt;A href="https://community.intel.com/legacyfs/online/drupal_files/managed/c5/9a/The-Compute-Architecture-of-Intel-Processor-Graphics-Gen9-v1d0.pdf"&gt;https://software.intel.com/sites/default/files/managed/c5/9a/The-Compute-Architecture-of-Intel-Processor-Graphics-Gen9-v1d0.pdf&lt;/A&gt;.&lt;/P&gt;

</description>
    <pubDate>Fri, 09 Dec 2016 09:51:55 GMT</pubDate>
    <dc:creator>Jeffrey_M_Intel1</dc:creator>
    <dc:date>2016-12-09T09:51:55Z</dc:date>
    <item>
      <title>Same instruction on all 8 EU?</title>
      <link>https://community.intel.com/t5/OpenCL-for-CPU/Same-instruction-on-all-8-EU/m-p/1108178#M5241</link>
      <description>&lt;P&gt;To get peak performance, should all EUs in a single sub-slice issue the same instruction, or is it only within a single EU that the same instruction is needed? At what granularity should I avoid branching?&lt;/P&gt;

&lt;P&gt;Thanks and regards,&lt;/P&gt;

&lt;P&gt;Biren Doshi&lt;/P&gt;

</description>
      <pubDate>Thu, 08 Dec 2016 07:46:12 GMT</pubDate>
      <guid>https://community.intel.com/t5/OpenCL-for-CPU/Same-instruction-on-all-8-EU/m-p/1108178#M5241</guid>
      <dc:creator>Biren_Doshi</dc:creator>
      <dc:date>2016-12-08T07:46:12Z</dc:date>
    </item>
    <item>
      <title>The conditional mask is by EU</title>
      <link>https://community.intel.com/t5/OpenCL-for-CPU/Same-instruction-on-all-8-EU/m-p/1108179#M5242</link>
      <description>&lt;P&gt;The conditional mask is per EU thread, not per EU. Each thread can have 1-32 SIMD lanes.&lt;/P&gt;

&lt;P&gt;This is finer granularity than per EU, so what matters for branching is whether the SIMD lanes within one thread take the same path, not whether every thread on an EU or sub-slice does. Each EU typically runs 7 threads. The 2 FPUs per EU could in theory be saturated by only 2 threads, but in practice running 7 gives a better chance of keeping them busy.&lt;/P&gt;
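
&lt;P&gt;As a rough illustration of a per-thread mask, here is a toy Python model (a sketch, not Intel code and not a performance predictor). It assumes a SIMD width of 8 and an arbitrary cost of 10 issue slots per branch side; thread_cost and kernel_cost are hypothetical helper names. The only rule it encodes is that a thread pays for each side of an if/else that at least one of its lanes takes.&lt;/P&gt;

```python
# Toy model: each EU thread owns its own execution mask over its SIMD lanes.
# SIMD_WIDTH, THEN_COST and ELSE_COST are illustrative assumptions,
# not hardware figures.
SIMD_WIDTH = 8
THEN_COST = 10
ELSE_COST = 10


def thread_cost(conditions):
    """Issue slots one thread spends on an if/else, one boolean per lane."""
    cost = 0
    if any(conditions):        # some lane is enabled for the 'then' side
        cost += THEN_COST
    if not all(conditions):    # some lane is enabled for the 'else' side
        cost += ELSE_COST
    return cost


def kernel_cost(conditions):
    """Split work-items into threads of SIMD_WIDTH lanes and sum their costs."""
    total = 0
    for start in range(0, len(conditions), SIMD_WIDTH):
        total += thread_cost(conditions[start:start + SIMD_WIDTH])
    return total


n = 4 * SIMD_WIDTH
# The branch outcome changes only at thread boundaries: each thread is uniform.
uniform_per_thread = [(i // SIMD_WIDTH) % 2 == 0 for i in range(n)]
# The branch outcome alternates lane by lane: every thread diverges.
divergent_per_lane = [i % 2 == 0 for i in range(n)]

print(kernel_cost(uniform_per_thread))
print(kernel_cost(divergent_per_lane))
```

&lt;P&gt;Under these assumed numbers the first layout costs 40 slots and the second 80, even though both send half of the work-items down each path. In this model, different threads taking different branches adds no cost; only divergence among the lanes of one thread does.&lt;/P&gt;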

&lt;P&gt;&lt;SPAN class="fontstyle0"&gt;For more info, please see section 5.3.5 "SIMD Code Generation for SPMD Programming Models" in the Gen9 compute architecture documentation:&amp;nbsp;&lt;/SPAN&gt;&lt;A href="https://community.intel.com/legacyfs/online/drupal_files/managed/c5/9a/The-Compute-Architecture-of-Intel-Processor-Graphics-Gen9-v1d0.pdf"&gt;https://software.intel.com/sites/default/files/managed/c5/9a/The-Compute-Architecture-of-Intel-Processor-Graphics-Gen9-v1d0.pdf&lt;/A&gt;.&lt;/P&gt;

</description>
      <pubDate>Fri, 09 Dec 2016 09:51:55 GMT</pubDate>
      <guid>https://community.intel.com/t5/OpenCL-for-CPU/Same-instruction-on-all-8-EU/m-p/1108179#M5242</guid>
      <dc:creator>Jeffrey_M_Intel1</dc:creator>
      <dc:date>2016-12-09T09:51:55Z</dc:date>
    </item>
  </channel>
</rss>

