<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic The CPU implementation is in OpenCL* for CPU</title>
    <link>https://community.intel.com/t5/OpenCL-for-CPU/CPU-as-OpenCL-device-running-in-a-sperated-process/m-p/1117743#M5431</link>
    <description>&lt;P&gt;The CPU implementation is automatically parallelized by Intel Threading Building Blocks (TBB). &amp;nbsp;This is one of the advantages of using it -- you get access to the sophisticated multi-threading capabilities of this rich library for free.&lt;/P&gt;

&lt;P&gt;If you run the CapsBasic sample (platform/device capabilities viewer) you will see something like this for your OpenCL CPU implementation:&lt;/P&gt;

&lt;P&gt;CL_DEVICE_TYPE_CPU[0]&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; CL_DEVICE_NAME: Intel(R) Core(TM) i5-5300U CPU @ 2.30GHz&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; CL_DEVICE_AVAILABLE: 1&lt;BR /&gt;
	&amp;nbsp;...&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; CL_DEVICE_MAX_COMPUTE_UNITS: 4&lt;/P&gt;

&lt;P&gt;For this processor, that means OpenCL will schedule work across all 4 logical cores (2 physical cores with Hyper-Threading) by default.&lt;/P&gt;

&lt;P&gt;For the CPU implementation it is possible to use only a subset of cores through "device fission".&amp;nbsp;&lt;SPAN style="font-size: 1em; line-height: 1.5;"&gt;&lt;A href="https://software.intel.com/en-us/articles/opencl-device-fission-for-cpu-performance" target="_blank"&gt;https://software.intel.com/en-us/articles/opencl-device-fission-for-cpu-performance&lt;/A&gt;.&lt;/SPAN&gt;&lt;/P&gt;

&lt;P&gt;&lt;SPAN style="font-size: 1em; line-height: 1.5;"&gt;Another option, if you want more control over which cores are used, is to move the kernel code into TBB or OpenMP instead.&lt;/SPAN&gt;&lt;/P&gt;

</description>
    <pubDate>Sun, 04 Sep 2016 20:54:48 GMT</pubDate>
    <dc:creator>Jeffrey_M_Intel1</dc:creator>
    <dc:date>2016-09-04T20:54:48Z</dc:date>
    <item>
      <title>CPU as OpenCL device running in a separate process?</title>
      <link>https://community.intel.com/t5/OpenCL-for-CPU/CPU-as-OpenCL-device-running-in-a-sperated-process/m-p/1117742#M5430</link>
      <description>&lt;P&gt;I run a program (process A) with two threads on my Intel Xeon CPU E5-1620 v2. Thread 1 starts an OpenCL application that uses the CPU as its device; thread 2 does some calculations.&lt;/P&gt;

&lt;P&gt;I noticed that the performance of thread 2 suffers while the OpenCL application started by thread 1 is executing.&lt;/P&gt;

&lt;P&gt;So I concluded that the OpenCL application run by thread 1 starts a new process on the CPU (process B), and that processes A and B are then scheduled against each other by the operating system. This would explain why the performance of thread 2 suffers.&lt;/P&gt;

&lt;P&gt;I could not find any documentation that confirms my conclusion.&lt;/P&gt;

&lt;P&gt;Is my conclusion correct and, more importantly, is there any documentation about it?&lt;/P&gt;

</description>
      <pubDate>Wed, 31 Aug 2016 09:41:07 GMT</pubDate>
      <guid>https://community.intel.com/t5/OpenCL-for-CPU/CPU-as-OpenCL-device-running-in-a-sperated-process/m-p/1117742#M5430</guid>
      <dc:creator>Harald_S_</dc:creator>
      <dc:date>2016-08-31T09:41:07Z</dc:date>
    </item>
    <item>
      <title>The CPU implementation is</title>
      <link>https://community.intel.com/t5/OpenCL-for-CPU/CPU-as-OpenCL-device-running-in-a-sperated-process/m-p/1117743#M5431</link>
      <description>&lt;P&gt;The CPU implementation is automatically parallelized by Intel Threading Building Blocks (TBB). &amp;nbsp;This is one of the advantages of using it -- you get access to the sophisticated multi-threading capabilities of this rich library for free.&lt;/P&gt;

&lt;P&gt;If you run the CapsBasic sample (platform/device capabilities viewer) you will see something like this for your OpenCL CPU implementation:&lt;/P&gt;

&lt;P&gt;CL_DEVICE_TYPE_CPU[0]&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; CL_DEVICE_NAME: Intel(R) Core(TM) i5-5300U CPU @ 2.30GHz&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; CL_DEVICE_AVAILABLE: 1&lt;BR /&gt;
	&amp;nbsp;...&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; CL_DEVICE_MAX_COMPUTE_UNITS: 4&lt;/P&gt;

&lt;P&gt;For this processor, that means OpenCL will schedule work across all 4 logical cores (2 physical cores with Hyper-Threading) by default.&lt;/P&gt;

&lt;P&gt;For the CPU implementation it is possible to use only a subset of cores through "device fission".&amp;nbsp;&lt;SPAN style="font-size: 1em; line-height: 1.5;"&gt;&lt;A href="https://software.intel.com/en-us/articles/opencl-device-fission-for-cpu-performance" target="_blank"&gt;https://software.intel.com/en-us/articles/opencl-device-fission-for-cpu-performance&lt;/A&gt;.&lt;/SPAN&gt;&lt;/P&gt;
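
&lt;P&gt;A rough sketch of how the property list for device fission might be put together, assuming the CL_DEVICE_PARTITION_BY_COUNTS partition type from OpenCL 1.2. The helper name by_counts_properties and its parameters are made up for illustration; the two constants are copied from the Khronos cl.h header. In a real host program the resulting list would be passed to clCreateSubDevices, which this sketch does not call.&lt;/P&gt;

&lt;PRE&gt;
```python
# Constants mirrored from the Khronos cl.h header (OpenCL 1.2).
CL_DEVICE_PARTITION_BY_COUNTS = 0x1087
CL_DEVICE_PARTITION_BY_COUNTS_LIST_END = 0x0


def by_counts_properties(max_compute_units, reserved_for_host):
    """Build the zero-terminated property list for one sub-device.

    Hypothetical helper (not part of any OpenCL binding): the list it
    returns follows the layout clCreateSubDevices expects for
    CL_DEVICE_PARTITION_BY_COUNTS.
    """
    # A valid reservation leaves at least one compute unit for OpenCL.
    if reserved_for_host not in range(0, max_compute_units):
        raise ValueError("must leave at least one compute unit for OpenCL")
    opencl_units = max_compute_units - reserved_for_host
    return [
        CL_DEVICE_PARTITION_BY_COUNTS,
        opencl_units,                            # size of the single sub-device
        CL_DEVICE_PARTITION_BY_COUNTS_LIST_END,  # ends the list of counts
        0,                                       # ends the property list
    ]


if __name__ == "__main__":
    # 4 compute units, as in the CapsBasic output; keep 1 out of the sub-device.
    print([hex(v) for v in by_counts_properties(4, 1)])
```
&lt;/PRE&gt;

&lt;P&gt;Leaving one compute unit out of the sub-device means the OpenCL runtime has one logical core fewer to schedule onto, which may help a separate host thread like the one in the question. Whether it actually does depends on how the operating system places the threads.&lt;/P&gt;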

&lt;P&gt;&lt;SPAN style="font-size: 1em; line-height: 1.5;"&gt;Another option, if you want more control over which cores are used, is to move the kernel code into TBB or OpenMP instead.&lt;/SPAN&gt;&lt;/P&gt;

</description>
      <pubDate>Sun, 04 Sep 2016 20:54:48 GMT</pubDate>
      <guid>https://community.intel.com/t5/OpenCL-for-CPU/CPU-as-OpenCL-device-running-in-a-sperated-process/m-p/1117743#M5431</guid>
      <dc:creator>Jeffrey_M_Intel1</dc:creator>
      <dc:date>2016-09-04T20:54:48Z</dc:date>
    </item>
  </channel>
</rss>

