<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>Performance issue with clCreateKernel() in OpenCL* for CPU</title>
    <link>https://community.intel.com/t5/OpenCL-for-CPU/Performance-issue-with-clCreateKernel/m-p/1157898#M6230</link>
    <description>&lt;P&gt;Hello,&lt;/P&gt;

&lt;P&gt;On the GPU side at least, I think we've fixed this problem in our latest internal drivers.&amp;nbsp; That won't help you right now, since the optimization isn't in our latest public drivers yet, but it will be in the next major driver release, so stay tuned.&lt;/P&gt;

&lt;P&gt;In the meantime, recent public drivers for recent GPUs do have an optimization for clCreateKernel, but it only kicks in for a slightly different pattern than the one your app uses.&amp;nbsp; Basically, if you measure (clCreateKernel + clReleaseKernel) x 10000, releasing each kernel before creating the next, instead of just clCreateKernel x 10000, you should see better performance.&amp;nbsp; This is a pattern we've seen used by OpenCV, for example.&lt;/P&gt;

&lt;P&gt;Note that to see an improvement you may need newer drivers than the ones in your report, which are a bit old.&amp;nbsp; I tested on an HD Graphics 5500 device with drivers 20.19.15.4463, if that helps.&lt;/P&gt;</description>
    <pubDate>Thu, 16 Nov 2017 21:05:23 GMT</pubDate>
    <dc:creator>Ben_A_Intel</dc:creator>
    <dc:date>2017-11-16T21:05:23Z</dc:date>
    <item>
      <title>Performance issue with clCreateKernel()</title>
      <link>https://community.intel.com/t5/OpenCL-for-CPU/Performance-issue-with-clCreateKernel/m-p/1157897#M6229</link>
      <description>&lt;P&gt;Hello,&lt;/P&gt;

&lt;P&gt;I am experiencing performance issues with the Intel OpenCL drivers when calling clCreateKernel().&lt;BR /&gt;
	Attached is test code (using Boost Compute) with reference values taken from various OpenCL devices.&lt;BR /&gt;
	In these measurements, the Intel drivers show by far the worst results, slower by a factor of 10 to 600 than other vendors' drivers.&lt;BR /&gt;
	Please let me know what further details you need, or where I can file a bug report, to get this fixed.&lt;/P&gt;

&lt;P&gt;Best regards&lt;/P&gt;</description>
      <pubDate>Thu, 16 Nov 2017 15:52:15 GMT</pubDate>
      <guid>https://community.intel.com/t5/OpenCL-for-CPU/Performance-issue-with-clCreateKernel/m-p/1157897#M6229</guid>
      <dc:creator>dstarke</dc:creator>
      <dc:date>2017-11-16T15:52:15Z</dc:date>
    </item>
    <item>
      <title>Re: Performance issue with clCreateKernel()</title>
      <link>https://community.intel.com/t5/OpenCL-for-CPU/Performance-issue-with-clCreateKernel/m-p/1157898#M6230</link>
      <description>&lt;P&gt;Hello,&lt;/P&gt;

&lt;P&gt;On the GPU side at least, I think we've fixed this problem in our latest internal drivers.&amp;nbsp; That won't help you right now, since the optimization isn't in our latest public drivers yet, but it will be in the next major driver release, so stay tuned.&lt;/P&gt;

&lt;P&gt;In the meantime, recent public drivers for recent GPUs do have an optimization for clCreateKernel, but it only kicks in for a slightly different pattern than the one your app uses.&amp;nbsp; Basically, if you measure (clCreateKernel + clReleaseKernel) x 10000, releasing each kernel before creating the next, instead of just clCreateKernel x 10000, you should see better performance.&amp;nbsp; This is a pattern we've seen used by OpenCV, for example.&lt;/P&gt;
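
&lt;P&gt;A rough sketch of the two measurement patterns above, written in Python for brevity.&amp;nbsp; The callables create_kernel and release_kernel are hypothetical stand-ins for whatever wraps clCreateKernel and clReleaseKernel in your code; they are not part of any real API, and the stubs in the demo exist only so the sketch runs without an OpenCL runtime.&lt;/P&gt;

```python
import time


def time_create_only(create_kernel, iterations):
    """Pattern A: create `iterations` kernels and keep them all alive."""
    kernels = []
    start = time.perf_counter()
    for _ in range(iterations):
        kernels.append(create_kernel())
    elapsed = time.perf_counter() - start
    # The kernels are returned so the caller can release them afterwards.
    return elapsed, kernels


def time_create_and_release(create_kernel, release_kernel, iterations):
    """Pattern B: release each kernel before creating the next one."""
    start = time.perf_counter()
    for _ in range(iterations):
        kernel = create_kernel()
        release_kernel(kernel)
    return time.perf_counter() - start


if __name__ == "__main__":
    # Hypothetical stubs: replace these with real wrappers around
    # clCreateKernel and clReleaseKernel to measure an actual driver.
    def stub_create():
        return object()

    def stub_release(kernel):
        del kernel

    iterations = 10000
    create_only, kept = time_create_only(stub_create, iterations)
    for kernel in kept:
        stub_release(kernel)
    create_release = time_create_and_release(stub_create, stub_release, iterations)
    print("create only:      %.6f s" % create_only)
    print("create + release: %.6f s" % create_release)
```

&lt;P&gt;Run both timings with the same iteration count and compare the elapsed times; with the stubs, the numbers say nothing about any driver.&lt;/P&gt;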

&lt;P&gt;Note that to see an improvement you may need newer drivers than the ones in your report, which are a bit old.&amp;nbsp; I tested on an HD Graphics 5500 device with drivers 20.19.15.4463, if that helps.&lt;/P&gt;</description>
      <pubDate>Thu, 16 Nov 2017 21:05:23 GMT</pubDate>
      <guid>https://community.intel.com/t5/OpenCL-for-CPU/Performance-issue-with-clCreateKernel/m-p/1157898#M6230</guid>
      <dc:creator>Ben_A_Intel</dc:creator>
      <dc:date>2017-11-16T21:05:23Z</dc:date>
    </item>
    <item>
      <title>Re: Performance issue with clCreateKernel()</title>
      <link>https://community.intel.com/t5/OpenCL-for-CPU/Performance-issue-with-clCreateKernel/m-p/1157899#M6231</link>
      <description>&lt;P&gt;Hello,&lt;/P&gt;

&lt;P&gt;Thank you for the information regarding the GPU drivers.&lt;BR /&gt;
	What about the CPU side? That is actually what I am more interested in. The Xeon in particular showed really bad results here compared with, for example, the open source CPU backend Oclgrind.&lt;/P&gt;</description>
      <pubDate>Fri, 17 Nov 2017 05:43:51 GMT</pubDate>
      <guid>https://community.intel.com/t5/OpenCL-for-CPU/Performance-issue-with-clCreateKernel/m-p/1157899#M6231</guid>
      <dc:creator>dstarke</dc:creator>
      <dc:date>2017-11-17T05:43:51Z</dc:date>
    </item>
    <item>
      <title>Re: Performance issue with clCreateKernel()</title>
      <link>https://community.intel.com/t5/OpenCL-for-CPU/Performance-issue-with-clCreateKernel/m-p/1157900#M6232</link>
      <description>&lt;P&gt;Hello,&lt;/P&gt;

&lt;P&gt;So what is the status or roadmap for this issue on the CPU?&lt;BR /&gt;
	My code in question spends more than 10% of its execution time in the clCreateKernel function on the Xeon CPU with the current driver version.&lt;/P&gt;</description>
      <pubDate>Sat, 25 Nov 2017 12:58:58 GMT</pubDate>
      <guid>https://community.intel.com/t5/OpenCL-for-CPU/Performance-issue-with-clCreateKernel/m-p/1157900#M6232</guid>
      <dc:creator>dstarke</dc:creator>
      <dc:date>2017-11-25T12:58:58Z</dc:date>
    </item>
  </channel>
</rss>

