<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Dear Xin, in OpenCL* for CPU</title>
    <link>https://community.intel.com/t5/OpenCL-for-CPU/kernel-vector-vector-return-the-right-result-only-if-vector-s/m-p/1060053#M4191</link>
    <description>&lt;P&gt;Dear Xin,&lt;/P&gt;

&lt;P&gt;We just recently published a sample: &lt;A href="https://software.intel.com/en-us/articles/sgemm-for-intel-processor-graphics"&gt;https://software.intel.com/en-us/articles/sgemm-for-intel-processor-graphics&lt;/A&gt; on how to do SGEMM on Intel Processor Graphics. Unfortunately, we don't have a full-blown BLAS library optimized for it yet.&lt;/P&gt;</description>
    <pubDate>Wed, 26 Aug 2015 17:54:53 GMT</pubDate>
    <dc:creator>Robert_I_Intel</dc:creator>
    <dc:date>2015-08-26T17:54:53Z</dc:date>
    <item>
      <title>kernel “vector + vector”, return the right result only if vector's length is a multiple of 64</title>
      <link>https://community.intel.com/t5/OpenCL-for-CPU/kernel-vector-vector-return-the-right-result-only-if-vector-s/m-p/1060050#M4188</link>
      <description>&lt;P&gt;&lt;SPAN style="font-size: 1em; line-height: 1.5;"&gt;I'm new to OpenCL and am trying to run a “vector + vector” kernel. I get the right result only if the vector's length is a multiple of 64. For example, I get the output below when I set the length to 16.&lt;/SPAN&gt;&lt;/P&gt;

&lt;P&gt;No protocol specified&lt;BR /&gt;
	platform 1: vendor 'Intel(R) Corporation'&lt;BR /&gt;
	&amp;nbsp;device 0: 'Intel(R) HD Graphics'&lt;BR /&gt;
	0 + 16 = 0&lt;BR /&gt;
	1 + 15 = 0&lt;BR /&gt;
	2 + 14 = 0&lt;BR /&gt;
	3 + 13 = 0&lt;BR /&gt;
	4 + 12 = 0&lt;BR /&gt;
	5 + 11 = 0&lt;BR /&gt;
	6 + 10 = 0&lt;BR /&gt;
	7 + 9 = 0&lt;BR /&gt;
	8 + 8 = 0&lt;BR /&gt;
	9 + 7 = 0&lt;BR /&gt;
	10 + 6 = 0&lt;BR /&gt;
	11 + 5 = 0&lt;BR /&gt;
	12 + 4 = 0&lt;BR /&gt;
	13 + 3 = 0&lt;BR /&gt;
	14 + 2 = 0&lt;BR /&gt;
	15 + 1 = 0&lt;/P&gt;

&lt;P&gt;The code comes from this tutorial: http://www.eriksmistad.no/getting-started-with-opencl-and-gpu-computing/&lt;/P&gt;

&lt;P&gt;&lt;SPAN style="font-size: 1em; line-height: 1.5;"&gt;Environment:&lt;/SPAN&gt;&lt;/P&gt;

&lt;UL&gt;
	&lt;LI&gt;CentOS 7.1&lt;/LI&gt;
	&lt;LI&gt;i7 4790&lt;/LI&gt;
	&lt;LI&gt;OpenCL 1.2&lt;/LI&gt;
	&lt;LI&gt;SDK: Intel SDK 2015 Production16.4.2.1 from Intel Media Server Studio Community version.&lt;/LI&gt;
&lt;/UL&gt;</description>
      <pubDate>Tue, 25 Aug 2015 02:46:36 GMT</pubDate>
      <guid>https://community.intel.com/t5/OpenCL-for-CPU/kernel-vector-vector-return-the-right-result-only-if-vector-s/m-p/1060050#M4188</guid>
      <dc:creator>Xin_Q_Intel</dc:creator>
      <dc:date>2015-08-25T02:46:36Z</dc:date>
    </item>
    <item>
      <title>Dear Xin,</title>
      <link>https://community.intel.com/t5/OpenCL-for-CPU/kernel-vector-vector-return-the-right-result-only-if-vector-s/m-p/1060051#M4189</link>
      <description>&lt;P&gt;Dear Xin,&lt;/P&gt;

&lt;P&gt;The code in question has a couple of defects:&lt;/P&gt;

&lt;P&gt;1. It does not check whether the return code ret actually indicates success. If it did, your program would terminate at line 84 (while attempting to call clEnqueueNDRangeKernel), since your global size (16) is less than your local size (64), and in OpenCL 1.2 the global size must be evenly divisible by the local size.&lt;/P&gt;

&lt;P&gt;2. If you correct the program by setting local_item_size to 16, 8, 4, 2 or 1, it will run correctly.&lt;/P&gt;

&lt;P&gt;3. Alternatively, you could pass NULL (0) instead of &amp;amp;local_item_size as the local work size argument and let the runtime pick the local size for you.&lt;/P&gt;
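
&lt;P&gt;The divisibility rule behind points 1 and 2 can be sketched in plain Python. This is an illustration only, not OpenCL host code: is_valid_local_size and pick_local_size are hypothetical helper names, and the default cap of 64 simply mirrors the local size used in the code in question rather than any real device limit.&lt;/P&gt;

&lt;PRE&gt;
```python
def is_valid_local_size(global_size, local_size):
    """True when local_size is non-zero and divides global_size evenly."""
    return local_size != 0 and global_size % local_size == 0


def pick_local_size(global_size, preferred=64):
    """Largest size no bigger than preferred that divides global_size evenly.

    preferred is an assumed cap standing in for a device limit.
    """
    # Walk down from the cap (or from global_size, if that is smaller)
    # until a candidate divides the global size with no remainder.
    for candidate in range(min(preferred, global_size), 0, -1):
        if global_size % candidate == 0:
            return candidate
    raise ValueError("global_size must be a positive integer")


if __name__ == "__main__":
    # The failing case from this thread: 16 work items, local size 64.
    print(is_valid_local_size(16, 64))
    # A local size that would be accepted for 16 work items.
    print(pick_local_size(16))
```
&lt;/PRE&gt;

&lt;P&gt;With a global size of 16 the check rejects a local size of 64, and the helper falls back to a size that divides 16. A real program would still need to compare ret against CL_SUCCESS after each call.&lt;/P&gt;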

&lt;P&gt;Anyway, the code in question will not perform very well on Intel(R) Processor Graphics. Please see the following article for a better example:&lt;/P&gt;

&lt;P&gt;&lt;A href="https://software.intel.com/en-us/articles/getting-the-most-from-opencl-12-how-to-increase-performance-by-minimizing-buffer-copies-on-intel-processor-graphics"&gt;https://software.intel.com/en-us/articles/getting-the-most-from-opencl-12-how-to-increase-performance-by-minimizing-buffer-copies-on-intel-processor-graphics&lt;/A&gt;&lt;/P&gt;

</description>
      <pubDate>Tue, 25 Aug 2015 21:56:32 GMT</pubDate>
      <guid>https://community.intel.com/t5/OpenCL-for-CPU/kernel-vector-vector-return-the-right-result-only-if-vector-s/m-p/1060051#M4189</guid>
      <dc:creator>Robert_I_Intel</dc:creator>
      <dc:date>2015-08-25T21:56:32Z</dc:date>
    </item>
    <item>
      <title>Dear Robert, </title>
      <link>https://community.intel.com/t5/OpenCL-for-CPU/kernel-vector-vector-return-the-right-result-only-if-vector-s/m-p/1060052#M4190</link>
      <description>&lt;P&gt;Dear Robert,&amp;nbsp;&lt;/P&gt;

&lt;P&gt;Thanks for your help, it performs correctly now.&lt;/P&gt;

&lt;P&gt;BTW, I want to do some matrix computations using OpenCL. Do you know of a BLAS library that runs well on our Intel(R) Processor Graphics? I have tried AMD's clBLAS, but the performance is quite bad.&lt;/P&gt;</description>
      <pubDate>Wed, 26 Aug 2015 02:23:39 GMT</pubDate>
      <guid>https://community.intel.com/t5/OpenCL-for-CPU/kernel-vector-vector-return-the-right-result-only-if-vector-s/m-p/1060052#M4190</guid>
      <dc:creator>Xin_Q_Intel</dc:creator>
      <dc:date>2015-08-26T02:23:39Z</dc:date>
    </item>
    <item>
      <title>Dear Xin,</title>
      <link>https://community.intel.com/t5/OpenCL-for-CPU/kernel-vector-vector-return-the-right-result-only-if-vector-s/m-p/1060053#M4191</link>
      <description>&lt;P&gt;Dear Xin,&lt;/P&gt;

&lt;P&gt;We just recently published a sample: &lt;A href="https://software.intel.com/en-us/articles/sgemm-for-intel-processor-graphics"&gt;https://software.intel.com/en-us/articles/sgemm-for-intel-processor-graphics&lt;/A&gt; on how to do SGEMM on Intel Processor Graphics. Unfortunately, we don't have a full-blown BLAS library optimized for it yet.&lt;/P&gt;</description>
      <pubDate>Wed, 26 Aug 2015 17:54:53 GMT</pubDate>
      <guid>https://community.intel.com/t5/OpenCL-for-CPU/kernel-vector-vector-return-the-right-result-only-if-vector-s/m-p/1060053#M4191</guid>
      <dc:creator>Robert_I_Intel</dc:creator>
      <dc:date>2015-08-26T17:54:53Z</dc:date>
    </item>
  </channel>
</rss>

