<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Hi Eug, in Intel® oneAPI DPC++/C++ Compiler</title>
    <link>https://community.intel.com/t5/Intel-oneAPI-DPC-C-Compiler/Segmentation-fault-sort-function-on-gpu-selector/m-p/1173485#M298</link>
    <description>&lt;P&gt;Hi Eug,&lt;/P&gt;&lt;P&gt;I tried the code you provided and got the same SEGFAULT when using gpu_selector for the q2 queue. Across multiple compilations and runs, the SEGFAULT did not occur every time.&lt;/P&gt;&lt;P&gt;The program itself runs synchronously on the host, but when you use a queue, the command groups submitted to it execute asynchronously. If there are no dependencies such as memory objects (buffers) or other kernels, control returns to the host before the submitted work has finished, and the host goes on to the next queue. Calling q.wait() once the work for the first queue has been submitted blocks the host until that work completes, and with that in place the SEGFAULT no longer occurs when the next queue is used.&lt;/P&gt;&lt;P&gt;The code below is what I tried, and it works with every combination of device selectors:&lt;/P&gt;
&lt;PRE class="brush:cpp; class-name:dark;"&gt;#include &amp;lt;iostream&amp;gt;
#include &amp;lt;CL/sycl.hpp&amp;gt;
#include &amp;lt;dpstd/execution&amp;gt;
#include &amp;lt;dpstd/algorithm&amp;gt;
#include &amp;lt;dpstd/iterators.h&amp;gt;
using namespace sycl;

int main(int argc, char **argv) {

   cl::sycl::queue q(gpu_selector{});
   cl::sycl::queue q2(cpu_selector{});

   auto policy = dpstd::execution::make_device_policy&amp;lt;class Fill&amp;gt;( q );
   auto policy2 = dpstd::execution::make_device_policy&amp;lt;class Fill2&amp;gt;( q2 );

   const int n = 10000;

   buffer&amp;lt;int&amp;gt; vals_buf{n};
   auto vals_begin = dpstd::begin(vals_buf);
   auto counting_begin = dpstd::counting_iterator&amp;lt;int&amp;gt;{0};
   // Fill vals_buf with n - (i / 2) * 2 for i in [0, n) on the device behind q
   std::transform(policy, counting_begin, counting_begin + n, vals_begin, [n](int i) { return n - (i / 2) * 2; });

   std::sort(policy, vals_begin, vals_begin + n);
   std::cout &amp;lt;&amp;lt; q.get_device().get_info&amp;lt;info::device::name&amp;gt;() &amp;lt;&amp;lt; std::endl;

   // Block the host until all work submitted to q has finished
   q.wait();

   cl::sycl::buffer&amp;lt;int&amp;gt; buf2 { 1000 };
   auto buf_begin2 = dpstd::begin(buf2);
   auto buf_end2   = dpstd::end(buf2);

   std::fill(policy2, buf_begin2, buf_end2, 42);
   std::cout &amp;lt;&amp;lt; q2.get_device().get_info&amp;lt;info::device::name&amp;gt;() &amp;lt;&amp;lt; std::endl;
}
&lt;/PRE&gt;

&lt;P&gt;Please go through the code and let us know if you still face the same issue.&lt;/P&gt;
&lt;P&gt;I have also attached the screenshot of the output for more details.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Warm Regards,&lt;/P&gt;
&lt;P&gt;Abhishek&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
    <pubDate>Wed, 29 Apr 2020 12:11:16 GMT</pubDate>
    <dc:creator>AbhishekD_Intel</dc:creator>
    <dc:date>2020-04-29T12:11:16Z</dc:date>
    <item>
      <title>Segmentation fault sort function on gpu selector</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-DPC-C-Compiler/Segmentation-fault-sort-function-on-gpu-selector/m-p/1173484#M297</link>
      <description>&lt;P&gt;Hello,&lt;/P&gt;&lt;P&gt;I'm trying to understand how to execute different algorithms on different devices.&lt;/P&gt;&lt;P&gt;I have a very simple program divided into two independent parts in the same source file: the first fills a buffer and then sorts it on a CPU device; the second just fills a new buffer on a GPU device.&lt;/P&gt;&lt;P&gt;The problem is a segmentation fault in the GPU code when I execute the first part on the CPU device and the second on the GPU.&lt;/P&gt;&lt;P&gt;If I execute both on the CPU, everything works (or at least there is no segmentation fault).&lt;/P&gt;&lt;P&gt;If I remove the sort function and execute both on the GPU, it works.&lt;/P&gt;&lt;P&gt;It is not a useful example in itself; it's just meant to help me understand how the different pieces work.&lt;/P&gt;&lt;P&gt;I executed it on DevCloud.&lt;/P&gt;
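&lt;PRE class="brush:cpp; class-name:dark;"&gt;#include &amp;lt;iostream&amp;gt;
#include &amp;lt;CL/sycl.hpp&amp;gt;
#include &amp;lt;dpstd/execution&amp;gt;
#include &amp;lt;dpstd/algorithm&amp;gt;
#include &amp;lt;dpstd/iterators.h&amp;gt;
using namespace sycl;

int main(int argc, char **argv) {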

   cl::sycl::queue q(cpu_selector{});

   const int n = 10000;

   buffer&amp;lt;int&amp;gt; vals_buf{n}; 
	
   auto vals_begin = dpstd::begin(vals_buf);
	
   auto counting_begin = dpstd::counting_iterator&amp;lt;int&amp;gt;{0};
	 
   auto policy = dpstd::execution::make_device_policy&amp;lt;class Fill&amp;gt;( q );
	
   std::transform(policy, counting_begin, counting_begin + n, vals_begin, [n](int i) { return n - (i / 2) * 2; });

   std::sort(policy, vals_begin, vals_begin + n);

   std::cout &amp;lt;&amp;lt; q.get_device().get_info&amp;lt;info::device::name&amp;gt;() &amp;lt;&amp;lt; std::endl;

   cl::sycl::queue q2(gpu_selector{});

   cl::sycl::buffer&amp;lt;int&amp;gt; buf2 { 1000 };
   auto buf_begin2 = dpstd::begin(buf2);
   auto buf_end2   = dpstd::end(buf2);
   auto policy2 = dpstd::execution::make_device_policy&amp;lt;class Fill2&amp;gt;( q2 );

   std::fill(policy2, buf_begin2, buf_end2, 42);
   std::cout &amp;lt;&amp;lt; q2.get_device().get_info&amp;lt;info::device::name&amp;gt;() &amp;lt;&amp;lt; std::endl;
}&lt;/PRE&gt;

&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 28 Apr 2020 10:45:26 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-DPC-C-Compiler/Segmentation-fault-sort-function-on-gpu-selector/m-p/1173484#M297</guid>
      <dc:creator>eug</dc:creator>
      <dc:date>2020-04-28T10:45:26Z</dc:date>
    </item>
    <item>
      <title>Hi Eug,</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-DPC-C-Compiler/Segmentation-fault-sort-function-on-gpu-selector/m-p/1173485#M298</link>
      <description>&lt;P&gt;Hi Eug,&lt;/P&gt;&lt;P&gt;I tried the code you provided and got the same SEGFAULT when using gpu_selector for the q2 queue. Across multiple compilations and runs, the SEGFAULT did not occur every time.&lt;/P&gt;&lt;P&gt;The program itself runs synchronously on the host, but when you use a queue, the command groups submitted to it execute asynchronously. If there are no dependencies such as memory objects (buffers) or other kernels, control returns to the host before the submitted work has finished, and the host goes on to the next queue. Calling q.wait() once the work for the first queue has been submitted blocks the host until that work completes, and with that in place the SEGFAULT no longer occurs when the next queue is used.&lt;/P&gt;&lt;P&gt;The code below is what I tried, and it works with every combination of device selectors:&lt;/P&gt;
&lt;PRE class="brush:cpp; class-name:dark;"&gt;#include &amp;lt;iostream&amp;gt;
#include &amp;lt;CL/sycl.hpp&amp;gt;
#include &amp;lt;dpstd/execution&amp;gt;
#include &amp;lt;dpstd/algorithm&amp;gt;
#include &amp;lt;dpstd/iterators.h&amp;gt;
using namespace sycl;

int main(int argc, char **argv) {

   cl::sycl::queue q(gpu_selector{});
   cl::sycl::queue q2(cpu_selector{});

   auto policy = dpstd::execution::make_device_policy&amp;lt;class Fill&amp;gt;( q );
   auto policy2 = dpstd::execution::make_device_policy&amp;lt;class Fill2&amp;gt;( q2 );

   const int n = 10000;

   buffer&amp;lt;int&amp;gt; vals_buf{n};
   auto vals_begin = dpstd::begin(vals_buf);
   auto counting_begin = dpstd::counting_iterator&amp;lt;int&amp;gt;{0};
   // Fill vals_buf with n - (i / 2) * 2 for i in [0, n) on the device behind q
   std::transform(policy, counting_begin, counting_begin + n, vals_begin, [n](int i) { return n - (i / 2) * 2; });

   std::sort(policy, vals_begin, vals_begin + n);
   std::cout &amp;lt;&amp;lt; q.get_device().get_info&amp;lt;info::device::name&amp;gt;() &amp;lt;&amp;lt; std::endl;

   // Block the host until all work submitted to q has finished
   q.wait();

   cl::sycl::buffer&amp;lt;int&amp;gt; buf2 { 1000 };
   auto buf_begin2 = dpstd::begin(buf2);
   auto buf_end2   = dpstd::end(buf2);

   std::fill(policy2, buf_begin2, buf_end2, 42);
   std::cout &amp;lt;&amp;lt; q2.get_device().get_info&amp;lt;info::device::name&amp;gt;() &amp;lt;&amp;lt; std::endl;
}
&lt;/PRE&gt;

&lt;P&gt;Please go through the code and let us know if you still face the same issue.&lt;/P&gt;
&lt;P&gt;I have also attached the screenshot of the output for more details.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Warm Regards,&lt;/P&gt;
&lt;P&gt;Abhishek&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 29 Apr 2020 12:11:16 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-DPC-C-Compiler/Segmentation-fault-sort-function-on-gpu-selector/m-p/1173485#M298</guid>
      <dc:creator>AbhishekD_Intel</dc:creator>
      <dc:date>2020-04-29T12:11:16Z</dc:date>
    </item>
    <item>
      <title>Thank you, now it works but</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-DPC-C-Compiler/Segmentation-fault-sort-function-on-gpu-selector/m-p/1173486#M299</link>
      <description>&lt;P&gt;Thank you, now it works, but only if I compile without the debug flag.&lt;/P&gt;&lt;P&gt;The same problem still happens if I compile with the "-g" flag.&lt;/P&gt;</description>
      <pubDate>Wed, 29 Apr 2020 17:03:55 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-DPC-C-Compiler/Segmentation-fault-sort-function-on-gpu-selector/m-p/1173486#M299</guid>
      <dc:creator>eug</dc:creator>
      <dc:date>2020-04-29T17:03:55Z</dc:date>
    </item>
    <item>
      <title>Hi,</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-DPC-C-Compiler/Segmentation-fault-sort-function-on-gpu-selector/m-p/1173487#M300</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;We also get the same error when using the -g flag, and we have escalated it to the relevant team.&lt;/P&gt;&lt;P&gt;You should get a reply from them soon.&lt;/P&gt;&lt;P&gt;Thank you for your findings.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Warm Regards,&lt;/P&gt;&lt;P&gt;Abhishek&lt;/P&gt;</description>
      <pubDate>Tue, 05 May 2020 09:24:34 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-DPC-C-Compiler/Segmentation-fault-sort-function-on-gpu-selector/m-p/1173487#M300</guid>
      <dc:creator>AbhishekD_Intel</dc:creator>
      <dc:date>2020-05-05T09:24:34Z</dc:date>
    </item>
    <item>
      <title>Hi Eug,</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-DPC-C-Compiler/Segmentation-fault-sort-function-on-gpu-selector/m-p/1173488#M301</link>
      <description>&lt;P&gt;Hi Eug,&lt;/P&gt;&lt;P&gt;There was a known issue in the GPU driver that caused a SEGFAULT when using the -g flag; it is fixed in the latest version of the driver.&lt;/P&gt;&lt;P&gt;I can no longer reproduce this issue, even on DevCloud, which now has the latest version of the driver.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks,&lt;/P&gt;&lt;P&gt;Sravani&lt;/P&gt;</description>
      <pubDate>Wed, 06 May 2020 22:21:17 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-DPC-C-Compiler/Segmentation-fault-sort-function-on-gpu-selector/m-p/1173488#M301</guid>
      <dc:creator>Sravani_K_Intel</dc:creator>
      <dc:date>2020-05-06T22:21:17Z</dc:date>
    </item>
  </channel>
</rss>

