<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Inquiry Regarding Inconsistent Results Despite [[intel::reqd_sub_group_size(32)]] Specification in Migrating to SYCL</title>
    <link>https://community.intel.com/t5/Migrating-to-SYCL/Inquiry-Regarding-Inconsistent-Results-Despite-intel-reqd-sub/m-p/1671812#M331</link>
    <description>&lt;P&gt;Hi, just wondering if you're still seeing the issue? I tested on an Intel Data Center GPU Max 1100 but couldn't reproduce it.&lt;/P&gt;</description>
    <pubDate>Mon, 03 Mar 2025 21:13:51 GMT</pubDate>
    <dc:creator>yzh_intel</dc:creator>
    <dc:date>2025-03-03T21:13:51Z</dc:date>
    <item>
      <title>Inquiry Regarding Inconsistent Results Despite [[intel::reqd_sub_group_size(32)]] Specification</title>
      <link>https://community.intel.com/t5/Migrating-to-SYCL/Inquiry-Regarding-Inconsistent-Results-Despite-intel-reqd-sub/m-p/1599431#M293</link>
      <description>&lt;P&gt;I recently migrated a CUDA project to SYCL and got different results between debug mode and release mode when running in Visual Studio. After investigating, I found that the difference shows up in the "get_sub_group()" call.&lt;/P&gt;&lt;P&gt;Here's the snippet of code I used for testing:&lt;/P&gt;&lt;P&gt;std::cout &amp;lt;&amp;lt; "device name : " &amp;lt;&amp;lt; device.get_name() &amp;lt;&amp;lt; std::endl; // device name : Intel(R) Arc(TM) A370M Graphics&lt;BR /&gt;std::cout &amp;lt;&amp;lt; "Suppose Sub-group Sizes: ";&lt;BR /&gt;for (const auto&amp;amp; s : dev_ct1.get_info&amp;lt;sycl::info::device::sub_group_sizes&amp;gt;()) {&lt;BR /&gt;std::cout &amp;lt;&amp;lt; s &amp;lt;&amp;lt; " ";&lt;BR /&gt;}&lt;BR /&gt;std::cout &amp;lt;&amp;lt; std::endl; // Suppose Sub-group Sizes: 8 16 32&lt;/P&gt;&lt;P&gt;sycl::queue&amp;amp; q = dev_ct1.in_order_queue();&lt;BR /&gt;q.submit([&amp;amp;](sycl::handler&amp;amp; cgh) {&lt;BR /&gt;sycl::stream out(1024 * 1024, 256, cgh);&lt;BR /&gt;cgh.parallel_for(&lt;BR /&gt;sycl::nd_range&amp;lt;3&amp;gt;(sycl::range&amp;lt;3&amp;gt;(1, 1, 32) *&lt;BR /&gt;sycl::range&amp;lt;3&amp;gt;(1, 1, 256),&lt;BR /&gt;sycl::range&amp;lt;3&amp;gt;(1, 1, 256)),&lt;BR /&gt;[=](sycl::nd_item&amp;lt;3&amp;gt; item_ct1)&lt;BR /&gt;[[intel::reqd_sub_group_size(32)]] {&lt;BR /&gt;out &amp;lt;&amp;lt; "Used Sub-group Sizes: " &amp;lt;&amp;lt; item_ct1.get_sub_group().get_local_range() &amp;lt;&amp;lt; sycl::endl; });&lt;BR /&gt;});&lt;/P&gt;&lt;P&gt;When running in debug mode (without code optimization), the output is 16.
However, when running in release mode (optimization level O1 or O2), the output is 32.&lt;/P&gt;&lt;P&gt;Although the sub-group size is set to 32 using [[intel::reqd_sub_group_size(32)]], the output still differs between debug and release modes.&lt;/P&gt;&lt;P&gt;Thank you for your help.&lt;/P&gt;&lt;P&gt;Sincerely&lt;/P&gt;</description>
      <pubDate>Tue, 21 May 2024 13:49:01 GMT</pubDate>
      <guid>https://community.intel.com/t5/Migrating-to-SYCL/Inquiry-Regarding-Inconsistent-Results-Despite-intel-reqd-sub/m-p/1599431#M293</guid>
      <dc:creator>-Light-</dc:creator>
      <dc:date>2024-05-21T13:49:01Z</dc:date>
    </item>
    <item>
      <title>Re: Inquiry Regarding Inconsistent Results Despite [[intel::reqd_sub_group_size(32)]] Specification</title>
      <link>https://community.intel.com/t5/Migrating-to-SYCL/Inquiry-Regarding-Inconsistent-Results-Despite-intel-reqd-sub/m-p/1671812#M331</link>
      <description>&lt;P&gt;Hi, just wondering if you're still seeing the issue? I tested on an Intel Data Center GPU Max 1100 but couldn't reproduce it.&lt;/P&gt;</description>
      <pubDate>Mon, 03 Mar 2025 21:13:51 GMT</pubDate>
      <guid>https://community.intel.com/t5/Migrating-to-SYCL/Inquiry-Regarding-Inconsistent-Results-Despite-intel-reqd-sub/m-p/1671812#M331</guid>
      <dc:creator>yzh_intel</dc:creator>
      <dc:date>2025-03-03T21:13:51Z</dc:date>
    </item>
  </channel>
</rss>

