<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: A slower performance when using multi-devices in Intel® oneAPI DPC++/C++ Compiler</title>
    <link>https://community.intel.com/t5/Intel-oneAPI-DPC-C-Compiler/A-slower-performance-when-using-multi-devices/m-p/1405965#M2414</link>
    <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;We are closing this issue. If you need any additional information, please post a new question as this thread will no longer be monitored by Intel.&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;Thanks,&lt;/P&gt;&lt;P&gt;Santosh&lt;/P&gt;&lt;BR /&gt;</description>
    <pubDate>Thu, 04 Aug 2022 10:21:53 GMT</pubDate>
    <dc:creator>SantoshY_Intel</dc:creator>
    <dc:date>2022-08-04T10:21:53Z</dc:date>
    <item>
      <title>A slower performance when using multi-devices</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-DPC-C-Compiler/A-slower-performance-when-using-multi-devices/m-p/1390646#M2261</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I am developing a multi-device code with DPC++ and MPI; I use USM and shared memory. In my scaling tests, the multi-device performance is worse than the single-device performance. I believe the problem size is large enough that multiple devices should perform better. Does anyone have any advice?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;By the way, is there a way to confirm that I am actually using 16 GPUs when I run the problem with "mpirun -np 16 ./main"? I print the device name from each process, but the names are all identical, which I think is because the devices are the same model.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thanks,&lt;/P&gt;
&lt;P&gt;Chunheng.&lt;/P&gt;</description>
      <pubDate>Tue, 07 Jun 2022 16:12:29 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-DPC-C-Compiler/A-slower-performance-when-using-multi-devices/m-p/1390646#M2261</guid>
      <dc:creator>zchmacchiato</dc:creator>
      <dc:date>2022-06-07T16:12:29Z</dc:date>
    </item>
    <item>
      <title>Re: A slower performance when using multi-devices</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-DPC-C-Compiler/A-slower-performance-when-using-multi-devices/m-p/1390853#M2263</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thank you for posting in Intel Communities.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Could you please provide us with the following details?&lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;The operating system you are using.&lt;/LI&gt;
&lt;LI&gt;The Intel MPI Library &amp;amp; DPC++ versions you are using.&lt;/LI&gt;
&lt;LI&gt;Sample reproducer code and the steps to reproduce your issue on our end (the commands to compile &amp;amp; run the code on multiple devices).&lt;/LI&gt;
&lt;LI&gt;The name of the GPU you are using &amp;amp; the environment details of your cluster.&lt;/LI&gt;
&lt;/OL&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;FONT color="#808080"&gt;&lt;I&gt;&amp;gt;&amp;gt;"I find that the multi-device performance is worse than the single device performance."&lt;/I&gt;&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;Could you please let us know how you are measuring the performance?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thanks &amp;amp; Regards,&lt;/P&gt;
&lt;P&gt;Santosh&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 08 Jun 2022 06:40:15 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-DPC-C-Compiler/A-slower-performance-when-using-multi-devices/m-p/1390853#M2263</guid>
      <dc:creator>SantoshY_Intel</dc:creator>
      <dc:date>2022-06-08T06:40:15Z</dc:date>
    </item>
    <item>
      <title>Re: A slower performance when using multi-devices</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-DPC-C-Compiler/A-slower-performance-when-using-multi-devices/m-p/1390940#M2264</link>
      <description>&lt;P class="p1 sub_section_element_selectors"&gt;&lt;SPAN class="s1 sub_section_element_selectors"&gt;I attach my code below.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class="p1 sub_section_element_selectors"&gt;&lt;SPAN class="s1 sub_section_element_selectors"&gt;I run my code on ThetaGPU.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class="p1 sub_section_element_selectors"&gt;&lt;SPAN class="s1 sub_section_element_selectors"&gt;The system information is:&amp;nbsp;&lt;/SPAN&gt;&lt;SPAN class="s1 sub_section_element_selectors"&gt;#101-Ubuntu SMP Fri Oct 15 20:00:55 UTC 2021&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class="p1 sub_section_element_selectors"&gt;&lt;SPAN class="s1 sub_section_element_selectors"&gt;The GPU I use is: Selected device: NVIDIA A100-SXM4-40GB&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class="p1 sub_section_element_selectors"&gt;&lt;SPAN class="s1 sub_section_element_selectors"&gt;I am not sure which DPC++ or oneAPI version I have, but it is the one for Ubuntu 18.04.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class="p1 sub_section_element_selectors"&gt;&amp;nbsp;&lt;/P&gt;
&lt;P class="p1 sub_section_element_selectors"&gt;&lt;SPAN class="s1 sub_section_element_selectors"&gt;I measure performance in mega lattice updates per second: I run the solver 100 times and take the average time.&lt;/SPAN&gt;&lt;/P&gt;
&lt;P class="p1 sub_section_element_selectors"&gt;&amp;nbsp;&lt;/P&gt;
&lt;P class="p1 sub_section_element_selectors"&gt;&lt;SPAN class="s1 sub_section_element_selectors"&gt;Chunheng.&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 08 Jun 2022 12:25:11 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-DPC-C-Compiler/A-slower-performance-when-using-multi-devices/m-p/1390940#M2264</guid>
      <dc:creator>zchmacchiato</dc:creator>
      <dc:date>2022-06-08T12:25:11Z</dc:date>
    </item>
    <item>
      <title>Re: A slower performance when using multi-devices</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-DPC-C-Compiler/A-slower-performance-when-using-multi-devices/m-p/1391295#M2267</link>
      <description>&lt;P&gt;I have also attached my makefile here.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Chunheng.&lt;/P&gt;</description>
      <pubDate>Thu, 09 Jun 2022 13:52:26 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-DPC-C-Compiler/A-slower-performance-when-using-multi-devices/m-p/1391295#M2267</guid>
      <dc:creator>zchmacchiato</dc:creator>
      <dc:date>2022-06-09T13:52:26Z</dc:date>
    </item>
    <item>
      <title>Re: A slower performance when using multi-devices</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-DPC-C-Compiler/A-slower-performance-when-using-multi-devices/m-p/1403058#M2386</link>
      <description>&lt;P&gt;Hi Chunheng,&lt;/P&gt;&lt;P&gt;Thank you for your inquiry.&amp;nbsp;We offer support for hardware platforms that the Intel® oneAPI product supports.&amp;nbsp;These platforms include those that are part of the Intel® Core™ processor family or higher, the Intel® Xeon® processor family, the Intel® Xeon® Scalable processor family, and others, which are listed here: &lt;A href="https://software.intel.com/content/www/us/en/develop/articles/intel-oneapi-base-toolkit-system-requirements.html" rel="noopener noreferrer" target="_blank"&gt;Intel® oneAPI Base Toolkit System Requirements&lt;/A&gt;, &lt;A href="https://software.intel.com/content/www/us/en/develop/articles/intel-oneapi-hpc-toolkit-system-requirements.html" rel="noopener noreferrer" target="_blank"&gt;Intel® oneAPI HPC Toolkit System Requirements&lt;/A&gt;, &lt;A href="https://software.intel.com/content/www/us/en/develop/articles/intel-oneapi-iot-toolkit-system-requirements.html" rel="noopener noreferrer" target="_blank"&gt;Intel® oneAPI IoT Toolkit System Requirements&lt;/A&gt;.&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-family: Calibri, sans-serif; font-size: 11pt;"&gt;If you wish to use oneAPI on hardware that is not listed on one of the pages above, we encourage you to visit and contribute to the open oneAPI specification: &lt;/SPAN&gt;&lt;A href="https://www.oneapi.io/spec/" rel="noopener noreferrer" target="_blank" style="font-family: Calibri, sans-serif; font-size: 11pt;"&gt;https://www.oneapi.io/spec/&lt;/A&gt;&lt;/P&gt;&lt;P&gt;Best regards,&lt;/P&gt;&lt;P&gt;Jyotsna&lt;/P&gt;&lt;BR /&gt;</description>
      <pubDate>Mon, 25 Jul 2022 11:31:42 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-DPC-C-Compiler/A-slower-performance-when-using-multi-devices/m-p/1403058#M2386</guid>
      <dc:creator>JyotsnaK_Intel</dc:creator>
      <dc:date>2022-07-25T11:31:42Z</dc:date>
    </item>
    <item>
      <title>Re: A slower performance when using multi-devices</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-DPC-C-Compiler/A-slower-performance-when-using-multi-devices/m-p/1405965#M2414</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;We are closing this issue. If you need any additional information, please post a new question as this thread will no longer be monitored by Intel.&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;Thanks,&lt;/P&gt;&lt;P&gt;Santosh&lt;/P&gt;&lt;BR /&gt;</description>
      <pubDate>Thu, 04 Aug 2022 10:21:53 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-DPC-C-Compiler/A-slower-performance-when-using-multi-devices/m-p/1405965#M2414</guid>
      <dc:creator>SantoshY_Intel</dc:creator>
      <dc:date>2022-08-04T10:21:53Z</dc:date>
    </item>
  </channel>
</rss>

