<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Performance problem of MonteCarlo integration in Intel® oneAPI DPC++/C++ Compiler</title>
    <link>https://community.intel.com/t5/Intel-oneAPI-DPC-C-Compiler/Performance-problem-of-MonteCarlo-integration/m-p/1280014#M1142</link>
    <description>&lt;P&gt;Hello, &lt;BR /&gt;&lt;BR /&gt;I am comparing the runtime performance of a simple sample/reject Monte Carlo integration scheme. &lt;BR /&gt;&lt;BR /&gt;The program is run on the following computer: &lt;BR /&gt;model name : Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz &lt;BR /&gt;&lt;BR /&gt;The code is attached to this report. &lt;BR /&gt;./M1.exe M (number of polynomials) N (number of trials) T (number of threads) &lt;BR /&gt;&lt;BR /&gt;With 32 OpenMP threads, the DPCPP-compiled program is approximately 3 times slower than the one compiled with GCC. &lt;BR /&gt;&lt;BR /&gt;dpcpp -O3 -fopenmp -Wall -funroll-loops -ffast-math monte_carlo_integration.cpp -o MC_DPCPP.exe &lt;BR /&gt;time ./MC_DPCPP.exe 1000 10000000 32 &amp;gt; /dev/null &lt;BR /&gt;&lt;BR /&gt;real 1m17.954s &lt;BR /&gt;user 36m22.515s &lt;BR /&gt;sys 0m0.401s &lt;BR /&gt;&lt;BR /&gt;g++ -O3 -fopenmp -funroll-loops -ffast-math -fprofile-use monte_carlo_integration.cpp -o M1.exe &lt;BR /&gt;GCC 11.1 &lt;BR /&gt;./M1_GCC111.exe 1000 10000000 32 &amp;gt; /dev/null &lt;BR /&gt;&lt;BR /&gt;real 0m23.694s &lt;BR /&gt;user 11m8.420s &lt;BR /&gt;sys 0m0.019s &lt;/P&gt;
&lt;P&gt;GCC 8.3.1 &lt;BR /&gt;time ./MC_POLY 1000 10000000 32 &amp;gt; /dev/null&lt;/P&gt;
&lt;P&gt;real 0m26.024s&lt;BR /&gt;user 12m21.249s&lt;BR /&gt;sys 0m0.020s&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Running perf stat on the two binaries gives&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;GCC 8.3.1&lt;BR /&gt;Performance counter stats for './MC_POLY 10 10000000 1':&lt;/P&gt;
&lt;P&gt;5,619.33 msec task-clock:u # 1.000 CPUs utilized&lt;BR /&gt;0 context-switches:u # 0.000 K/sec&lt;BR /&gt;0 cpu-migrations:u # 0.000 K/sec&lt;BR /&gt;167 page-faults:u # 0.030 K/sec&lt;BR /&gt;17,887,915,840 cycles:u # 3.183 GHz&lt;BR /&gt;30,678,358,310 instructions:u # 1.72 insn per cycle&lt;BR /&gt;4,101,348,797 branches:u # 729.864 M/sec&lt;BR /&gt;226,816,326 branch-misses:u # 5.53% of all branches&lt;/P&gt;
&lt;P&gt;5.620014706 seconds time elapsed&lt;/P&gt;
&lt;P&gt;5.609363000 seconds user&lt;BR /&gt;0.001990000 seconds sys&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Performance counter stats for './MC_DPCPP.exe 10 10000000 1':&lt;/P&gt;
&lt;P&gt;15,906.43 msec task-clock:u # 1.000 CPUs utilized&lt;BR /&gt;0 context-switches:u # 0.000 K/sec&lt;BR /&gt;0 cpu-migrations:u # 0.000 K/sec&lt;BR /&gt;651 page-faults:u # 0.041 K/sec&lt;BR /&gt;49,488,407,800 cycles:u # 3.111 GHz&lt;BR /&gt;82,102,796,894 instructions:u # 1.66 insn per cycle&lt;BR /&gt;6,603,124,397 branches:u # 415.123 M/sec&lt;BR /&gt;3,099,931 branch-misses:u # 0.05% of all branches&lt;/P&gt;
&lt;P&gt;15.911192960 seconds time elapsed&lt;/P&gt;</description>
    <pubDate>Sat, 08 May 2021 19:18:39 GMT</pubDate>
    <dc:creator>SandeepKoranne</dc:creator>
    <dc:date>2021-05-08T19:18:39Z</dc:date>
    <item>
      <title>Performance problem of MonteCarlo integration</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-DPC-C-Compiler/Performance-problem-of-MonteCarlo-integration/m-p/1280014#M1142</link>
      <description>&lt;P&gt;Hello, &lt;BR /&gt;&lt;BR /&gt;I am comparing the runtime performance of a simple sample/reject Monte Carlo integration scheme. &lt;BR /&gt;&lt;BR /&gt;The program is run on the following computer: &lt;BR /&gt;model name : Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz &lt;BR /&gt;&lt;BR /&gt;The code is attached to this report. &lt;BR /&gt;./M1.exe M (number of polynomials) N (number of trials) T (number of threads) &lt;BR /&gt;&lt;BR /&gt;With 32 OpenMP threads, the DPCPP-compiled program is approximately 3 times slower than the one compiled with GCC. &lt;BR /&gt;&lt;BR /&gt;dpcpp -O3 -fopenmp -Wall -funroll-loops -ffast-math monte_carlo_integration.cpp -o MC_DPCPP.exe &lt;BR /&gt;time ./MC_DPCPP.exe 1000 10000000 32 &amp;gt; /dev/null &lt;BR /&gt;&lt;BR /&gt;real 1m17.954s &lt;BR /&gt;user 36m22.515s &lt;BR /&gt;sys 0m0.401s &lt;BR /&gt;&lt;BR /&gt;g++ -O3 -fopenmp -funroll-loops -ffast-math -fprofile-use monte_carlo_integration.cpp -o M1.exe &lt;BR /&gt;GCC 11.1 &lt;BR /&gt;./M1_GCC111.exe 1000 10000000 32 &amp;gt; /dev/null &lt;BR /&gt;&lt;BR /&gt;real 0m23.694s &lt;BR /&gt;user 11m8.420s &lt;BR /&gt;sys 0m0.019s &lt;/P&gt;
&lt;P&gt;GCC 8.3.1 &lt;BR /&gt;time ./MC_POLY 1000 10000000 32 &amp;gt; /dev/null&lt;/P&gt;
&lt;P&gt;real 0m26.024s&lt;BR /&gt;user 12m21.249s&lt;BR /&gt;sys 0m0.020s&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Running perf stat on the two binaries gives&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;GCC 8.3.1&lt;BR /&gt;Performance counter stats for './MC_POLY 10 10000000 1':&lt;/P&gt;
&lt;P&gt;5,619.33 msec task-clock:u # 1.000 CPUs utilized&lt;BR /&gt;0 context-switches:u # 0.000 K/sec&lt;BR /&gt;0 cpu-migrations:u # 0.000 K/sec&lt;BR /&gt;167 page-faults:u # 0.030 K/sec&lt;BR /&gt;17,887,915,840 cycles:u # 3.183 GHz&lt;BR /&gt;30,678,358,310 instructions:u # 1.72 insn per cycle&lt;BR /&gt;4,101,348,797 branches:u # 729.864 M/sec&lt;BR /&gt;226,816,326 branch-misses:u # 5.53% of all branches&lt;/P&gt;
&lt;P&gt;5.620014706 seconds time elapsed&lt;/P&gt;
&lt;P&gt;5.609363000 seconds user&lt;BR /&gt;0.001990000 seconds sys&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Performance counter stats for './MC_DPCPP.exe 10 10000000 1':&lt;/P&gt;
&lt;P&gt;15,906.43 msec task-clock:u # 1.000 CPUs utilized&lt;BR /&gt;0 context-switches:u # 0.000 K/sec&lt;BR /&gt;0 cpu-migrations:u # 0.000 K/sec&lt;BR /&gt;651 page-faults:u # 0.041 K/sec&lt;BR /&gt;49,488,407,800 cycles:u # 3.111 GHz&lt;BR /&gt;82,102,796,894 instructions:u # 1.66 insn per cycle&lt;BR /&gt;6,603,124,397 branches:u # 415.123 M/sec&lt;BR /&gt;3,099,931 branch-misses:u # 0.05% of all branches&lt;/P&gt;
&lt;P&gt;15.911192960 seconds time elapsed&lt;/P&gt;</description>
      <pubDate>Sat, 08 May 2021 19:18:39 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-DPC-C-Compiler/Performance-problem-of-MonteCarlo-integration/m-p/1280014#M1142</guid>
      <dc:creator>SandeepKoranne</dc:creator>
      <dc:date>2021-05-08T19:18:39Z</dc:date>
    </item>
    <item>
      <title>Re: Performance problem of MonteCarlo integration</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-DPC-C-Compiler/Performance-problem-of-MonteCarlo-integration/m-p/1280179#M1143</link>
      <description>&lt;P&gt;Hi Sandeep,&lt;/P&gt;&lt;P&gt;Thanks for reaching out to us.&lt;/P&gt;&lt;P&gt;Could you please tell us which version of the DPC++ compiler you are using?&lt;/P&gt;&lt;P&gt;Meanwhile, we will look into this issue internally and get back to you soon.&lt;/P&gt;&lt;P&gt;Regards,&lt;/P&gt;&lt;P&gt;Vidya.&lt;/P&gt;&lt;BR /&gt;</description>
      <pubDate>Mon, 10 May 2021 08:17:22 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-DPC-C-Compiler/Performance-problem-of-MonteCarlo-integration/m-p/1280179#M1143</guid>
      <dc:creator>VidyalathaB_Intel</dc:creator>
      <dc:date>2021-05-10T08:17:22Z</dc:date>
    </item>
    <item>
      <title>Re: Performance problem of MonteCarlo integration</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-DPC-C-Compiler/Performance-problem-of-MonteCarlo-integration/m-p/1280239#M1144</link>
      <description>&lt;P&gt;Thanks, Vidya.&lt;/P&gt;
&lt;P&gt;This is the version I am using:&lt;/P&gt;
&lt;P&gt;Intel(R) oneAPI DPC++ Compiler 2021.2.0 (2021.2.0.20210317)&lt;BR /&gt;Target: x86_64-unknown-linux-gnu&lt;/P&gt;
&lt;P&gt;Regards,&lt;/P&gt;
&lt;P&gt;Sandeep&lt;/P&gt;</description>
      <pubDate>Mon, 10 May 2021 14:25:14 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-DPC-C-Compiler/Performance-problem-of-MonteCarlo-integration/m-p/1280239#M1144</guid>
      <dc:creator>SandeepKoranne</dc:creator>
      <dc:date>2021-05-10T14:25:14Z</dc:date>
    </item>
    <item>
      <title>Re: Performance problem of MonteCarlo integration</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-DPC-C-Compiler/Performance-problem-of-MonteCarlo-integration/m-p/1280527#M1148</link>
      <description>&lt;P&gt;Hi Sandeep,&lt;/P&gt;&lt;P&gt;I've reported this problem to our developers.&lt;/P&gt;&lt;P&gt;Thanks,&lt;/P&gt;</description>
      <pubDate>Tue, 11 May 2021 12:52:55 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-DPC-C-Compiler/Performance-problem-of-MonteCarlo-integration/m-p/1280527#M1148</guid>
      <dc:creator>Viet_H_Intel</dc:creator>
      <dc:date>2021-05-11T12:52:55Z</dc:date>
    </item>
    <item>
      <title>Re: Performance problem of MonteCarlo integration</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-DPC-C-Compiler/Performance-problem-of-MonteCarlo-integration/m-p/1283802#M1173</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;
&lt;P&gt;Is there any update on this issue?&lt;/P&gt;
&lt;P&gt;Even single-threaded performance is much slower (about 3x) than with GCC. Is this because LLVM is unable to optimize lambda functions?&lt;/P&gt;
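&lt;P&gt;The single-threaded gap can be sanity-checked by taking ratios of the perf stat counters from the original post. The sketch below is only illustrative arithmetic on those posted numbers. The helper names (ratios, ipc) are made up for this example, and the calculation does not by itself identify the cause of the slowdown.&lt;/P&gt;

```python
# Counter values copied from the perf stat output in the original post
# (./MC_POLY built with GCC 8.3.1 vs ./MC_DPCPP.exe, arguments "10 10000000 1").
gcc = {"cycles": 17_887_915_840, "instructions": 30_678_358_310, "seconds": 5.620014706}
dpcpp = {"cycles": 49_488_407_800, "instructions": 82_102_796_894, "seconds": 15.911192960}


def ratios(slow, fast):
    """Return slow/fast ratios for every counter the two runs share."""
    return {name: slow[name] / fast[name] for name in fast}


def ipc(run):
    """Instructions retired per cycle for one run."""
    return run["instructions"] / run["cycles"]


if __name__ == "__main__":
    for name, value in ratios(dpcpp, gcc).items():
        print(f"{name:12s} DPCPP/GCC = {value:.2f}x")
    print(f"IPC: GCC {ipc(gcc):.2f}, DPCPP {ipc(dpcpp):.2f}")
```

&lt;P&gt;On the posted numbers, the DPCPP binary takes roughly 2.8x the cycles and executes roughly 2.7x the instructions, while instructions per cycle are nearly equal (1.72 vs 1.66, as perf itself reports). That suggests the extra time comes mostly from executing more instructions, not from stalls.&lt;/P&gt;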
&lt;P&gt;Sandeep&lt;/P&gt;</description>
      <pubDate>Sat, 22 May 2021 19:15:08 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-DPC-C-Compiler/Performance-problem-of-MonteCarlo-integration/m-p/1283802#M1173</guid>
      <dc:creator>SandeepKoranne</dc:creator>
      <dc:date>2021-05-22T19:15:08Z</dc:date>
    </item>
    <item>
      <title>Re: Performance problem of MonteCarlo integration</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-DPC-C-Compiler/Performance-problem-of-MonteCarlo-integration/m-p/1284617#M1222</link>
      <description>&lt;P&gt;Sorry, we don't have an update on this issue yet.&lt;/P&gt;</description>
      <pubDate>Tue, 25 May 2021 20:41:36 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-DPC-C-Compiler/Performance-problem-of-MonteCarlo-integration/m-p/1284617#M1222</guid>
      <dc:creator>Viet_H_Intel</dc:creator>
      <dc:date>2021-05-25T20:41:36Z</dc:date>
    </item>
    <item>
      <title>Re: Performance problem of MonteCarlo integration</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-DPC-C-Compiler/Performance-problem-of-MonteCarlo-integration/m-p/1392118#M2272</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;This issue has been addressed. With the next update, icpx with -fiopenmp is much faster.&lt;/P&gt;&lt;P&gt;Thanks,&lt;/P&gt;</description>
      <pubDate>Mon, 13 Jun 2022 17:08:29 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-DPC-C-Compiler/Performance-problem-of-MonteCarlo-integration/m-p/1392118#M2272</guid>
      <dc:creator>Viet_H_Intel</dc:creator>
      <dc:date>2022-06-13T17:08:29Z</dc:date>
    </item>
    <item>
      <title>Re: Performance problem of MonteCarlo integration</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-DPC-C-Compiler/Performance-problem-of-MonteCarlo-integration/m-p/1419207#M2569</link>
      <description>&lt;P&gt;Please upgrade to oneAPI 2022.3, which addresses this issue.&lt;/P&gt;&lt;P&gt;I am going to close this thread.&lt;/P&gt;&lt;P&gt;Thanks,&lt;/P&gt;</description>
      <pubDate>Tue, 04 Oct 2022 00:11:05 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-DPC-C-Compiler/Performance-problem-of-MonteCarlo-integration/m-p/1419207#M2569</guid>
      <dc:creator>Viet_H_Intel</dc:creator>
      <dc:date>2022-10-04T00:11:05Z</dc:date>
    </item>
  </channel>
</rss>

