<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: DO CONCURRENT with IFX 2023.0.0 20221201 uses OMP_NUM_THREADS over omp_set_num_threads() in Intel® Fortran Compiler</title>
    <link>https://community.intel.com/t5/Intel-Fortran-Compiler/DO-CONCURRENT-with-IFX-2023-0-0-20221201-uses-OMP-NUM-THREADS/m-p/1530813#M168486</link>
    <description>&lt;P&gt;It would be nice to get confirmation that the current behavior with IFX + &lt;SPAN&gt;OMP_NUM_THREADS + DO-CONCURRENT&amp;nbsp;&lt;/SPAN&gt;is &lt;STRONG&gt;not&lt;/STRONG&gt; the intended behavior, and that we can expect future versions of IFX to eventually move toward the more consistent behavior as discussed.&lt;/P&gt;&lt;P&gt;That is, calling omp_set_num_threads(nn) from inside the code would take priority over the environment variable OMP_NUM_THREADS, as it already does in all of the other cases outlined above:&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;&lt;SPAN class=""&gt;IFORT accepts omp_set_num_threads(nn) with nn&amp;gt;1 fine for OpenMP loops.&lt;/SPAN&gt;&lt;/LI&gt;&lt;LI&gt;&lt;SPAN class=""&gt;IFORT accepts omp_set_num_threads(nn) with nn&amp;gt;1 fine for DO-CONCURRENT loops.&lt;/SPAN&gt;&lt;/LI&gt;&lt;LI&gt;&lt;SPAN class=""&gt;IFX accepts omp_set_num_threads(nn) with nn&amp;gt;1 fine for OpenMP loops.&lt;/SPAN&gt;&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;&lt;SPAN class=""&gt;Thanks,&lt;BR /&gt;-Gerhard&lt;/SPAN&gt;&lt;/P&gt;</description>
    <pubDate>Thu, 05 Oct 2023 17:56:08 GMT</pubDate>
    <dc:creator>Theurich</dc:creator>
    <dc:date>2023-10-05T17:56:08Z</dc:date>
    <item>
      <title>DO CONCURRENT with IFX 2023.0.0 20221201 uses OMP_NUM_THREADS over omp_set_num_threads()</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/DO-CONCURRENT-with-IFX-2023-0-0-20221201-uses-OMP-NUM-THREADS/m-p/1516265#M167819</link>
      <description>&lt;P&gt;I am experimenting with DO CONCURRENT under IFX 2023.0.0 20221201 for CPU-level threading. I have noticed behavior that seems less intuitive than what I find under IFORT, and inconsistent with OpenMP behavior.&lt;/P&gt;&lt;P&gt;In short, it seems that under IFX the number of threads used by a DO CONCURRENT construct is equal to the setting of the environment variable OMP_NUM_THREADS if it is set, and cannot be overridden by the omp_set_num_threads() API. Instrumenting the same loop with&amp;nbsp;!$omp parallel do yields the expected results (i.e. the value set by omp_set_num_threads() takes priority, even for IFX). Also, the thread count for DO CONCURRENT under IFORT is consistent with what is set by&amp;nbsp;omp_set_num_threads(), but not under IFX.&lt;/P&gt;&lt;P&gt;I can work around the issue by explicitly unsetting the environment variable&amp;nbsp;OMP_NUM_THREADS for IFX; the value set by the&amp;nbsp;omp_set_num_threads() API is then indeed used for the DO CONCURRENT construct.&lt;/P&gt;</description>
      <pubDate>Mon, 21 Aug 2023 22:51:15 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/DO-CONCURRENT-with-IFX-2023-0-0-20221201-uses-OMP-NUM-THREADS/m-p/1516265#M167819</guid>
      <dc:creator>Theurich</dc:creator>
      <dc:date>2023-08-21T22:51:15Z</dc:date>
    </item>
    <item>
      <title>Re: DO CONCURRENT with IFX 2023.0.0 20221201 uses OMP_NUM_THREADS over omp_set_num_threads()</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/DO-CONCURRENT-with-IFX-2023-0-0-20221201-uses-OMP-NUM-THREADS/m-p/1516287#M167820</link>
      <description>&lt;P&gt;Could be a bug.&amp;nbsp; I'll test it out and get an answer.&amp;nbsp; I too would expect omp_set_num_threads() to override OMP_NUM_THREADS for the do concurrent.&lt;/P&gt;</description>
      <pubDate>Tue, 22 Aug 2023 00:21:39 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/DO-CONCURRENT-with-IFX-2023-0-0-20221201-uses-OMP-NUM-THREADS/m-p/1516287#M167820</guid>
      <dc:creator>Ron_Green</dc:creator>
      <dc:date>2023-08-22T00:21:39Z</dc:date>
    </item>
    <item>
      <title>Re: DO CONCURRENT with IFX 2023.0.0 20221201 uses OMP_NUM_THREADS over omp_set_num_threads()</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/DO-CONCURRENT-with-IFX-2023-0-0-20221201-uses-OMP-NUM-THREADS/m-p/1516854#M167856</link>
      <description>&lt;P&gt;Hi Ron, thank you for looking into this issue. Were you able to reproduce the problem on your end?&lt;BR /&gt;The latest IFX version I have available is 23.0.0.&amp;nbsp;Maybe 23.2.0 has this resolved? Thanks.&lt;/P&gt;</description>
      <pubDate>Wed, 23 Aug 2023 15:19:42 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/DO-CONCURRENT-with-IFX-2023-0-0-20221201-uses-OMP-NUM-THREADS/m-p/1516854#M167856</guid>
      <dc:creator>Theurich</dc:creator>
      <dc:date>2023-08-23T15:19:42Z</dc:date>
    </item>
    <item>
      <title>Re: DO CONCURRENT with IFX 2023.0.0 20221201 uses OMP_NUM_THREADS over omp_set_num_threads()</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/DO-CONCURRENT-with-IFX-2023-0-0-20221201-uses-OMP-NUM-THREADS/m-p/1517434#M167874</link>
      <description>&lt;P&gt;This is much more complex than I thought. In our Front End, we mark the loop as a parallel loop, along with other information related to the data used inside the loop, and pass that to our optimizer and parallelization passes. This is where things get interesting. If the DO CONCURRENT is inside an outer parallel do region, the parallel optimization phase has some choices. Like you, I thought it would just thread it with OMP PARALLEL DO. But another choice, if it is inside an outer loop, is to see whether the loops can be collapsed. Then the preference is to vectorize the DO CONCURRENT and not thread it.&lt;/P&gt;
&lt;P&gt;This matches the general strategy of "parallelize the outermost loop, vectorize the innermost loop".&amp;nbsp;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;So OMP threading may not be done at all!&amp;nbsp; It may just reduce the DO CONCURRENT to a normal vectorized loop.&lt;/P&gt;
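&lt;P&gt;As a minimal sketch of the "parallelize the outermost loop, vectorize the innermost loop" strategy, here it is written out by hand with plain OpenMP directives. This is an illustration only, not what the compiler actually generates for DO CONCURRENT, and the program name, array names, and problem size are made up for the example.&lt;/P&gt;
&lt;LI-CODE lang="fortran"&gt;program outer_parallel_inner_simd

  implicit none

  ! Hypothetical problem size, chosen only for illustration.
  integer, parameter :: n = 512
  integer            :: i, j
  real, allocatable  :: a(:,:), b(:,:)

  allocate(a(n,n), b(n,n))
  call random_number(b)

  ! Thread the outer (column) loop across the OpenMP team ...
  !$omp parallel do private(i)
  do j = 1, n
    ! ... and request vectorization of the inner (row) loop.
    !$omp simd
    do i = 1, n
      a(i,j) = b(i,j) * b(i,j)
    end do
  end do
  !$omp end parallel do

  ! Compare against the same computation done with array syntax.
  print *, "max abs difference: ", maxval(abs(a - b*b))

end program outer_parallel_inner_simd&lt;/LI-CODE&gt;
&lt;P&gt;In a loop nest like this, only the outer loop is affected by the OpenMP thread-count settings; the inner SIMD loop uses no additional threads.&lt;/P&gt;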
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;What evidence do you have that in your case it is running the do concurrent as a threaded omp loop?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Tomorrow I hope to test a case like this under VTune, which should show the threading behavior.&lt;/P&gt;</description>
      <pubDate>Thu, 24 Aug 2023 23:25:48 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/DO-CONCURRENT-with-IFX-2023-0-0-20221201-uses-OMP-NUM-THREADS/m-p/1517434#M167874</guid>
      <dc:creator>Ron_Green</dc:creator>
      <dc:date>2023-08-24T23:25:48Z</dc:date>
    </item>
    <item>
      <title>Re: DO CONCURRENT with IFX 2023.0.0 20221201 uses OMP_NUM_THREADS over omp_set_num_threads()</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/DO-CONCURRENT-with-IFX-2023-0-0-20221201-uses-OMP-NUM-THREADS/m-p/1517439#M167875</link>
      <description>&lt;P&gt;I'll just throw in that the Fortran language does not require DO CONCURRENT to be run in parallel. Rather, it establishes conditions that permit parallelization. Vectorizing is a form of parallelization.&lt;/P&gt;</description>
      <pubDate>Fri, 25 Aug 2023 00:39:54 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/DO-CONCURRENT-with-IFX-2023-0-0-20221201-uses-OMP-NUM-THREADS/m-p/1517439#M167875</guid>
      <dc:creator>Steve_Lionel</dc:creator>
      <dc:date>2023-08-25T00:39:54Z</dc:date>
    </item>
    <item>
      <title>Re: DO CONCURRENT with IFX 2023.0.0 20221201 uses OMP_NUM_THREADS over omp_set_num_threads()</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/DO-CONCURRENT-with-IFX-2023-0-0-20221201-uses-OMP-NUM-THREADS/m-p/1517511#M167876</link>
      <description>&lt;P&gt;Hi Steve, I do realize that the Fortran language standard does not require DO CONCURRENT to be run in parallel. I arrived at my conclusion that OMP_NUM_THREADS takes priority over the value set through the omp_set_num_threads() API as follows:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I have a test program that contains a double do loop over 200 x 100 elements. In this particular case I changed the outer (200 iterations) do loop to "do concurrent". The serial loop alone takes about 8s to execute, long enough to watch it with top using 1s updates.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I had been using this test program for a while, and found it convenient to change the number of threads available for OpenMP or DO CONCURRENT loops via the omp_set_num_threads() API from within the program. Watching with top, I noticed that with IFX it still ran single-threaded, even when calling&amp;nbsp;omp_set_num_threads() with 2, 4, or 8 threads. I was baffled by this, because my experience with IFORT and other Fortran compilers had been that I could change the number of threads this way. I finally noticed that, by default, the environment variable OMP_NUM_THREADS was set to 1 inside the interactive queue where I was running this. I had never bothered unsetting it before, because I was used to the&amp;nbsp;omp_set_num_threads() setting overriding what came from the&amp;nbsp;OMP_NUM_THREADS environment variable. Well, I unset the&amp;nbsp;OMP_NUM_THREADS variable, and voila, I started seeing different numbers of threads running in the DO CONCURRENT loop (using top), according to what I set with&amp;nbsp;omp_set_num_threads(). Also, performance scaled almost perfectly, as expected, with the number of threads.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;So while I agree that the Fortran standard does not require DO CONCURRENT to run in parallel, it seems that the compiler does actually generate threaded code for it here; it is just that&amp;nbsp;OMP_NUM_THREADS=1 in the environment kept it single-threaded, regardless of what I set with&amp;nbsp;omp_set_num_threads() from within the program itself.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;So far I have only tested with 23.0.0, but I now have access to 23.2.1. I will re-test with that version soon to see if anything has changed.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Fri, 25 Aug 2023 04:31:55 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/DO-CONCURRENT-with-IFX-2023-0-0-20221201-uses-OMP-NUM-THREADS/m-p/1517511#M167876</guid>
      <dc:creator>Theurich</dc:creator>
      <dc:date>2023-08-25T04:31:55Z</dc:date>
    </item>
    <item>
      <title>Re: DO CONCURRENT with IFX 2023.0.0 20221201 uses OMP_NUM_THREADS over omp_set_num_threads()</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/DO-CONCURRENT-with-IFX-2023-0-0-20221201-uses-OMP-NUM-THREADS/m-p/1517662#M167879</link>
      <description>&lt;P&gt;&lt;a href="https://community.intel.com/t5/user/viewprofilepage/user-id/309035"&gt;@Theurich&lt;/a&gt;&amp;nbsp;Testing with 23.2.1 is a good test. We froze code for 23.0 around early October 2022. Since then and up to code freeze for 2023.2.x we've put in 571 fixes. In particular, DO CONCURRENT had many changes for functionality, bugs, AND performance leading up to 2023.2.0. This includes the F2023 REDUCE locality specifier, which could be important for you in the future. Also, the locality-spec features, DEFAULT, LOCAL, LOCAL_INIT, SHARED, received a lot of attention in the first half of 2023, after 2023.0 was released. In short, a lot of work on DO CONCURRENT in 2023. Now, will some of this affect your code? Hard to say. But I can say that if I had code with DO CONCURRENT, I would upgrade immediately.&lt;/P&gt;
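&lt;P&gt;For readers who have not used the locality specifiers yet, a minimal Fortran 2018 sketch might look like the following. The program and variable names are illustrative only and are not taken from any code in this thread.&lt;/P&gt;
&lt;LI-CODE lang="fortran"&gt;program locality_sketch

  implicit none

  ! Hypothetical array size, chosen only for illustration.
  integer, parameter :: n = 8
  integer            :: i
  real               :: a(n), b(n), tmp

  b = [(real(i), i = 1, n)]

  ! DEFAULT(NONE) requires every variable used in the loop body to be
  ! given an explicit locality: tmp is a per-iteration scratch variable,
  ! while a and b are shared by all iterations.
  do concurrent (i = 1:n) default(none) local(tmp) shared(a, b)
    tmp  = b(i) * b(i)
    a(i) = tmp + 1.0
  end do

  print *, a

end program locality_sketch&lt;/LI-CODE&gt;
&lt;P&gt;Declaring tmp as LOCAL tells the compiler that each iteration has its own copy, which removes an apparent dependence between iterations without changing the loop body.&lt;/P&gt;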
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Let us know what you find. And if you find an issue, a code example could help us fix anything sub-par.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Fri, 25 Aug 2023 14:27:49 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/DO-CONCURRENT-with-IFX-2023-0-0-20221201-uses-OMP-NUM-THREADS/m-p/1517662#M167879</guid>
      <dc:creator>Ron_Green</dc:creator>
      <dc:date>2023-08-25T14:27:49Z</dc:date>
    </item>
    <item>
      <title>Re: DO CONCURRENT with IFX 2023.0.0 20221201 uses OMP_NUM_THREADS over omp_set_num_threads()</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/DO-CONCURRENT-with-IFX-2023-0-0-20221201-uses-OMP-NUM-THREADS/m-p/1517741#M167881</link>
      <description>&lt;P&gt;&amp;gt;&amp;gt;&lt;SPAN&gt;&amp;nbsp;&lt;EM&gt;OMP_NUM_THREADS was set to 1&lt;/EM&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;You did not mention that in your original post. Setting the&amp;nbsp;&lt;SPAN&gt;OMP_NUM_THREADS environment variable sets the OMP_MAX_THREADS value. Thus, it places an upper limit on the omp_set_num_threads(nn) value. Note that the upper value for&amp;nbsp;omp_set_num_threads(nn) may also depend on whether it is executed within a parallel region and on whether nested parallelism is enabled.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;Jim Dempsey&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Fri, 25 Aug 2023 16:40:53 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/DO-CONCURRENT-with-IFX-2023-0-0-20221201-uses-OMP-NUM-THREADS/m-p/1517741#M167881</guid>
      <dc:creator>jimdempseyatthecove</dc:creator>
      <dc:date>2023-08-25T16:40:53Z</dc:date>
    </item>
    <item>
      <title>Re: DO CONCURRENT with IFX 2023.0.0 20221201 uses OMP_NUM_THREADS over omp_set_num_threads()</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/DO-CONCURRENT-with-IFX-2023-0-0-20221201-uses-OMP-NUM-THREADS/m-p/1517746#M167882</link>
      <description>&lt;P&gt;However, even with&amp;nbsp;&lt;SPAN&gt;OMP_NUM_THREADS=1 in the environment:&lt;/SPAN&gt;&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;&lt;SPAN&gt;IFORT accepts omp_set_num_threads(nn) with nn&amp;gt;1 fine for OpenMP loops.&lt;/SPAN&gt;&lt;/LI&gt;&lt;LI&gt;&lt;SPAN&gt;IFORT accepts omp_set_num_threads(nn) with nn&amp;gt;1 fine for DO-CONCURRENT loops.&lt;/SPAN&gt;&lt;/LI&gt;&lt;LI&gt;&lt;SPAN&gt;IFX accepts omp_set_num_threads(nn) with nn&amp;gt;1 fine for OpenMP loops.&lt;/SPAN&gt;&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;&lt;SPAN&gt;Only IFX with DO-CONCURRENT loops behaves differently. Again, this is with 2023.0.0, and I am planning to test with 2023.2.1 as soon as I can, but I will have to wait until early next week for it.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;But are you saying that the three cases listed above are actually violating the&amp;nbsp;&lt;SPAN&gt;OMP_MAX_THREADS value set implicitly when&amp;nbsp;OMP_NUM_THREADS is found in the environment?&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Fri, 25 Aug 2023 16:47:55 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/DO-CONCURRENT-with-IFX-2023-0-0-20221201-uses-OMP-NUM-THREADS/m-p/1517746#M167882</guid>
      <dc:creator>Theurich</dc:creator>
      <dc:date>2023-08-25T16:47:55Z</dc:date>
    </item>
    <item>
      <title>Re: DO CONCURRENT with IFX 2023.0.0 20221201 uses OMP_NUM_THREADS over omp_set_num_threads()</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/DO-CONCURRENT-with-IFX-2023-0-0-20221201-uses-OMP-NUM-THREADS/m-p/1519864#M168037</link>
      <description>&lt;P&gt;&lt;a href="https://community.intel.com/t5/user/viewprofilepage/user-id/52256"&gt;@Ron_Green&lt;/a&gt;&amp;nbsp;I finally have access to 2023.2.1 (specifically: ifx (IFX) 2023.2.0 20230721), and I tested again with DO CONCURRENT. Still the same behavior: as long as `OMP_NUM_THREADS` is set in the environment, any change made from within the program via `omp_set_num_threads(nn)` is ignored. However, as soon as I unset&amp;nbsp;`OMP_NUM_THREADS`, all works as expected.&lt;/P&gt;&lt;P&gt;It is not a big deal to unset&amp;nbsp;`OMP_NUM_THREADS`, but this is different from the OpenMP behavior, where the API call `omp_set_num_threads(nn)` from within the program takes priority over what comes from the environment.&lt;/P&gt;</description>
      <pubDate>Fri, 01 Sep 2023 23:07:30 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/DO-CONCURRENT-with-IFX-2023-0-0-20221201-uses-OMP-NUM-THREADS/m-p/1519864#M168037</guid>
      <dc:creator>Theurich</dc:creator>
      <dc:date>2023-09-01T23:07:30Z</dc:date>
    </item>
    <item>
      <title>Re: DO CONCURRENT with IFX 2023.0.0 20221201 uses OMP_NUM_THREADS over omp_set_num_threads()</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/DO-CONCURRENT-with-IFX-2023-0-0-20221201-uses-OMP-NUM-THREADS/m-p/1530813#M168486</link>
      <description>&lt;P&gt;It would be nice to get confirmation that the current behavior with IFX + &lt;SPAN&gt;OMP_NUM_THREADS + DO-CONCURRENT&amp;nbsp;&lt;/SPAN&gt;is &lt;STRONG&gt;not&lt;/STRONG&gt; the intended behavior, and that we can expect future versions of IFX to eventually move toward the more consistent behavior as discussed.&lt;/P&gt;&lt;P&gt;That is, calling omp_set_num_threads(nn) from inside the code would take priority over the environment variable OMP_NUM_THREADS, as it already does in all of the other cases outlined above:&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;&lt;SPAN class=""&gt;IFORT accepts omp_set_num_threads(nn) with nn&amp;gt;1 fine for OpenMP loops.&lt;/SPAN&gt;&lt;/LI&gt;&lt;LI&gt;&lt;SPAN class=""&gt;IFORT accepts omp_set_num_threads(nn) with nn&amp;gt;1 fine for DO-CONCURRENT loops.&lt;/SPAN&gt;&lt;/LI&gt;&lt;LI&gt;&lt;SPAN class=""&gt;IFX accepts omp_set_num_threads(nn) with nn&amp;gt;1 fine for OpenMP loops.&lt;/SPAN&gt;&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;&lt;SPAN class=""&gt;Thanks,&lt;BR /&gt;-Gerhard&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 05 Oct 2023 17:56:08 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/DO-CONCURRENT-with-IFX-2023-0-0-20221201-uses-OMP-NUM-THREADS/m-p/1530813#M168486</guid>
      <dc:creator>Theurich</dc:creator>
      <dc:date>2023-10-05T17:56:08Z</dc:date>
    </item>
    <item>
      <title>Re: DO CONCURRENT with IFX 2023.0.0 20221201 uses OMP_NUM_THREADS over omp_set_num_threads()</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/DO-CONCURRENT-with-IFX-2023-0-0-20221201-uses-OMP-NUM-THREADS/m-p/1531172#M168505</link>
      <description>&lt;P&gt;Sorry for the delay in investigating this.&amp;nbsp;&lt;a href="https://community.intel.com/t5/user/viewprofilepage/user-id/309035"&gt;@Theurich&lt;/a&gt;,&amp;nbsp;can you please share your test?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Fri, 06 Oct 2023 14:51:36 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/DO-CONCURRENT-with-IFX-2023-0-0-20221201-uses-OMP-NUM-THREADS/m-p/1531172#M168505</guid>
      <dc:creator>Barbara_P_Intel</dc:creator>
      <dc:date>2023-10-06T14:51:36Z</dc:date>
    </item>
    <item>
      <title>Re: DO CONCURRENT with IFX 2023.0.0 20221201 uses OMP_NUM_THREADS over omp_set_num_threads()</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/DO-CONCURRENT-with-IFX-2023-0-0-20221201-uses-OMP-NUM-THREADS/m-p/1531240#M168507</link>
      <description>&lt;P&gt;&lt;a href="https://community.intel.com/t5/user/viewprofilepage/user-id/44501"&gt;@Barbara_P_Intel&lt;/a&gt;&amp;nbsp;, your comment reminded me of Tom Hanks and Wilson.&amp;nbsp; The package finally arrived.&amp;nbsp;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks for the smile, it is better than Fortraning.&amp;nbsp;&lt;/P&gt;&lt;P&gt;Is Fortraning a real word?&lt;/P&gt;&lt;P&gt;Should it be Fortranning?&lt;/P&gt;&lt;P&gt;or FORTRANing?&lt;/P&gt;</description>
      <pubDate>Fri, 06 Oct 2023 17:55:41 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/DO-CONCURRENT-with-IFX-2023-0-0-20221201-uses-OMP-NUM-THREADS/m-p/1531240#M168507</guid>
      <dc:creator>JohnNichols</dc:creator>
      <dc:date>2023-10-06T17:55:41Z</dc:date>
    </item>
    <item>
      <title>Re: DO CONCURRENT with IFX 2023.0.0 20221201 uses OMP_NUM_THREADS over omp_set_num_threads()</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/DO-CONCURRENT-with-IFX-2023-0-0-20221201-uses-OMP-NUM-THREADS/m-p/1531345#M168513</link>
      <description>&lt;P&gt;Consider the following code:&lt;/P&gt;&lt;LI-CODE lang="fortran"&gt;program demoOmpNumSet

  use omp_lib

  implicit none

  integer, parameter  :: omp_num_threads=16
  integer, parameter  :: size=10000, repeater=100
  integer             :: rep, i, j
  real, allocatable   :: a(:,:), b(:,:)
  double precision    :: t0, ti, t1, t2

  allocate(a(size,size), b(size,size))

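  ! Outside a parallel region omp_get_num_threads() always returns 1;
  ! omp_get_max_threads() is the upper bound on the team size that the
  ! next parallel region (without a num_threads clause) would get.
  print *, "omp_get_num_threads: ", omp_get_num_threads()
  print *, "omp_get_max_threads: ", omp_get_max_threads()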
  print *
  print *, "omp_set_num_threads: ", omp_num_threads
  call omp_set_num_threads(omp_num_threads)
  print *
  print *, "omp_get_num_threads: ", omp_get_num_threads()
  print *, "omp_get_max_threads: ", omp_get_max_threads()

  t0 = omp_get_wtime()

  call random_number(b)

  ti = omp_get_wtime()

  do rep=1, repeater
    !$omp parallel do
    do j=1, size
    do i=1, size
      a(i,j) = b(i,j) * b(i,j)
    enddo
    enddo
    !$omp end parallel do
  enddo

  t1 = omp_get_wtime()

  do rep=1, repeater
    do concurrent (j=1:size)
    do i=1, size
      a(i,j) = b(i,j) * b(i,j)
    enddo
    enddo
  enddo

  t2 = omp_get_wtime()

  print *, "Time to initialize: ", ti-t0
  print *, "Time OpenMP loop:   ", t1-ti
  print *, "Time DO-CONCURRENT: ", t2-t1

end program&lt;/LI-CODE&gt;&lt;P&gt;I vary the&amp;nbsp;omp_num_threads parameter in the different tests, and I also adjust the OMP_NUM_THREADS environment variable or unset it completely. I then observe the time the loop execution takes, and also watch with top how many threads are active on the system executing the code.&lt;/P&gt;
&lt;P&gt;With IFORT (2021.8.0 20221119) both OMP and DO-CONCURRENT behave as expected, i.e. the value set by omp_set_num_threads() within the code determines the number of threads used by either approach, regardless of whether or how the OMP_NUM_THREADS environment variable is set:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;OMP_NUM_THREADS=1, omp_num_threads=1:&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;omp_get_num_threads: 1&lt;BR /&gt;omp_get_max_threads: 1&lt;BR /&gt;&lt;BR /&gt;omp_set_num_threads: 1&lt;BR /&gt;&lt;BR /&gt;omp_get_num_threads: 1&lt;BR /&gt;omp_get_max_threads: 1&lt;BR /&gt;Time to initialize: 0.276350975036621&lt;BR /&gt;Time OpenMP loop: 3.36269402503967&lt;BR /&gt;Time DO-CONCURRENT: 3.35191202163696&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;OMP_NUM_THREADS=1, omp_num_threads=2:&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;omp_get_num_threads: 1&lt;BR /&gt;omp_get_max_threads: 1&lt;BR /&gt;&lt;BR /&gt;omp_set_num_threads: 2&lt;BR /&gt;&lt;BR /&gt;omp_get_num_threads: 1&lt;BR /&gt;omp_get_max_threads: 2&lt;BR /&gt;Time to initialize: 0.276911020278931&lt;BR /&gt;Time OpenMP loop: 2.78357601165771&lt;BR /&gt;Time DO-CONCURRENT: 2.87240791320801&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;OMP_NUM_THREADS=1, omp_num_threads=4:&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;omp_get_num_threads: 1&lt;BR /&gt;omp_get_max_threads: 1&lt;BR /&gt;&lt;BR /&gt;omp_set_num_threads: 4&lt;BR /&gt;&lt;BR /&gt;omp_get_num_threads: 1&lt;BR /&gt;omp_get_max_threads: 4&lt;BR /&gt;Time to initialize: 0.276546955108643&lt;BR /&gt;Time OpenMP loop: 2.19221901893616&lt;BR /&gt;Time DO-CONCURRENT: 2.16165018081665&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;OMP_NUM_THREADS unset, omp_num_threads=4:&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;omp_get_num_threads: 1&lt;BR /&gt;omp_get_max_threads: 256&lt;BR /&gt;&lt;BR /&gt;omp_set_num_threads: 4&lt;BR /&gt;&lt;BR /&gt;omp_get_num_threads: 1&lt;BR /&gt;omp_get_max_threads: 4&lt;BR /&gt;Time to initialize: 0.313963890075684&lt;BR /&gt;Time OpenMP loop: 2.22668409347534&lt;BR /&gt;Time DO-CONCURRENT: 2.30029201507568&lt;/P&gt;
&lt;P&gt;However, with IFX (2023.0.0 20221201) things change! OMP still works as before, but the number of threads used by DO-CONCURRENT seems to be fixed by the environment. If the OMP_NUM_THREADS environment variable is set, it takes the value from there, and if unset, it defaults to 256 on the compute nodes I am working on.&lt;/P&gt;
&lt;P&gt;Notice also that the initialization time goes up, but I am not concerned about that here.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;OMP_NUM_THREADS=1, omp_num_threads=1:&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;omp_get_num_threads: 1&lt;BR /&gt;omp_get_max_threads: 1&lt;BR /&gt;&lt;BR /&gt;omp_set_num_threads: 1&lt;BR /&gt;&lt;BR /&gt;omp_get_num_threads: 1&lt;BR /&gt;omp_get_max_threads: 1&lt;BR /&gt;Time to initialize: 1.28865694999695&lt;BR /&gt;Time OpenMP loop: 3.31528496742249&lt;BR /&gt;Time DO-CONCURRENT: 3.30231809616089&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;OMP_NUM_THREADS=1, omp_num_threads=2:&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;omp_get_num_threads: 1&lt;BR /&gt;omp_get_max_threads: 1&lt;BR /&gt;&lt;BR /&gt;omp_set_num_threads: 2&lt;BR /&gt;&lt;BR /&gt;omp_get_num_threads: 1&lt;BR /&gt;omp_get_max_threads: 2&lt;BR /&gt;Time to initialize: 1.28990292549133&lt;BR /&gt;Time OpenMP loop: 2.97741007804871&lt;BR /&gt;Time DO-CONCURRENT: 3.40985584259033&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;OMP_NUM_THREADS=1, omp_num_threads=4:&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;omp_get_num_threads: 1&lt;BR /&gt;omp_get_max_threads: 1&lt;BR /&gt;&lt;BR /&gt;omp_set_num_threads: 4&lt;BR /&gt;&lt;BR /&gt;omp_get_num_threads: 1&lt;BR /&gt;omp_get_max_threads: 4&lt;BR /&gt;Time to initialize: 1.28875398635864&lt;BR /&gt;Time OpenMP loop: 1.80025887489319&lt;BR /&gt;Time DO-CONCURRENT: 3.32655405998230&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;OMP_NUM_THREADS unset, omp_num_threads=4:&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;omp_get_num_threads: 1&lt;BR /&gt;omp_get_max_threads: 256&lt;BR /&gt;&lt;BR /&gt;omp_set_num_threads: 4&lt;BR /&gt;&lt;BR /&gt;omp_get_num_threads: 1&lt;BR /&gt;omp_get_max_threads: 4&lt;BR /&gt;Time to initialize: 1.28911781311035&lt;BR /&gt;Time OpenMP loop: 2.22033119201660&lt;BR /&gt;Time DO-CONCURRENT: 1.81347393989563&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Obviously this is a very artificial example, and I am not really concerned about specific performance here. My main concern is to understand how we can set the number of threads available to DO-CONCURRENT from inside the executable. With IFORT this worked fine based on omp_set_num_threads(), but with IFX this mechanism no longer seems to work. Thanks.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Fri, 06 Oct 2023 23:17:51 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/DO-CONCURRENT-with-IFX-2023-0-0-20221201-uses-OMP-NUM-THREADS/m-p/1531345#M168513</guid>
      <dc:creator>Theurich</dc:creator>
      <dc:date>2023-10-06T23:17:51Z</dc:date>
    </item>
    <item>
      <title>Re: DO CONCURRENT with IFX 2023.0.0 20221201 uses OMP_NUM_THREADS over omp_set_num_threads()</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/DO-CONCURRENT-with-IFX-2023-0-0-20221201-uses-OMP-NUM-THREADS/m-p/1531890#M168547</link>
      <description>&lt;P&gt;Thank you for sharing your reproducer. I started looking at a matmul example and see a similar performance difference with ifx between OMP and DO CONCURRENT.&lt;/P&gt;
&lt;P&gt;The Fortran developers like to see multiple reproducers to test their fixes.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 10 Oct 2023 23:50:18 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/DO-CONCURRENT-with-IFX-2023-0-0-20221201-uses-OMP-NUM-THREADS/m-p/1531890#M168547</guid>
      <dc:creator>Barbara_P_Intel</dc:creator>
      <dc:date>2023-10-10T23:50:18Z</dc:date>
    </item>
    <item>
      <title>Re:DO CONCURRENT with IFX 2023.0.0 20221201 uses OMP_NUM_THREADS over omp_set_num_threads()</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/DO-CONCURRENT-with-IFX-2023-0-0-20221201-uses-OMP-NUM-THREADS/m-p/1532459#M168578</link>
      <description>&lt;P&gt;I filed a bug report for this issue, CMPLRLLVM-52450. Will keep you posted on its progress to a fix.&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;&lt;/P&gt;&lt;BR /&gt;</description>
      <pubDate>Tue, 10 Oct 2023 23:25:51 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/DO-CONCURRENT-with-IFX-2023-0-0-20221201-uses-OMP-NUM-THREADS/m-p/1532459#M168578</guid>
      <dc:creator>Barbara_P_Intel</dc:creator>
      <dc:date>2023-10-10T23:25:51Z</dc:date>
    </item>
    <item>
      <title>Re: Re:DO CONCURRENT with IFX 2023.0.0 20221201 uses OMP_NUM_THREADS over omp_set_num_threads()</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/DO-CONCURRENT-with-IFX-2023-0-0-20221201-uses-OMP-NUM-THREADS/m-p/1532689#M168584</link>
      <description>&lt;P&gt;Awesome, and thanks for letting me know. I am looking forward to seeing how this progresses. Thanks!&lt;/P&gt;&lt;P&gt;-Gerhard&lt;/P&gt;</description>
      <pubDate>Wed, 11 Oct 2023 15:12:41 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/DO-CONCURRENT-with-IFX-2023-0-0-20221201-uses-OMP-NUM-THREADS/m-p/1532689#M168584</guid>
      <dc:creator>Theurich</dc:creator>
      <dc:date>2023-10-11T15:12:41Z</dc:date>
    </item>
    <item>
      <title>Re:DO CONCURRENT with IFX 2023.0.0 20221201 uses OMP_NUM_THREADS over omp_set_num_threads()</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/DO-CONCURRENT-with-IFX-2023-0-0-20221201-uses-OMP-NUM-THREADS/m-p/1535885#M168759</link>
      <description>&lt;P&gt;I learned something new about DO CONCURRENT and -qopenmp. For CPU targets, DO CONCURRENT is translated to OMP SIMD directives, so &lt;SPAN style="font-size: 14px;"&gt;omp_set_num_threads() has no impact!&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN style="font-size: 14px;"&gt;The parallel optimization team is working to implement DO CONCURRENT with OMP PARALLEL DO in a future release.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;I'll keep you posted on the progress.&lt;/P&gt;</description>
      <pubDate>Fri, 20 Oct 2023 15:31:46 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/DO-CONCURRENT-with-IFX-2023-0-0-20221201-uses-OMP-NUM-THREADS/m-p/1535885#M168759</guid>
      <dc:creator>Barbara_P_Intel</dc:creator>
      <dc:date>2023-10-20T15:31:46Z</dc:date>
    </item>
  </channel>
</rss>

