<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Multithreading Big loop containing several loops inside in Intel® Moderncode for Parallel Architectures</title>
    <link>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/Multithreading-Big-loop-containing-several-loops-inside/m-p/871262#M2937</link>
    <description>&lt;DIV style="margin:0px;"&gt;&lt;/DIV&gt;
&lt;P&gt;&lt;BR /&gt;email me at&lt;/P&gt;
&lt;P&gt; j i m _ d e m p s e y @ a m e r i t e c h . n e t&lt;/P&gt;
&lt;P&gt;(remove the spaces)&lt;/P&gt;
&lt;P&gt;Jim Dempsey&lt;/P&gt;</description>
    <pubDate>Wed, 08 Oct 2008 23:46:04 GMT</pubDate>
    <dc:creator>jimdempseyatthecove</dc:creator>
    <dc:date>2008-10-08T23:46:04Z</dc:date>
    <item>
      <title>Multithreading Big loop containing several loops inside</title>
      <link>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/Multithreading-Big-loop-containing-several-loops-inside/m-p/871257#M2932</link>
      <description>&lt;P&gt;Hi all,&lt;/P&gt;
&lt;P&gt;Below is an outline of my program. It has one big loop, and inside it there are several groups of loops that have to be executed in a certain order (I added remarks to show the required order). Every group of loops has to be executed fully, i.e. all of its variables must be updated, before moving on to the next group. Please let me know the best approach and directives to use so that the code is parallelized only in the sequence shown. Thanks,&lt;/P&gt;
&lt;P&gt;do kk=1,temp ! the big do loop&lt;/P&gt;
&lt;P&gt;do i=nx1+1,nx2-2&lt;BR /&gt;do j=ny1+2,ny2-2&lt;BR /&gt;stuff1&lt;BR /&gt;enddo&lt;BR /&gt;enddo&lt;/P&gt;
&lt;P&gt;do i=nx1+2,nx2-2&lt;BR /&gt;do j=ny1+1,ny2-2&lt;BR /&gt;stuff2&lt;BR /&gt;enddo&lt;BR /&gt;enddo&lt;/P&gt;
&lt;P&gt;! STUFF1 and STUFF2 can be done in parallel, but both have to be completed before the code below&lt;/P&gt;
&lt;P&gt;es0=0.&lt;/P&gt;
&lt;P&gt;do i=nx1,nx2&lt;BR /&gt;do j=ny1,ny2-1&lt;BR /&gt;do k=nz1,nz2-1&lt;BR /&gt;stuff3&lt;BR /&gt;enddo&lt;BR /&gt;enddo&lt;BR /&gt;enddo&lt;/P&gt;
&lt;P&gt;do i=nx1,nx2-1&lt;BR /&gt;do j=ny1,ny2&lt;BR /&gt;do k=nz1,nz2-1&lt;BR /&gt;stuff4&lt;BR /&gt;enddo&lt;BR /&gt;enddo&lt;BR /&gt;enddo&lt;/P&gt;
&lt;P&gt;do i=nx1,nx2-1&lt;BR /&gt;do j=ny1,ny2-1&lt;BR /&gt;do k=nz1,nz2&lt;BR /&gt;stuff5&lt;BR /&gt;enddo&lt;BR /&gt;enddo&lt;BR /&gt;enddo&lt;/P&gt;
&lt;P&gt;! STUFF3, STUFF4 and STUFF5 can be done in parallel, but all have to be completed before the code below&lt;/P&gt;
&lt;P&gt;call function1&lt;BR /&gt;call function2&lt;/P&gt;
&lt;P&gt;c&lt;BR /&gt;c update the E_field&lt;BR /&gt;c&lt;BR /&gt;c Main&lt;BR /&gt;c&lt;/P&gt;
&lt;P&gt;do j=ny1+1,ny2-1&lt;BR /&gt;do i=nx1,nx2-1&lt;BR /&gt;do k=nz1+1,nz2-1&lt;BR /&gt;STUFF6&lt;BR /&gt;enddo&lt;BR /&gt;enddo&lt;BR /&gt;enddo&lt;/P&gt;
&lt;P&gt;do j=ny1,ny2-1&lt;BR /&gt;do i=nx1+1,nx2-1&lt;BR /&gt;do k=nz1+1,nz2-1&lt;BR /&gt;STUFF7&lt;BR /&gt;enddo&lt;BR /&gt;enddo&lt;BR /&gt;enddo&lt;/P&gt;
&lt;P&gt;do j=ny1+1,ny2-1&lt;BR /&gt;do i=nx1+1,nx2-1&lt;BR /&gt;do k=nz1,nz2-1&lt;BR /&gt;STUFF8&lt;BR /&gt;enddo&lt;BR /&gt;enddo&lt;BR /&gt;enddo&lt;/P&gt;
&lt;P&gt;! STUFF6, STUFF7 and STUFF8 can be done in parallel, but all have to be completed before executing the code below&lt;/P&gt;
&lt;P&gt;call function3&lt;BR /&gt;call function4&lt;/P&gt;
&lt;P&gt;enddo ! end of the big do loop&lt;/P&gt;</description>
      <pubDate>Wed, 08 Oct 2008 07:19:50 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/Multithreading-Big-loop-containing-several-loops-inside/m-p/871257#M2932</guid>
      <dc:creator>yafayez</dc:creator>
      <dc:date>2008-10-08T07:19:50Z</dc:date>
    </item>
    <item>
      <title>Re: Multithreading Big loop containing several loops inside</title>
      <link>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/Multithreading-Big-loop-containing-several-loops-inside/m-p/871258#M2933</link>
      <description>&lt;DIV style="margin:0px;"&gt;&lt;/DIV&gt;
&lt;P&gt;&lt;BR /&gt;Is kk used inside your STUFF routines to select independent data sets?&lt;/P&gt;
&lt;P&gt;If so, then can STUFF(...,kk) be executed in random order? (i.e. kk not dependent on kk-1)&lt;/P&gt;
&lt;P&gt;Jim Dempsey&lt;/P&gt;</description>
      <pubDate>Wed, 08 Oct 2008 13:02:26 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/Multithreading-Big-loop-containing-several-loops-inside/m-p/871258#M2933</guid>
      <dc:creator>jimdempseyatthecove</dc:creator>
      <dc:date>2008-10-08T13:02:26Z</dc:date>
    </item>
    <item>
      <title>Re: Multithreading Big loop containing several loops inside</title>
      <link>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/Multithreading-Big-loop-containing-several-loops-inside/m-p/871259#M2934</link>
      <description>&lt;DIV style="margin:0px;"&gt;
&lt;DIV id="quote_reply" style="margin-top: 5px; width: 100%;"&gt;
&lt;DIV style="margin-left:2px;margin-right:2px;"&gt;Quoting - &lt;A href="https://community.intel.com/en-us/profile/99850"&gt;jimdempseyatthecove&lt;/A&gt;&lt;/DIV&gt;
&lt;DIV style="background-color:#E5E5E5; padding:5px;border: 1px; border-style: inset;margin-left:2px;margin-right:2px;"&gt;&lt;EM&gt;
&lt;P&gt;Is kk used inside your STUFF routines to select independent data sets?&lt;/P&gt;
&lt;P&gt;If so, then can STUFF(...,kk) be executed in random order? (i.e. kk not dependent on kk-1)&lt;/P&gt;
&lt;P&gt;Jim Dempsey&lt;/P&gt;
&lt;/EM&gt;&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;P&gt;Thanks for your reply. No, the work at iteration kk cannot be executed in random order, because values computed in iteration kk-1 are passed on to iteration kk and so on, i.e. the iterations are dependent. Please let me know your thoughts on this. Thanks again,&lt;/P&gt;</description>
      <pubDate>Wed, 08 Oct 2008 17:23:13 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/Multithreading-Big-loop-containing-several-loops-inside/m-p/871259#M2934</guid>
      <dc:creator>yafayez</dc:creator>
      <dc:date>2008-10-08T17:23:13Z</dc:date>
    </item>
    <item>
      <title>Re: Multithreading Big loop containing several loops inside</title>
      <link>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/Multithreading-Big-loop-containing-several-loops-inside/m-p/871260#M2935</link>
      <description>&lt;DIV style="margin:0px;"&gt;&lt;/DIV&gt;
&lt;P&gt;You might start with something like the following:&lt;/P&gt;
&lt;P&gt;!$omp parallel&lt;BR /&gt;!$omp do private(i,j)&lt;BR /&gt;do i=nx1+1,nx2-2&lt;BR /&gt;do j=ny1+2,ny2-2&lt;/P&gt;
&lt;P&gt; stuff1&lt;BR /&gt;enddo&lt;BR /&gt;enddo&lt;BR /&gt;!$omp end do nowait&lt;/P&gt;
&lt;P&gt;&lt;/P&gt;
&lt;P&gt;!$omp do private(i,j)&lt;BR /&gt;do i=nx1+2,nx2-2&lt;BR /&gt;do j=ny1+1,ny2-2&lt;/P&gt;
&lt;P&gt; stuff2&lt;BR /&gt;enddo&lt;BR /&gt;enddo&lt;BR /&gt;!$omp end do nowait&lt;BR /&gt;!$omp end parallel&lt;/P&gt;
&lt;P&gt;Note that, depending on what is inside stuff1 and stuff2, you may need to use different temporaries in each loop (or make subroutines out of stuff1 and stuff2 and pass i and j into the routines).&lt;/P&gt;
&lt;P&gt;Also, if the computational cost varies from iteration to iteration, experiment with adding the schedule clause.&lt;/P&gt;
&lt;P&gt;Once you get the above working for stuff1 and stuff2 apply what you learned to the remaining stuff sections.&lt;/P&gt;
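&lt;P&gt;As a rough sketch of how the same pattern might extend to a whole iteration of the big loop: the kk loop stays serial, independent loops within a group use nowait, and the last loop of each group keeps its implicit barrier so the group completes before the next one starts. The arrays a, b and c, the sizes n and nsteps, the variable total and the loop bodies are placeholders invented for illustration, not the original stuff1, stuff2, etc.&lt;/P&gt;
&lt;PRE&gt;
program grouped_loops
  implicit none
  integer, parameter :: n = 64, nsteps = 10   ! hypothetical sizes
  real :: a(n,n), b(n,n), c(n,n), total
  integer :: i, j, kk

  a = 1.0
  b = 2.0
  c = 0.0
  total = 0.0

  do kk = 1, nsteps          ! stays serial: step kk depends on step kk-1
!$omp parallel private(i,j)
     ! group 1: two independent loops, nowait lets early threads move on
!$omp do
     do j = 1, n
        do i = 1, n
           a(i,j) = a(i,j) + 0.5          ! placeholder for stuff1
        end do
     end do
!$omp end do nowait
!$omp do
     do j = 1, n
        do i = 1, n
           b(i,j) = b(i,j) * 0.5          ! placeholder for stuff2
        end do
     end do
!$omp end do
     ! implicit barrier above: group 1 is complete before group 2 starts
!$omp do
     do j = 1, n
        do i = 1, n
           c(i,j) = a(i,j) + b(i,j)       ! placeholder for the next group
        end do
     end do
!$omp end do
!$omp end parallel
     ! serial work (e.g. the calls to function1 and function2) goes here
     total = total + sum(c)
  end do

  print *, 'total =', total
end program grouped_loops
&lt;/PRE&gt;
&lt;P&gt;This is free-form source; with gfortran it would be built with the -fopenmp option. Whether one parallel region per kk iteration is cheaper than separate parallel do constructs depends on the loop sizes, so it is worth timing both.&lt;/P&gt;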
&lt;P&gt;Jim Dempsey&lt;/P&gt;
&lt;P&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 08 Oct 2008 19:25:32 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/Multithreading-Big-loop-containing-several-loops-inside/m-p/871260#M2935</guid>
      <dc:creator>jimdempseyatthecove</dc:creator>
      <dc:date>2008-10-08T19:25:32Z</dc:date>
    </item>
    <item>
      <title>Re: Multithreading Big loop containing several loops inside</title>
      <link>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/Multithreading-Big-loop-containing-several-loops-inside/m-p/871261#M2936</link>
      <description>&lt;DIV style="margin:0px;"&gt;
&lt;DIV id="quote_reply" style="margin-top: 5px; width: 100%;"&gt;
&lt;DIV style="margin-left:2px;margin-right:2px;"&gt;Quoting - &lt;A href="https://community.intel.com/en-us/profile/99850"&gt;jimdempseyatthecove&lt;/A&gt;&lt;/DIV&gt;
&lt;DIV style="background-color:#E5E5E5; padding:5px;border: 1px; border-style: inset;margin-left:2px;margin-right:2px;"&gt;&lt;EM&gt;
&lt;P&gt;You might start with something like the following:&lt;/P&gt;
&lt;P&gt;!$omp parallel&lt;BR /&gt;!$omp do private(i,j)&lt;BR /&gt;do i=nx1+1,nx2-2&lt;BR /&gt;do j=ny1+2,ny2-2&lt;/P&gt;
&lt;P&gt; stuff1&lt;BR /&gt;enddo&lt;BR /&gt;enddo&lt;BR /&gt;!$omp end do nowait&lt;/P&gt;
&lt;P&gt;!$omp do private(i,j)&lt;BR /&gt;do i=nx1+2,nx2-2&lt;BR /&gt;do j=ny1+1,ny2-2&lt;/P&gt;
&lt;P&gt; stuff2&lt;BR /&gt;enddo&lt;BR /&gt;enddo&lt;BR /&gt;!$omp end do nowait&lt;BR /&gt;!$omp end parallel&lt;/P&gt;
&lt;P&gt;Note that, depending on what is inside stuff1 and stuff2, you may need to use different temporaries in each loop (or make subroutines out of stuff1 and stuff2 and pass i and j into the routines).&lt;/P&gt;
&lt;P&gt;Also, if the computational cost varies from iteration to iteration, experiment with adding the schedule clause.&lt;/P&gt;
&lt;P&gt;Once you get the above working for stuff1 and stuff2 apply what you learned to the remaining stuff sections.&lt;/P&gt;
&lt;P&gt;Jim Dempsey&lt;/P&gt;
&lt;/EM&gt;&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;P&gt;Hi,&lt;/P&gt;
&lt;P&gt;I added the directives and I can see that all processors are working, but the program is much slower. How can I detect the bottleneck? Is there a phone number I can call you at to discuss it further? Thanks again,&lt;/P&gt;
&lt;P&gt;Yasser&lt;/P&gt;</description>
      <pubDate>Wed, 08 Oct 2008 20:26:41 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/Multithreading-Big-loop-containing-several-loops-inside/m-p/871261#M2936</guid>
      <dc:creator>yafayez</dc:creator>
      <dc:date>2008-10-08T20:26:41Z</dc:date>
    </item>
    <item>
      <title>Re: Multithreading Big loop containing several loops inside</title>
      <link>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/Multithreading-Big-loop-containing-several-loops-inside/m-p/871262#M2937</link>
      <description>&lt;DIV style="margin:0px;"&gt;&lt;/DIV&gt;
&lt;P&gt;&lt;BR /&gt;email me at&lt;/P&gt;
&lt;P&gt; j i m _ d e m p s e y @ a m e r i t e c h . n e t&lt;/P&gt;
&lt;P&gt;(remove the spaces)&lt;/P&gt;
&lt;P&gt;Jim Dempsey&lt;/P&gt;</description>
      <pubDate>Wed, 08 Oct 2008 23:46:04 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/Multithreading-Big-loop-containing-several-loops-inside/m-p/871262#M2937</guid>
      <dc:creator>jimdempseyatthecove</dc:creator>
      <dc:date>2008-10-08T23:46:04Z</dc:date>
    </item>
    <item>
      <title>Re: Multithreading Big loop containing several loops inside</title>
      <link>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/Multithreading-Big-loop-containing-several-loops-inside/m-p/871263#M2938</link>
      <description>&lt;DIV style="margin:0px;"&gt;&lt;/DIV&gt;
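&lt;P&gt;&lt;BR /&gt;&amp;gt;&amp;gt;I added the directives and I can see that all processors are working, but the program is much slower. How can I detect the bottleneck?&lt;/P&gt;
&lt;P&gt;This is symptomatic of the inner OpenMP do loop running serially. Place the "!$OMP DO ..." (or "C$OMP DO ...") directive at the left margin. In fixed-form source the directive sentinel must start in column 1; an indented "!$OMP DO" line is treated as an ordinary comment, and every thread in the parallel region then executes the whole loop.&lt;/P&gt;
&lt;P&gt;If this does not improve matters, then try:&lt;/P&gt;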
&lt;P&gt;!$omp parallel do private(i,j)&lt;BR /&gt;do i=nx1+1,nx2-2&lt;BR /&gt;do j=ny1+2,ny2-2&lt;/P&gt;
&lt;P&gt; stuff1&lt;BR /&gt;enddo&lt;BR /&gt;enddo&lt;BR /&gt;!$omp end parallel do&lt;/P&gt;
&lt;P&gt;&lt;/P&gt;
&lt;P&gt;!$omp parallel do private(i,j)&lt;BR /&gt;do i=nx1+2,nx2-2&lt;BR /&gt;do j=ny1+1,ny2-2&lt;/P&gt;
&lt;P&gt; stuff2&lt;BR /&gt;enddo&lt;BR /&gt;enddo&lt;BR /&gt;!$omp end parallel do&lt;/P&gt;
&lt;P&gt;Note that the above is not inside an !$OMP PARALLEL region.&lt;/P&gt;
&lt;P&gt;The purpose of coding it the first way was to let the threads that finish their share of the STUFF1 loop early begin work on the STUFF2 loop before the remaining threads have finished STUFF1.&lt;/P&gt;
&lt;P&gt;If this does not improve performance either, then the code in STUFF1 and STUFF2 probably consists of memory copy statements rather than computational statements, in which case the loops are limited by memory bandwidth and extra threads help little.&lt;/P&gt;
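&lt;P&gt;One way to check this, as a minimal sketch, is to time each loop separately with omp_get_wtime and compare runs with different thread counts. The arrays a and b, the size n and the two loop bodies below are made up for illustration; substitute the real STUFF1 and STUFF2 loops.&lt;/P&gt;
&lt;PRE&gt;
program time_loops
  use omp_lib                        ! provides omp_get_wtime
  implicit none
  integer, parameter :: n = 2000     ! hypothetical problem size
  real, allocatable :: a(:,:), b(:,:)
  double precision :: t0, t1, t2
  integer :: i, j

  allocate(a(n,n), b(n,n))
  b = 1.0

  t0 = omp_get_wtime()
!$omp parallel do private(i)
  do j = 1, n
     do i = 1, n
        a(i,j) = b(i,j)              ! pure copy, little arithmetic
     end do
  end do
!$omp end parallel do
  t1 = omp_get_wtime()
!$omp parallel do private(i)
  do j = 1, n
     do i = 1, n
        a(i,j) = sqrt(b(i,j)) + sin(b(i,j))   ! more arithmetic per element
     end do
  end do
!$omp end parallel do
  t2 = omp_get_wtime()

  print *, 'copy loop    (s):', t1 - t0
  print *, 'compute loop (s):', t2 - t1
  print *, 'check value     :', a(1,1)   ! keeps the loops from being optimized away
end program time_loops
&lt;/PRE&gt;
&lt;P&gt;Running it with OMP_NUM_THREADS set to 1 and then to the number of cores shows which loops actually speed up; a loop whose time barely changes is a candidate for being memory bound.&lt;/P&gt;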
&lt;P&gt;Jim Dempsey&lt;/P&gt;</description>
      <pubDate>Thu, 09 Oct 2008 12:23:17 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/Multithreading-Big-loop-containing-several-loops-inside/m-p/871263#M2938</guid>
      <dc:creator>jimdempseyatthecove</dc:creator>
      <dc:date>2008-10-09T12:23:17Z</dc:date>
    </item>
  </channel>
</rss>

