<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: openMP &amp; deferred shaped array in Intel® Moderncode for Parallel Architectures</title>
    <link>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/openMP-deferred-shaped-array/m-p/906290#M4440</link>
    <description>&lt;DIV style="margin:0px;"&gt;&lt;/DIV&gt;
&lt;BR /&gt;Why are you making the array private?&lt;BR /&gt;Usually your parallelization writes stripes of the output array.&lt;BR /&gt;&lt;BR /&gt;Jim&lt;BR /&gt;</description>
    <pubDate>Wed, 21 Oct 2009 17:50:56 GMT</pubDate>
    <dc:creator>jimdempseyatthecove</dc:creator>
    <dc:date>2009-10-21T17:50:56Z</dc:date>
    <item>
      <title>openMP &amp; deferred shaped array</title>
      <link>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/openMP-deferred-shaped-array/m-p/906289#M4439</link>
      <description>Hi,&lt;BR /&gt;&lt;BR /&gt;I am learning OpenMP and want to apply it to our existing software.&lt;BR /&gt;I have some deferred-shape arrays that are defined before the parallel region and will be used after the parallel region.&lt;BR /&gt;These arrays are also modified in the parallel portion of the code. However, I found that a deferred-shape array is not permitted in an OpenMP firstprivate, lastprivate or reduction clause. I really do not want to change the deferred-shape arrays. What's your advice? Thanks.&lt;BR /&gt;</description>
      <pubDate>Wed, 21 Oct 2009 16:41:47 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/openMP-deferred-shaped-array/m-p/906289#M4439</guid>
      <dc:creator>maria</dc:creator>
      <dc:date>2009-10-21T16:41:47Z</dc:date>
    </item>
    <item>
      <title>Re: openMP &amp; deferred shaped array</title>
      <link>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/openMP-deferred-shaped-array/m-p/906290#M4440</link>
      <description>&lt;DIV style="margin:0px;"&gt;&lt;/DIV&gt;
&lt;BR /&gt;Why are you making the array private?&lt;BR /&gt;Usually your parallelization writes stripes of the output array.&lt;BR /&gt;&lt;BR /&gt;Jim&lt;BR /&gt;</description>
      <pubDate>Wed, 21 Oct 2009 17:50:56 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/openMP-deferred-shaped-array/m-p/906290#M4440</guid>
      <dc:creator>jimdempseyatthecove</dc:creator>
      <dc:date>2009-10-21T17:50:56Z</dc:date>
    </item>
    <item>
      <title>Re: openMP &amp; deferred shaped array</title>
      <link>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/openMP-deferred-shaped-array/m-p/906291#M4441</link>
      <description>&lt;DIV style="margin:0px;"&gt;
&lt;DIV id="quote_reply" style="margin-top: 5px; width: 100%;"&gt;
&lt;DIV style="margin-left:2px;margin-right:2px;"&gt;Quoting - &lt;A href="https://community.intel.com/en-us/profile/99850"&gt;jimdempseyatthecove&lt;/A&gt;&lt;/DIV&gt;
&lt;DIV style="background-color:#E5E5E5; padding:5px;border: 1px; border-style: inset;margin-left:2px;margin-right:2px;"&gt;&lt;EM&gt;&lt;BR /&gt;Why are you making the array private?&lt;BR /&gt;Usually your parallelization writes stripes of the output array.&lt;BR /&gt;&lt;BR /&gt;Jim&lt;BR /&gt;&lt;/EM&gt;&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;BR /&gt;Jim,&lt;BR /&gt;&lt;BR /&gt;My code has about 8 nested do loops. The arrays are initialized in the outermost loop, updated in the innermost loop, and then used again in the outermost loop. However, I can only put the parallel region at the 4th inner do loop, since the outer loop also reads scratch files from the hard disk, which cannot be parallelized.&lt;BR /&gt;&lt;BR /&gt;The structure of my loop can be described by the following simple example. I can only parallelize the inner loop, and array A is the deferred-shape array. What should I do?&lt;BR /&gt;&lt;BR /&gt;do I = 1,N&lt;BR /&gt; read some data from hard disk and assign to arrays B &amp;amp; C&lt;BR /&gt; calculate ii,jj&lt;BR /&gt; A(ii,jj) = B(ii,jj) + C(ii,jj)&lt;BR /&gt; do k = 1,M&lt;BR /&gt;  calculate array Z&lt;BR /&gt;  A(ii,jj) = A(ii,jj) + Z(ii,jj,k)&lt;BR /&gt; enddo&lt;BR /&gt; write array A and some other array data to hard disk for later use&lt;BR /&gt;enddo&lt;BR /&gt;</description>
      <pubDate>Thu, 22 Oct 2009 13:55:34 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/openMP-deferred-shaped-array/m-p/906291#M4441</guid>
      <dc:creator>maria</dc:creator>
      <dc:date>2009-10-22T13:55:34Z</dc:date>
    </item>
    <item>
      <title>Re: openMP &amp; deferred shaped array</title>
      <link>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/openMP-deferred-shaped-array/m-p/906292#M4442</link>
      <description>&lt;DIV style="margin:0px;"&gt;
&lt;DIV id="quote_reply" style="width: 100%; margin-top: 5px;"&gt;
&lt;DIV style="margin-left:2px;margin-right:2px;"&gt;Quoting - &lt;A href="https://community.intel.com/en-us/profile/321057"&gt;maria&lt;/A&gt;&lt;/DIV&gt;
&lt;DIV style="background-color:#E5E5E5; padding:5px;border: 1px; border-style: inset;margin-left:2px;margin-right:2px;"&gt;&lt;EM&gt;
&lt;DIV style="margin:0px;"&gt;&lt;/DIV&gt;
&lt;BR /&gt;Jim,&lt;BR /&gt;&lt;BR /&gt;My code has about 8 nested do loops. The arrays are initialized in the outermost loop, updated in the innermost loop, and then used again in the outermost loop. However, I can only put the parallel region at the 4th inner do loop, since the outer loop also reads scratch files from the hard disk, which cannot be parallelized.&lt;BR /&gt;&lt;BR /&gt;The structure of my loop can be described by the following simple example. I can only parallelize the inner loop, and array A is the deferred-shape array. What should I do?&lt;BR /&gt;&lt;BR /&gt;do I = 1,N&lt;BR /&gt; read some data from hard disk and assign to arrays B &amp;amp; C&lt;BR /&gt; calculate ii,jj&lt;BR /&gt; A(ii,jj) = B(ii,jj) + C(ii,jj)&lt;BR /&gt; do k = 1,M&lt;BR /&gt;  calculate array Z&lt;BR /&gt;  A(ii,jj) = A(ii,jj) + Z(ii,jj,k)&lt;BR /&gt; enddo&lt;BR /&gt; write array A and some other array data to hard disk for later use&lt;BR /&gt;enddo&lt;BR /&gt;&lt;/EM&gt;&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
This looks unfavorable for data locality under OpenMP.&lt;BR /&gt;</description>
      <pubDate>Thu, 22 Oct 2009 15:14:44 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/openMP-deferred-shaped-array/m-p/906292#M4442</guid>
      <dc:creator>TimP</dc:creator>
      <dc:date>2009-10-22T15:14:44Z</dc:date>
    </item>
    <item>
      <title>Re: openMP &amp; deferred shaped array</title>
      <link>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/openMP-deferred-shaped-array/m-p/906293#M4443</link>
      <description>&lt;DIV style="margin: 0px; height: auto;"&gt;&lt;/DIV&gt;
&lt;BR /&gt;&lt;EM&gt;&amp;gt;&amp;gt;However, I can only do parallel region at 4th inner do loop, since the outer loop also reads the scratch files from the hard disk which cannot be parallelized.&lt;BR /&gt;&lt;/EM&gt;&lt;BR /&gt;This is not necessarily true. What you have here is a candidate for a parallel pipeline. parallel_pipeline is supported in TBB (&lt;A href="http://www.threadingbuildingblocks.org"&gt;www.threadingbuildingblocks.org&lt;/A&gt; and Intel's website somewhere). Also, my product QuickThread (&lt;A href="http://www.quickthreadprogramming.com"&gt;www.quickthreadprogramming.com&lt;/A&gt;) supports parallel_pipeline.&lt;BR /&gt;&lt;BR /&gt;*** HOWEVER ***&lt;BR /&gt;&lt;BR /&gt;Prior to investigating conversion away from OpenMP (considerable effort), there are a few tricks you can do to improve parallelization of your code using OpenMP. Try a state-driven parallel section.&lt;BR /&gt;&lt;BR /&gt;I will outline this in incomplete pseudo code (you convert to Fortran, add data structures and tidy up).&lt;BR /&gt;&lt;BR /&gt;! bBegin and bEnd are shared flags polled in spin loops:&lt;BR /&gt;! declare them volatile (or use !$omp flush) so updates become visible&lt;BR /&gt;bBegin = .false.&lt;BR /&gt;bEnd = .false.&lt;BR /&gt;!$omp parallel&lt;BR /&gt;if(omp_get_thread_num() == 0) then&lt;BR /&gt;  ! master thread&lt;BR /&gt;  &amp;lt;initialize shared state variables here&amp;gt;&lt;BR /&gt;  bBegin = .true. ! activate other team members&lt;BR /&gt;  iIn = 1&lt;BR /&gt;  iOut = 1&lt;BR /&gt;  do while(.not. bEnd)&lt;BR /&gt;    if(availableInputBuffer(whichBuffer) .and. (iIn &amp;lt;= N)) then ! N reads in total, to match the N writes below&lt;BR /&gt;      readIntoInputBufferAndMarkAsReady(whichBuffer)&lt;BR /&gt;      iIn = iIn + 1&lt;BR /&gt;    else if(haveOutputBuffer(whichBuffer)) then&lt;BR /&gt;      writeToOutputFile(whichBuffer)&lt;BR /&gt;      iOut = iOut + 1 ! assumes sequential writes&lt;BR /&gt;    else if(haveDataToProcess(whichBuffer)) then&lt;BR /&gt;      processBuffer(whichBuffer) ! also marks buffer as done&lt;BR /&gt;    else&lt;BR /&gt;      if(iOut &amp;gt; N) then&lt;BR /&gt;        bEnd = .true. ! assumes sequential writes&lt;BR /&gt;      else&lt;BR /&gt;        Sleep(0) ! or _mm_pause()&lt;BR /&gt;      endif&lt;BR /&gt;    endif&lt;BR /&gt;  end do&lt;BR /&gt;  ! end of master thread section&lt;BR /&gt;else&lt;BR /&gt;  ! thread not 0 (worker threads)&lt;BR /&gt;  do while(.not. bBegin)&lt;BR /&gt;    Sleep(0) ! or _mm_pause()&lt;BR /&gt;  end do&lt;BR /&gt;  do while(.not. bEnd)&lt;BR /&gt;    if(haveDataToProcess(whichBuffer)) then&lt;BR /&gt;      processBuffer(whichBuffer)&lt;BR /&gt;    else&lt;BR /&gt;      Sleep(0) ! or _mm_pause()&lt;BR /&gt;    endif&lt;BR /&gt;  end do&lt;BR /&gt;endif&lt;BR /&gt;!$omp end parallel&lt;BR /&gt;&lt;BR /&gt;The above can be modified such that team member thread 0 does reads (and processing of buffers) and team member thread 1 does writes (and processing of buffers), while all other threads only process buffers.&lt;BR /&gt;&lt;BR /&gt;I assume you will figure out that you will need omp_get_num_threads() sets of buffers, but the buffers are not dedicated to specific threads. Buffers are acquired for processing in a thread-safe manner.&lt;BR /&gt;&lt;BR /&gt;Jim Dempsey</description>
      <pubDate>Thu, 22 Oct 2009 15:26:10 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Moderncode-for-Parallel/openMP-deferred-shaped-array/m-p/906293#M4443</guid>
      <dc:creator>jimdempseyatthecove</dc:creator>
      <dc:date>2009-10-22T15:26:10Z</dc:date>
    </item>
  </channel>
</rss>

