<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Hi Xiaoping, in Intel® Fortran Compiler</title>
    <link>https://community.intel.com/t5/Intel-Fortran-Compiler/Difference-in-allocating-array-in-subroutine-make-a-code-works/m-p/1058242#M116799</link>
    <description>&lt;P&gt;Hi Xiaoping,&lt;/P&gt;

&lt;P&gt;I added:&lt;/P&gt;

&lt;P&gt;ulimit -s unlimited&lt;/P&gt;

&lt;P&gt;and now the code works. Thanks for your suggestion.&lt;/P&gt;

&lt;P&gt;&amp;nbsp;&lt;/P&gt;

&lt;P&gt;Hi Jim,&lt;/P&gt;

&lt;P&gt;I'm using MPI with domain decomposition so each cpu has its own region of interest. So can I still use your subroutine?&lt;/P&gt;

&lt;P&gt;If my "size_needed" is always the same in the code, I will not need to allocate and deallocate, right?&lt;/P&gt;

&lt;P&gt;Also, what the new subroutine does is create the array once, after which it always stays in memory until the code ends, is that so?&lt;/P&gt;

&lt;P&gt;&amp;nbsp;&lt;/P&gt;

&lt;P&gt;I&lt;/P&gt;</description>
    <pubDate>Wed, 28 Jan 2015 04:51:04 GMT</pubDate>
    <dc:creator>Wee_Beng_T_</dc:creator>
    <dc:date>2015-01-28T04:51:04Z</dc:date>
    <item>
      <title>Difference in allocating array in subroutine make a code works or breaks</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/Difference-in-allocating-array-in-subroutine-make-a-code-works/m-p/1058239#M116796</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;

&lt;P&gt;I am running an MPI CFD code on my school's cluster using 100 CPUs.&lt;/P&gt;

&lt;P&gt;Due to my program's structure, this is the maximum number of CPUs I can use. At the same time, due to the number of grid points used in my code, I am reaching the available memory limit.&lt;/P&gt;

&lt;P&gt;The code runs and then hangs at one spot. After debugging, I found that it hangs in one of the subroutines. In this subroutine, I have to update the values of different variables across all CPUs using MPI.&lt;/P&gt;

&lt;P&gt;There are some local arrays which I need to create. If I declare them using:&lt;/P&gt;

&lt;P&gt;&lt;SPAN style="font-size: 1em; line-height: 1.5;"&gt;subroutine mpi_var_...&lt;/SPAN&gt;&lt;/P&gt;

&lt;P&gt;&lt;STRONG&gt;&lt;SPAN style="font-size: 1em; line-height: 1.5;"&gt;real(8) :: var_ksta(row_num*size_x*size_y), ...&lt;/SPAN&gt;&lt;/STRONG&gt;&lt;/P&gt;

&lt;P&gt;&lt;SPAN style="font-size: 1em; line-height: 1.5;"&gt;...&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;

&lt;P&gt;end subroutine&amp;nbsp;&lt;/P&gt;

&lt;P&gt;The code hangs.&lt;/P&gt;

&lt;P&gt;However, if I do this:&lt;/P&gt;

&lt;P style="line-height: 19.5120010375977px;"&gt;&lt;SPAN style="font-size: 1em; line-height: 1.5;"&gt;subroutine mpi_var_...&lt;/SPAN&gt;&lt;/P&gt;

&lt;P style="line-height: 19.5120010375977px;"&gt;&lt;STRONG&gt;real(8), allocatable :: var_ksta(:)&amp;nbsp;...&lt;/STRONG&gt;&lt;/P&gt;

&lt;P style="line-height: 19.5120010375977px;"&gt;&lt;STRONG&gt;allocate (var_ksta(row_num*size_x*size_y))&lt;/STRONG&gt;&lt;/P&gt;

&lt;P style="line-height: 19.5120010375977px;"&gt;...&lt;/P&gt;

&lt;P style="line-height: 19.5120010375977px;"&gt;&lt;STRONG&gt;deallocate (var_ksta, STAT=status(1))&lt;/STRONG&gt;&lt;/P&gt;

&lt;P style="line-height: 19.5120010375977px;"&gt;end subroutine&lt;/P&gt;

&lt;P style="line-height: 19.5120010375977px;"&gt;The code works. So how does memory allocation differ between these two cases?&lt;/P&gt;

&lt;P style="line-height: 19.5120010375977px;"&gt;If I am not constrained by the memory limit, is the 1st subroutine faster than the 2nd one (where allocation / deallocation may slow it down), or are they the same?&lt;/P&gt;

&lt;P style="line-height: 19.5120010375977px;"&gt;Thanks!&lt;/P&gt;</description>
      <pubDate>Fri, 23 Jan 2015 07:32:30 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/Difference-in-allocating-array-in-subroutine-make-a-code-works/m-p/1058239#M116796</guid>
      <dc:creator>Wee_Beng_T_</dc:creator>
      <dc:date>2015-01-23T07:32:30Z</dc:date>
    </item>
    <item>
      <title>In situation 1 the array is</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/Difference-in-allocating-array-in-subroutine-make-a-code-works/m-p/1058240#M116797</link>
      <description>&lt;P&gt;In situation 1 the array is an automatic array which will be allocated on &amp;nbsp;stack by default. If its size is larger than your stack size setting it will cause runtime stack overflow error. The stack size can be check by shell command "ulimit -s". Compiler o&lt;SPAN style="line-height: 19.5120010375977px;"&gt;ption "-heap-array[:size]" can be used to let compiler put automatic arrays larger than a given size on heap.&lt;/SPAN&gt;&lt;/P&gt;

&lt;P&gt;In situation 2 the array will be allocated on heap which is much larger then default stack size.&lt;/P&gt;

&lt;P&gt;Regarding performance the allocate/deallocate calls will introduce some overhead.&lt;/P&gt;
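
&lt;P&gt;As a minimal sketch of the two situations (the program name, routine names, and the array size of 1000 are hypothetical, chosen only for illustration), the following contrasts an automatic array with an allocatable one. With a small size both routines run; with a size larger than the stack limit, only the allocatable version would be expected to survive unless the stack limit is raised or the heap option is used.&lt;/P&gt;

&lt;PRE class="brush:fortran;"&gt;program stack_vs_heap
  implicit none
  ! Hypothetical size, small enough to fit on any default stack
  call with_automatic(1000)
  call with_allocatable(1000)
contains
  subroutine with_automatic(n)
    integer, intent(in) :: n
    real(8) :: work(n)              ! automatic array: placed on the stack by default
    work = 1.0d0
    print *, 'automatic sum   =', sum(work)
  end subroutine with_automatic

  subroutine with_allocatable(n)
    integer, intent(in) :: n
    real(8), allocatable :: work(:)
    integer :: ierr
    allocate(work(n), stat=ierr)    ! heap allocation; a failure is reported in ierr
    if (ierr /= 0) stop 'allocation failed'
    work = 1.0d0
    print *, 'allocatable sum =', sum(work)
    deallocate(work)
  end subroutine with_allocatable
end program stack_vs_heap
&lt;/PRE&gt;

&lt;P&gt;Checking the stat value on allocate gives a clean error path when memory runs out, which an automatic array cannot offer.&lt;/P&gt;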

&lt;P&gt;Thanks,&lt;/P&gt;

&lt;P&gt;Xiaoping&lt;/P&gt;</description>
      <pubDate>Fri, 23 Jan 2015 08:10:24 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/Difference-in-allocating-array-in-subroutine-make-a-code-works/m-p/1058240#M116797</guid>
      <dc:creator>Xiaoping_D_Intel</dc:creator>
      <dc:date>2015-01-23T08:10:24Z</dc:date>
    </item>
    <item>
      <title>If the subroutine is NOT</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/Difference-in-allocating-array-in-subroutine-make-a-code-works/m-p/1058241#M116798</link>
      <description>&lt;P&gt;If the subroutine is NOT intended to be called concurrently by multiple threads then consider&lt;/P&gt;

&lt;PRE class="brush:fortran;"&gt;real(8), allocatable, save :: var_ksta(:)...
...
size_needed = row_num*size_x*size_y
! Allocate on the first call; afterwards reallocate only if the buffer is too small
if (.not. allocated(var_ksta)) then
  allocate(var_ksta(size_needed))
else
  if (size(var_ksta) .lt. size_needed) then
    deallocate(var_ksta)
    allocate(var_ksta(size_needed))
  endif
endif
&lt;/PRE&gt;

&lt;P&gt;Note, the newer feature for reallocation of the left-hand side likely won't be effective when consolidating ranks. While it can be done, it would then require the creation of a temporary array (in other words, an unnecessary copy operation).&lt;/P&gt;

&lt;P&gt;Also, be mindful that var_ksta could potentially be larger than size_needed.&lt;/P&gt;
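
&lt;P&gt;A self-contained sketch of this pattern, assuming a hypothetical module ksta_buffer_demo and routine update_var that are not part of the original code, shows why only the leading section of the buffer should be used: the saved array keeps its largest size across calls.&lt;/P&gt;

&lt;PRE class="brush:fortran;"&gt;module ksta_buffer_demo
  implicit none
contains
  subroutine update_var(size_needed)
    integer, intent(in) :: size_needed
    real(8), allocatable, save :: var_ksta(:)   ! persists between calls
    ! Release the buffer only when it is too small for this call
    if (allocated(var_ksta)) then
      if (size(var_ksta) .lt. size_needed) deallocate(var_ksta)
    end if
    if (.not. allocated(var_ksta)) allocate(var_ksta(size_needed))
    ! The buffer may be longer than size_needed, so touch only the leading section
    var_ksta(1:size_needed) = 0.0d0
    print *, 'buffer size =', size(var_ksta), ' used =', size_needed
  end subroutine update_var
end module ksta_buffer_demo

program demo
  use ksta_buffer_demo
  implicit none
  call update_var(100)   ! first call allocates 100 elements
  call update_var(50)    ! reuses the existing 100-element buffer
  call update_var(200)   ! buffer is too small, so it is reallocated with 200 elements
end program demo
&lt;/PRE&gt;

&lt;P&gt;As with the snippet above, this is not safe if the subroutine is called concurrently by multiple threads, since all callers share the one saved buffer.&lt;/P&gt;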

&lt;P&gt;Jim Dempsey&lt;/P&gt;</description>
      <pubDate>Fri, 23 Jan 2015 13:34:15 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/Difference-in-allocating-array-in-subroutine-make-a-code-works/m-p/1058241#M116798</guid>
      <dc:creator>jimdempseyatthecove</dc:creator>
      <dc:date>2015-01-23T13:34:15Z</dc:date>
    </item>
    <item>
      <title>Hi Xiaoping,</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/Difference-in-allocating-array-in-subroutine-make-a-code-works/m-p/1058242#M116799</link>
      <description>&lt;P&gt;Hi Xiaoping,&lt;/P&gt;

&lt;P&gt;I added:&lt;/P&gt;

&lt;P&gt;ulimit -s unlimited&lt;/P&gt;

&lt;P&gt;and now the code works. Thanks for your suggestion.&lt;/P&gt;

&lt;P&gt;&amp;nbsp;&lt;/P&gt;

&lt;P&gt;Hi Jim,&lt;/P&gt;

&lt;P&gt;I'm using MPI with domain decomposition so each cpu has its own region of interest. So can I still use your subroutine?&lt;/P&gt;

&lt;P&gt;If my "size_needed" is always the same in the code, I will not need to allocate and deallocate, right?&lt;/P&gt;

&lt;P&gt;Also, what the new subroutine does is create the array once, after which it always stays in memory until the code ends, is that so?&lt;/P&gt;

&lt;P&gt;&amp;nbsp;&lt;/P&gt;

&lt;P&gt;I&lt;/P&gt;</description>
      <pubDate>Wed, 28 Jan 2015 04:51:04 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/Difference-in-allocating-array-in-subroutine-make-a-code-works/m-p/1058242#M116799</guid>
      <dc:creator>Wee_Beng_T_</dc:creator>
      <dc:date>2015-01-28T04:51:04Z</dc:date>
    </item>
    <item>
      <title>The code snip I presented was</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/Difference-in-allocating-array-in-subroutine-make-a-code-works/m-p/1058243#M116800</link>
      <description>&lt;P&gt;The code snip I presented was for single thread use where the allocated array is allocated once only on first call to the subroutine. It may also be used in multi-threaded code where the array is shared.&lt;/P&gt;

&lt;P&gt;Notes: The allocated array is not deallocated upon termination of the program (process). On most modern systems this is of no consequence. The advantage of this technique&amp;nbsp;is the size need not be known at compile time. However, if the size is known at compile time, the sizeof (in bytes) is permitted to be larger than 2GB.&lt;/P&gt;

&lt;P&gt;An alternate method is to create a module that contains the array in the data section, and the allocation, deallocation, and (optionally) manipulation routines in the CONTAINS section. On program start you call the allocation routine; during the run you call the manipulation routines (or manipulate the array directly); and on program end (or when done with the array) you call the deallocation/cleanup routine. The main PROGRAM (where the call to the allocation routine occurs) and any routine using the data, functions, or subroutines will have to USE this module.&lt;/P&gt;

&lt;P&gt;The module route eliminates the test for allocated and size_needed on each entry (when size_needed does not change).&lt;/P&gt;
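
&lt;P&gt;The module route might look roughly like the sketch below. The module name var_ksta_mod, the routine names var_ksta_init and var_ksta_finalize, and the size of 1000 are all hypothetical; in the real code the size would be row_num*size_x*size_y.&lt;/P&gt;

&lt;PRE class="brush:fortran;"&gt;module var_ksta_mod
  implicit none
  real(8), allocatable :: var_ksta(:)   ! data section: shared by every routine that USEs the module
contains
  subroutine var_ksta_init(n)
    integer, intent(in) :: n
    integer :: ierr
    allocate(var_ksta(n), stat=ierr)
    if (ierr /= 0) stop 'var_ksta allocation failed'
    var_ksta = 0.0d0
  end subroutine var_ksta_init

  subroutine var_ksta_finalize()
    if (allocated(var_ksta)) deallocate(var_ksta)
  end subroutine var_ksta_finalize
end module var_ksta_mod

program main
  use var_ksta_mod
  implicit none
  call var_ksta_init(1000)      ! allocate once at program start
  var_ksta(1) = 1.0d0           ! direct manipulation; no allocated/size test needed here
  print *, 'sum =', sum(var_ksta)
  call var_ksta_finalize()      ! clean up when done with the array
end program main
&lt;/PRE&gt;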

&lt;P&gt;The ulimit is fine provided that you never intend to also enable multi-threading with private use of the entire array.&lt;/P&gt;

&lt;P&gt;Jim Dempsey&lt;/P&gt;</description>
      <pubDate>Wed, 28 Jan 2015 16:02:58 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/Difference-in-allocating-array-in-subroutine-make-a-code-works/m-p/1058243#M116800</guid>
      <dc:creator>jimdempseyatthecove</dc:creator>
      <dc:date>2015-01-28T16:02:58Z</dc:date>
    </item>
  </channel>
</rss>

