<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>Re: Stack size in Fortran using OpenMP in Intel® Fortran Compiler</title>
    <link>https://community.intel.com/t5/Intel-Fortran-Compiler/Stack-size-in-fortran-using-OpenMP/m-p/872755#M72713</link>
    <description>&lt;DIV style="margin:0px;"&gt;
&lt;DIV id="quote_reply" style="width: 100%; margin-top: 5px;"&gt;
&lt;DIV style="margin-left:2px;margin-right:2px;"&gt;Quoting - &lt;A href="https://community.intel.com/en-us/profile/160574"&gt;Ronald Green (Intel)&lt;/A&gt;&lt;/DIV&gt;
&lt;DIV style="background-color:#E5E5E5; padding:5px;border: 1px; border-style: inset;margin-left:2px;margin-right:2px;"&gt;&lt;EM&gt;
&lt;P&gt;Yes, OpenMP generally requires substantially more stack space than a serial program. All your PRIVATE data is stack allocated, as each thread needs a private copy (stacks are NOT shared by threads, heap is).&lt;/P&gt;
&lt;P&gt;I don't believe it's your C -&amp;gt; Fortran calling that is eating up stack space. It is the PRIVATE data in your OMP regions. You could try to revisit your declaration of data in the OMP parallel regions and see if there is data that can be made shared, but in many cases you really do want PRIVATE data (for data safety and correctness).&lt;/P&gt;
&lt;P&gt;ron&lt;/P&gt;
&lt;P&gt;&lt;/P&gt;
&lt;/EM&gt;&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;P&gt;&lt;/P&gt;
&lt;P&gt;Hi again!&lt;/P&gt;
&lt;P&gt;I managed to solve the stack-size problem by allocating several vectors dynamically instead of statically. I ended up with a stack size of about 10 MB. Is that reasonable? Can the stack size affect performance in any way?&lt;/P&gt;
&lt;P&gt;A related question: will the stack size affect threading efficiency? What kind of overhead is involved in allocating the stack for each thread?&lt;/P&gt;
&lt;P&gt;Thanks!&lt;/P&gt;
&lt;P&gt;&lt;/P&gt;</description>
    <pubDate>Mon, 20 Oct 2008 14:29:16 GMT</pubDate>
    <dc:creator>davva</dc:creator>
    <dc:date>2008-10-20T14:29:16Z</dc:date>
    <item>
      <title>Stack size in fortran using OpenMP</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/Stack-size-in-fortran-using-OpenMP/m-p/872747#M72705</link>
      <description>&lt;P&gt;Hi!&lt;/P&gt;
&lt;P&gt;I am running a C++ scientific calculation program that relies heavily on Fortran code. A number of large matrices (128*128*128) are used and passed around in the Fortran code. Right now most of them reside in a common block. They are also passed as arguments to different subroutines, where they are declared as&lt;/P&gt;
&lt;P&gt;real*4 matrix(128*128*128).&lt;/P&gt;
&lt;P&gt;I suspect this is the reason why I need such a big stack size (150 MB) now that I have parallelized the program with OpenMP. Is this a correct assumption? Without OpenMP the stack size only needs to be around 10 MB.&lt;/P&gt;
&lt;P&gt;What can I do to reduce the stack size? Allocate the matrices dynamically?&lt;/P&gt;
&lt;P&gt;How do I pass dynamically allocated matrices from C++ (std::vector) to Fortran without declaring the matrix as real*4 matrix(128*128*128) inside the subroutine and thus needing a big stack? That is, how do I pass C++ vectors into Fortran and treat them as dynamically allocated vectors there?&lt;/P&gt;
&lt;P&gt;Is there another reason why such large stack sizes are needed when I use OpenMP?&lt;/P&gt;
&lt;P&gt;Best regards, David&lt;/P&gt;
&lt;P&gt;&lt;/P&gt;</description>
      <pubDate>Mon, 06 Oct 2008 23:22:17 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/Stack-size-in-fortran-using-OpenMP/m-p/872747#M72705</guid>
      <dc:creator>davva</dc:creator>
      <dc:date>2008-10-06T23:22:17Z</dc:date>
    </item>
    <item>
      <title>Re: Stack size in fortran using OpenMP</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/Stack-size-in-fortran-using-OpenMP/m-p/872748#M72706</link>
      <description>&lt;DIV style="margin:0px;"&gt;&lt;/DIV&gt;
&lt;P&gt;Yes, OpenMP generally requires substantially more stack space than a serial program. All your PRIVATE data is stack allocated, as each thread needs a private copy (stacks are NOT shared by threads, heap is).&lt;/P&gt;
&lt;P&gt;I don't believe it's your C -&amp;gt; Fortran calling that is eating up stack space. It is the PRIVATE data in your OMP regions. You could try to revisit your declaration of data in the OMP parallel regions and see if there is data that can be made shared, but in many cases you really do want PRIVATE data (for data safety and correctness).&lt;/P&gt;
&lt;P&gt;ron&lt;/P&gt;
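&lt;P&gt;A minimal sketch of the effect described above, with illustrative names: each thread gets its own copy of a PRIVATE array on its own thread stack, so per-thread stack demand grows with the PRIVATE list, not with the shared data. With Intel Fortran the per-thread stack can be raised via the OMP_STACKSIZE (or KMP_STACKSIZE) environment variable.&lt;/P&gt;

```fortran
! Sketch only: names are illustrative, not the poster's code.
! Each thread holds its own copy of WORK on its private stack, so
! thread stack demand grows with the PRIVATE list, not the shared data.
program private_stack_demo
  implicit none
  integer, parameter :: n = 4096
  real :: work(n)        ! one copy per thread inside the parallel region
  real :: total(n)
  integer :: i

  total = 0.0
!$omp parallel do private(work) reduction(+:total)
  do i = 1, 8
     work = real(i)      ! thread-local scratch, lives on the thread stack
     total = total + work
  end do
!$omp end parallel do

  print *, 'total(1) =', total(1)   ! 36.0, the sum 1+2+...+8
end program private_stack_demo
```

&lt;P&gt;If WORK were large, running this might require something like OMP_STACKSIZE=64M in the environment before launch.&lt;/P&gt;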
&lt;P&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 07 Oct 2008 20:56:24 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/Stack-size-in-fortran-using-OpenMP/m-p/872748#M72706</guid>
      <dc:creator>Ron_Green</dc:creator>
      <dc:date>2008-10-07T20:56:24Z</dc:date>
    </item>
    <item>
      <title>Re: Stack size in fortran using OpenMP</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/Stack-size-in-fortran-using-OpenMP/m-p/872749#M72707</link>
      <description>&lt;DIV style="margin:0px;"&gt;
&lt;DIV id="quote_reply" style="width: 100%; margin-top: 5px;"&gt;
&lt;DIV style="margin-left:2px;margin-right:2px;"&gt;Quoting - &lt;A href="https://community.intel.com/en-us/profile/160574"&gt;Ronald Green (Intel)&lt;/A&gt;&lt;/DIV&gt;
&lt;DIV style="background-color:#E5E5E5; padding:5px;border: 1px; border-style: inset;margin-left:2px;margin-right:2px;"&gt;&lt;EM&gt;
&lt;P&gt;Yes, OpenMP generally requires substantially more stack space than a serial program. All your PRIVATE data is stack allocated, as each thread needs a private copy (stacks are NOT shared by threads, heap is).&lt;/P&gt;
&lt;P&gt;I don't believe it's your C -&amp;gt; Fortran calling that is eating up stack space. It is the PRIVATE data in your OMP regions. You could try to revisit your declaration of data in the OMP parallel regions and see if there is data that can be made shared, but in many cases you really do want PRIVATE data (for data safety and correctness).&lt;/P&gt;
&lt;P&gt;ron&lt;/P&gt;
&lt;P&gt;&lt;/P&gt;
&lt;/EM&gt;&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;P&gt;&lt;/P&gt;
&lt;P&gt;Thanks Ron!&lt;/P&gt;
&lt;P&gt;I think it is strange that a program that needs a 10 MB stack all of a sudden needs 136 MB of stack. Anyway, I have some follow-up questions. My parallelized do loop contains two subroutines, where the first provides input to the second. That is, in the first subroutine some large vectors are filled and supplied to the second subroutine for crunching.&lt;/P&gt;
&lt;P&gt;My private variable list is quite large (~20 variables, of which 8 are vectors of a few thousand elements). I managed to cut the vector lengths in half, but that had no effect on the stack size.&lt;/P&gt;
&lt;P&gt;How can vectors of a few thousand integers cause a stack size of 150 MB?&lt;/P&gt;
&lt;P&gt;Why didn't I get a reduction in stack size when I cut the vectors in half?&lt;/P&gt;
&lt;P&gt;The first subroutine in my parallelized loop calls some other subroutines; could that be a reason for the big stack?&lt;/P&gt;
&lt;P&gt;Would it be better to allocate the long private vectors dynamically per thread instead?&lt;/P&gt;
&lt;P&gt;&lt;/P&gt;
&lt;P&gt;That was quite a few questions. Thanks for taking the time to answer them!!&lt;/P&gt;
&lt;P&gt;/david&lt;/P&gt;
&lt;P&gt;&lt;/P&gt;
&lt;P&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 08 Oct 2008 16:07:05 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/Stack-size-in-fortran-using-OpenMP/m-p/872749#M72707</guid>
      <dc:creator>davva</dc:creator>
      <dc:date>2008-10-08T16:07:05Z</dc:date>
    </item>
    <item>
      <title>Re: Stack size in fortran using OpenMP</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/Stack-size-in-fortran-using-OpenMP/m-p/872750#M72708</link>
      <description>&lt;DIV style="margin:0px;"&gt;
&lt;DIV id="quote_reply" style="width: 100%; margin-top: 5px;"&gt;
&lt;DIV style="margin-left:2px;margin-right:2px;"&gt;Quoting - &lt;A href="https://community.intel.com/en-us/profile/173322"&gt;david.eriksson@se.nucletron.com&lt;/A&gt;&lt;/DIV&gt;
&lt;DIV style="background-color:#E5E5E5; padding:5px;border: 1px; border-style: inset;margin-left:2px;margin-right:2px;"&gt;&lt;EM&gt;
&lt;DIV style="margin:0px;"&gt;&lt;/DIV&gt;
&lt;P&gt;&lt;/P&gt;
&lt;P&gt;Thanks Ron!&lt;/P&gt;
&lt;P&gt;I think it is strange that a program that needs a 10 MB stack all of a sudden needs 136 MB of stack. Anyway, I have some follow-up questions. My parallelized do loop contains two subroutines, where the first provides input to the second. That is, in the first subroutine some large vectors are filled and supplied to the second subroutine for crunching.&lt;/P&gt;
&lt;P&gt;My private variable list is quite large (~20 variables, of which 8 are vectors of a few thousand elements). I managed to cut the vector lengths in half, but that had no effect on the stack size.&lt;/P&gt;
&lt;P&gt;How can vectors of a few thousand integers cause a stack size of 150 MB?&lt;/P&gt;
&lt;P&gt;Why didn't I get a reduction in stack size when I cut the vectors in half?&lt;/P&gt;
&lt;P&gt;The first subroutine in my parallelized loop calls some other subroutines; could that be a reason for the big stack?&lt;/P&gt;
&lt;P&gt;Would it be better to allocate the long private vectors dynamically per thread instead?&lt;/P&gt;
&lt;P&gt;&lt;/P&gt;
&lt;P&gt;That was quite a few questions. Thanks for taking the time to answer them!!&lt;/P&gt;
&lt;P&gt;/david&lt;/P&gt;
&lt;P&gt;&lt;/P&gt;
&lt;P&gt;&lt;/P&gt;
&lt;/EM&gt;&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;P&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 08 Oct 2008 19:54:16 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/Stack-size-in-fortran-using-OpenMP/m-p/872750#M72708</guid>
      <dc:creator>TimP</dc:creator>
      <dc:date>2008-10-08T19:54:16Z</dc:date>
    </item>
    <item>
      <title>Re: Stack size in fortran using OpenMP</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/Stack-size-in-fortran-using-OpenMP/m-p/872752#M72710</link>
      <description>&lt;DIV style="margin:0px;"&gt;&lt;/DIV&gt;
&lt;P&gt;If you have local arrays, /Qopenmp changes their default allocation from static to stack. Each thread then will allocate those arrays on its stack for each subroutine in the parallel region.&lt;/P&gt;</description>
      <pubDate>Wed, 08 Oct 2008 19:58:42 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/Stack-size-in-fortran-using-OpenMP/m-p/872752#M72710</guid>
      <dc:creator>TimP</dc:creator>
      <dc:date>2008-10-08T19:58:42Z</dc:date>
    </item>
    <item>
      <title>Re: Stack size in fortran using OpenMP</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/Stack-size-in-fortran-using-OpenMP/m-p/872753#M72711</link>
      <description>&lt;DIV style="margin:0px;"&gt;
&lt;DIV id="quote_reply" style="width: 100%; margin-top: 5px;"&gt;
&lt;DIV style="margin-left:2px;margin-right:2px;"&gt;Quoting - &lt;A href="https://community.intel.com/en-us/profile/367365"&gt;tim18&lt;/A&gt;&lt;/DIV&gt;
&lt;DIV style="background-color:#E5E5E5; padding:5px;border: 1px; border-style: inset;margin-left:2px;margin-right:2px;"&gt;&lt;EM&gt;
&lt;P&gt;If you have local arrays, /Qopenmp changes their default allocation from static to stack. Each thread then will allocate those arrays on its stack for each subroutine in the parallel region.&lt;/P&gt;
&lt;/EM&gt;&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;P&gt;&lt;/P&gt;
&lt;P&gt;Thanks for the update!&lt;/P&gt;
&lt;P&gt;I still think it's strange that a 10 MB stack turns into a 136 MB stack when I am using OpenMP. The total memory size of the private variable list is &amp;lt;1 MB!&lt;/P&gt;
&lt;P&gt;What about common blocks, are they reallocated per thread even if they are not declared PRIVATE?&lt;/P&gt;
&lt;P&gt;What happens if I allocate some private variables dynamically outside the threaded area? Will they also be reallocated on the stack per thread?&lt;/P&gt;
&lt;P&gt;Why didn't I get a reduction in stack size when I reduced the vectors to half their size?&lt;/P&gt;
&lt;P&gt;Confusing!&lt;/P&gt;
&lt;P&gt;/david&lt;/P&gt;
&lt;P&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 09 Oct 2008 08:47:58 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/Stack-size-in-fortran-using-OpenMP/m-p/872753#M72711</guid>
      <dc:creator>davva</dc:creator>
      <dc:date>2008-10-09T08:47:58Z</dc:date>
    </item>
    <item>
      <title>Re: Stack size in fortran using OpenMP</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/Stack-size-in-fortran-using-OpenMP/m-p/872754#M72712</link>
      <description>&lt;DIV style="margin:0px;"&gt;&lt;/DIV&gt;
&lt;P&gt;&lt;BR /&gt;David,&lt;/P&gt;
&lt;P&gt;My guess as to the excessive stack consumption is that after your C code passes a pointer to its dataset, your Fortran code is rearranging it either explicitly or implicitly (the compiler creating a stack temporary array for some operations). You need to identify where these occurrences happen and eliminate them. To eliminate them, use THREADPRIVATE to hold the unallocated array descriptors (or pointers to array descriptors). Then at run time have each thread allocate the arrays to the extent it requires. To find the problem areas, specify a smaller stack size and run to the choke point. As you identify the arrays, move their descriptors into the THREADPRIVATE area (then add the allocations).&lt;/P&gt;
&lt;P&gt;Jim Dempsey&lt;/P&gt;
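&lt;P&gt;The THREADPRIVATE pattern above, sketched with hypothetical names: the unallocated descriptor lives in a module, and each thread performs its own ALLOCATE at run time, so the data lands on the heap rather than on every thread's stack.&lt;/P&gt;

```fortran
! Sketch of the THREADPRIVATE pattern described above (names illustrative).
module scratch_mod
  implicit none
  real, allocatable :: buf(:)   ! unallocated descriptor, one per thread
!$omp threadprivate(buf)
end module scratch_mod

program threadprivate_demo
  use scratch_mod
  implicit none
  integer :: i

!$omp parallel do
  do i = 1, 8
     ! Each thread allocates its own copy once, on the heap,
     ! instead of carrying the array on its stack.
     if (.not. allocated(buf)) allocate(buf(1000))
     buf = real(i)              ! per-thread workspace
  end do
!$omp end parallel do
end program threadprivate_demo
```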
&lt;P&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 09 Oct 2008 12:46:33 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/Stack-size-in-fortran-using-OpenMP/m-p/872754#M72712</guid>
      <dc:creator>jimdempseyatthecove</dc:creator>
      <dc:date>2008-10-09T12:46:33Z</dc:date>
    </item>
    <item>
      <title>Re: Stack size in fortran using OpenMP</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/Stack-size-in-fortran-using-OpenMP/m-p/872755#M72713</link>
      <description>&lt;DIV style="margin:0px;"&gt;
&lt;DIV id="quote_reply" style="width: 100%; margin-top: 5px;"&gt;
&lt;DIV style="margin-left:2px;margin-right:2px;"&gt;Quoting - &lt;A href="https://community.intel.com/en-us/profile/160574"&gt;Ronald Green (Intel)&lt;/A&gt;&lt;/DIV&gt;
&lt;DIV style="background-color:#E5E5E5; padding:5px;border: 1px; border-style: inset;margin-left:2px;margin-right:2px;"&gt;&lt;EM&gt;
&lt;P&gt;Yes, OpenMP generally requires substantially more stack space than a serial program. All your PRIVATE data is stack allocated, as each thread needs a private copy (stacks are NOT shared by threads, heap is).&lt;/P&gt;
&lt;P&gt;I don't believe it's your C -&amp;gt; Fortran calling that is eating up stack space. It is the PRIVATE data in your OMP regions. You could try to revisit your declaration of data in the OMP parallel regions and see if there is data that can be made shared, but in many cases you really do want PRIVATE data (for data safety and correctness).&lt;/P&gt;
&lt;P&gt;ron&lt;/P&gt;
&lt;P&gt;&lt;/P&gt;
&lt;/EM&gt;&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;P&gt;&lt;/P&gt;
&lt;P&gt;Hi again!&lt;/P&gt;
&lt;P&gt;I managed to solve the stack-size problem by allocating several vectors dynamically instead of statically. I ended up with a stack size of about 10 MB. Is that reasonable? Can the stack size affect performance in any way?&lt;/P&gt;
&lt;P&gt;A related question: will the stack size affect threading efficiency? What kind of overhead is involved in allocating the stack for each thread?&lt;/P&gt;
&lt;P&gt;Thanks!&lt;/P&gt;
&lt;P&gt;&lt;/P&gt;</description>
      <pubDate>Mon, 20 Oct 2008 14:29:16 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/Stack-size-in-fortran-using-OpenMP/m-p/872755#M72713</guid>
      <dc:creator>davva</dc:creator>
      <dc:date>2008-10-20T14:29:16Z</dc:date>
    </item>
    <item>
      <title>Re: Stack size in fortran using OpenMP</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/Stack-size-in-fortran-using-OpenMP/m-p/872756#M72714</link>
      <description>&lt;DIV style="margin:0px;"&gt;
&lt;DIV id="quote_reply" style="width: 100%; margin-top: 5px;"&gt;
&lt;DIV style="margin-left:2px;margin-right:2px;"&gt;Quoting - &lt;A href="https://community.intel.com/en-us/profile/173322"&gt;davva&lt;/A&gt;&lt;/DIV&gt;
&lt;DIV style="background-color:#E5E5E5; padding:5px;border: 1px; border-style: inset;margin-left:2px;margin-right:2px;"&gt;&lt;EM&gt;
&lt;DIV style="margin:0px;"&gt;&lt;/DIV&gt;
&lt;P&gt;&lt;/P&gt;
&lt;P&gt;Hi again!&lt;/P&gt;
&lt;P&gt;I managed to solve the stack-size problem by allocating several vectors dynamically instead of statically. I ended up with a stack size of about 10 MB. Is that reasonable? Can the stack size affect performance in any way?&lt;/P&gt;
&lt;P&gt;A related question: will the stack size affect threading efficiency? What kind of overhead is involved in allocating the stack for each thread?&lt;/P&gt;
&lt;P&gt;Thanks!&lt;/P&gt;
&lt;/EM&gt;&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;P&gt;&lt;/P&gt;
&lt;P&gt;Another question!!!&lt;/P&gt;
&lt;P&gt;The arrays that I access frequently in the innermost loop are now allocated dynamically. Will that affect performance? Which is faster to access, heap or stack memory? I might have some cache misses, but not many.&lt;/P&gt;
&lt;P&gt;Thanx!&lt;/P&gt;
&lt;P&gt;/davva&lt;/P&gt;</description>
      <pubDate>Tue, 28 Oct 2008 21:11:49 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/Stack-size-in-fortran-using-OpenMP/m-p/872756#M72714</guid>
      <dc:creator>davva</dc:creator>
      <dc:date>2008-10-28T21:11:49Z</dc:date>
    </item>
    <item>
      <title>Re: Stack size in fortran using OpenMP</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/Stack-size-in-fortran-using-OpenMP/m-p/872757#M72715</link>
      <description>&lt;P&gt;There is no difference in access speed, but there is a small overhead for allocating and deallocating heap memory. If your loops are large, the time spent doing this should be inconsequential.&lt;/P&gt;</description>
      <pubDate>Tue, 28 Oct 2008 21:39:43 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/Stack-size-in-fortran-using-OpenMP/m-p/872757#M72715</guid>
      <dc:creator>Steven_L_Intel1</dc:creator>
      <dc:date>2008-10-28T21:39:43Z</dc:date>
    </item>
    <item>
      <title>Re: Stack size in fortran using OpenMP</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/Stack-size-in-fortran-using-OpenMP/m-p/872758#M72716</link>
      <description>&lt;DIV style="margin:0px;"&gt;&lt;/DIV&gt;
&lt;P&gt;davva,&lt;/P&gt;
&lt;P&gt;If the allocation/deallocation in your inner loop is creating unwanted overhead, consider creating a THREADPRIVATE array: on entry to your inner level, check the size of the private array; if it is too small or not allocated, delete and allocate (or reallocate) it; otherwise use the existing allocation (truncated to the desired size). You can also create an array of arrays and index it by OpenMP thread number. *** Caution: the array-of-arrays approach is not advisable if you are using nested threads. In that case, either use a thread-private array or a thread-private unique index into the array of arrays.&lt;/P&gt;
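&lt;P&gt;The grow-only reuse described here could look like the following sketch (names are illustrative, not the poster's code): the workspace is reallocated only when the requested size exceeds the current allocation.&lt;/P&gt;

```fortran
! Sketch: grow-only per-thread workspace (illustrative names).
module ws_mod
  implicit none
  real, allocatable :: ws(:)
!$omp threadprivate(ws)
contains
  subroutine ensure_ws(n)
    integer, intent(in) :: n
    ! Reallocate only when the buffer is absent or too small;
    ! otherwise reuse it and treat just ws(1:n) as valid.
    if (allocated(ws)) then
       if (size(ws) .lt. n) then
          deallocate(ws)
          allocate(ws(n))
       end if
    else
       allocate(ws(n))
    end if
  end subroutine ensure_ws
end module ws_mod
```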
&lt;P&gt;Jim Dempsey&lt;/P&gt;</description>
      <pubDate>Wed, 29 Oct 2008 19:42:24 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/Stack-size-in-fortran-using-OpenMP/m-p/872758#M72716</guid>
      <dc:creator>jimdempseyatthecove</dc:creator>
      <dc:date>2008-10-29T19:42:24Z</dc:date>
    </item>
    <item>
      <title>Re: Stack size in fortran using OpenMP</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/Stack-size-in-fortran-using-OpenMP/m-p/872759#M72717</link>
      <description>&lt;DIV style="margin:0px;"&gt;&lt;/DIV&gt;
&lt;P&gt;As for passing a std::vector&lt;float&gt; on to Fortran: the vector storage scheme is purposely opaque. Data is intended to be accessed by way of member functions only, not by an independent pointer and index. There is no requirement for the floats to be stored together in a way that would permit you to pass the address of the first float plus a size on to a Fortran subroutine. Although your version of std::vector may use a contiguous array today, std::vector could change at the next software update, and a new and improved version could break your code. Example: suppose you make a programming "improvement" by converting to parallel programming techniques using tbb::concurrent_vector&lt;float&gt; (a new, thread-safe vector container); then you are "almost" guaranteeing that the floats end up in discontiguous chunks once the number of floats exceeds the first bucket threshold. Initial testing with small numbers of floats will work, but later testing will fail (usually at your customer's site). If your computational requirements are high, consider not using containers that do not provide permanently guaranteed access to the address of a contiguous array. It is not very hard to make your own containers, for your own purposes, that do exactly what you want.&lt;/P&gt;
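&lt;P&gt;For illustration of the interop boundary being discussed (a sketch with hypothetical names, assuming a Fortran 2003 compiler with ISO_C_BINDING): the C++ side passes the address of the first element plus the element count, which is only safe while the container stores its elements contiguously; the Fortran side receives them without any fixed-size local declaration.&lt;/P&gt;

```fortran
! Sketch: receiving a C/C++ float buffer by address and length
! (hypothetical names). Valid only while the caller's container
! keeps its elements in one contiguous block.
subroutine crunch(p, n) bind(C, name='crunch')
  use iso_c_binding
  implicit none
  type(c_ptr), value :: p
  integer(c_int), value :: n
  real(c_float), pointer :: v(:)

  call c_f_pointer(p, v, [n])  ! view the buffer as a Fortran array, no copy
  v = v * 2.0_c_float          ! operate in place on the caller's data
end subroutine crunch
```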
&lt;P&gt;Jim Dempsey&lt;/P&gt;</description>
      <pubDate>Wed, 29 Oct 2008 20:03:24 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/Stack-size-in-fortran-using-OpenMP/m-p/872759#M72717</guid>
      <dc:creator>jimdempseyatthecove</dc:creator>
      <dc:date>2008-10-29T20:03:24Z</dc:date>
    </item>
    <item>
      <title>Re: Stack size in fortran using OpenMP</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/Stack-size-in-fortran-using-OpenMP/m-p/872760#M72718</link>
      <description>&lt;DIV style="margin:0px;"&gt;
&lt;DIV id="quote_reply" style="width: 100%; margin-top: 5px;"&gt;
&lt;DIV style="margin-left:2px;margin-right:2px;"&gt;Quoting - &lt;A href="https://community.intel.com/en-us/profile/99850"&gt;jimdempseyatthecove&lt;/A&gt;&lt;/DIV&gt;
&lt;DIV style="background-color:#E5E5E5; padding:5px;border: 1px; border-style: inset;margin-left:2px;margin-right:2px;"&gt;&lt;EM&gt;
&lt;P&gt;davva,&lt;/P&gt;
&lt;P&gt;If the allocation/deallocation in your inner loop is creating unwanted overhead consider creating a ThreadPrivate array where on entry to your inner level you check the size of the private array, if too small or not allocated you delete and allocate (or reallocate) otherwise use the existing allocation (truncated to desired size). You can also create an array of arrays and index that by OpenMP thread number. *** Caution, not advisible if using nested threads. Either use a thread private array or a thread private unique index into the array of arrays.&lt;/P&gt;
&lt;P&gt;Jim Dempsey&lt;/P&gt;
&lt;/EM&gt;&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;P&gt;Hi Jim!&lt;/P&gt;
&lt;P&gt;Thanx for your answer.&lt;/P&gt;
&lt;P&gt;I have already implemented an array of arrays and am accessing those by thread index, so the caution you mentioned was a good one. The threaded code is not nested today, but you never know about the future. So how are the threads indexed if they are nested? How do I create a thread-unique index?&lt;/P&gt;
&lt;P&gt;The original question was whether accessing large arrays is faster or slower if they are allocated on the heap. Allocation is done outside of the inner loops.&lt;/P&gt;
&lt;P&gt;/davva&lt;/P&gt;
&lt;P&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 30 Oct 2008 08:48:45 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/Stack-size-in-fortran-using-OpenMP/m-p/872760#M72718</guid>
      <dc:creator>davva</dc:creator>
      <dc:date>2008-10-30T08:48:45Z</dc:date>
    </item>
    <item>
      <title>Re: Stack size in fortran using OpenMP</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/Stack-size-in-fortran-using-OpenMP/m-p/872761#M72719</link>
      <description>&lt;DIV style="margin:0px;"&gt;
&lt;DIV id="quote_reply" style="width: 100%; margin-top: 5px;"&gt;
&lt;DIV style="margin-left:2px;margin-right:2px;"&gt;Quoting - &lt;A href="https://community.intel.com/en-us/profile/99850"&gt;jimdempseyatthecove&lt;/A&gt;&lt;/DIV&gt;
&lt;DIV style="background-color:#E5E5E5; padding:5px;border: 1px; border-style: inset;margin-left:2px;margin-right:2px;"&gt;&lt;EM&gt;
&lt;P&gt;As for passing a std::vector&lt;float&gt; on to Fortran: the vector storage scheme is purposely opaque. Data is intended to be accessed by way of member functions only, not by an independent pointer and index. There is no requirement for the floats to be stored together in a way that would permit you to pass the address of the first float plus a size on to a Fortran subroutine. Although your version of std::vector may use a contiguous array today, std::vector could change at the next software update, and a new and improved version could break your code. Example: suppose you make a programming "improvement" by converting to parallel programming techniques using tbb::concurrent_vector&lt;float&gt; (a new, thread-safe vector container); then you are "almost" guaranteeing that the floats end up in discontiguous chunks once the number of floats exceeds the first bucket threshold. Initial testing with small numbers of floats will work, but later testing will fail (usually at your customer's site). If your computational requirements are high, consider not using containers that do not provide permanently guaranteed access to the address of a contiguous array. It is not very hard to make your own containers, for your own purposes, that do exactly what you want.&lt;/P&gt;
&lt;P&gt;Jim Dempsey&lt;/P&gt;
&lt;/EM&gt;&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;P&gt;Hi Jim!&lt;/P&gt;
&lt;P&gt;We have been relying heavily on std::vectors having contiguous memory, and it has worked so far (using the Microsoft STL). Thanks for the heads-up; I will read up on Microsoft's latest STL and see what they guarantee.&lt;/P&gt;
&lt;P&gt;/david&lt;/P&gt;</description>
      <pubDate>Thu, 30 Oct 2008 08:51:19 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/Stack-size-in-fortran-using-OpenMP/m-p/872761#M72719</guid>
      <dc:creator>davva</dc:creator>
      <dc:date>2008-10-30T08:51:19Z</dc:date>
    </item>
  </channel>
</rss>

