<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Large Arrays: using "allocatable" versus static declaration in Intel® Fortran Compiler</title>
    <link>https://community.intel.com/t5/Intel-Fortran-Compiler/Large-Arrays-using-quot-allocatable-quot-versus-static/m-p/741773#M1158</link>
    <description>&lt;DIV style="margin:0px;"&gt;&lt;/DIV&gt;
And as for memory versus data files: reads from memory are around 3 orders of magnitude faster than reads from disk.&lt;BR /&gt;&lt;BR /&gt;So why not hard-code all the data in the program using DATA statements or other explicit initialization?&lt;BR /&gt;&lt;BR /&gt;For data that does not change over time, this is certainly something to consider. For example, phase tables of water (steam tables) will never change (hopefully). They are a good candidate for a table compiled into the program, with no reading from disk.&lt;BR /&gt;&lt;BR /&gt;At the other extreme is data that defines a particular problem and changes frequently to define new problems. You don't want to have to edit source files, recompile, etc. for each problem you run. This data should be read in from disk.&lt;BR /&gt;&lt;BR /&gt;Some data sets fall in between these two: materials tables for steel, composites, ceramics. These are updated regularly as manufacturers add new materials and older materials are removed from the market. I've seen these hardcoded and also read in from material property data files. If the code models a particular widget and that widget's material sets are fixed and unchanging, you could hardcode them. If the widgets get redesigned once a year and new materials come and go, use a data file.&lt;BR /&gt;</description>
    <pubDate>Thu, 03 Dec 2009 16:52:47 GMT</pubDate>
    <dc:creator>Ron_Green</dc:creator>
    <dc:date>2009-12-03T16:52:47Z</dc:date>
    <item>
      <title>Large Arrays: using "allocatable" versus static declaration</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/Large-Arrays-using-quot-allocatable-quot-versus-static/m-p/741765#M1150</link>
      <description>I am just a scientific programmer, so I hope someone out there with CS knowledge can educate me on a problem I am having.&lt;BR /&gt;&lt;BR /&gt;I have a 1000 x 1000 x 1000 array (for this test; in other cases the size may differ).&lt;BR /&gt;&lt;BR /&gt;If I use this code:&lt;BR /&gt;&lt;BR /&gt;program test&lt;BR /&gt; IMPLICIT NONE&lt;BR /&gt; DOUBLE PRECISION, allocatable :: stupid(:,:,:)&lt;BR /&gt; DOUBLE PRECISION, allocatable :: stupid2(:,:,:)&lt;BR /&gt; ALLOCATE(stupid(1000,1000,1000),stupid2(1000,1000,1000))&lt;BR /&gt; stupid = 0&lt;BR /&gt; print*, "shape is ", SHAPE(stupid)&lt;BR /&gt;end program test&lt;BR /&gt;&lt;BR /&gt;The program hangs and bogs down my whole system (a new 64-bit Dell with plenty of memory).&lt;BR /&gt;&lt;BR /&gt;But if I do the same thing with static declarations, it's super fast!&lt;BR /&gt;&lt;BR /&gt;program test&lt;BR /&gt; IMPLICIT NONE&lt;BR /&gt; DOUBLE PRECISION :: stupid(1000,1000,1000)&lt;BR /&gt; stupid = 0&lt;BR /&gt; print*, "shape is ", SHAPE(stupid)&lt;BR /&gt;end program test &lt;BR /&gt;&lt;BR /&gt;Why is that???&lt;BR /&gt;&lt;BR /&gt;I'm sure I've used large allocatable arrays before, though only with 2 dimensions. &lt;BR /&gt;&lt;BR /&gt;I'm probably not understanding some difference in the amount of memory needed for one or the other. By my naive calculation, with a 16-bit double, an array of 1e9 elements (that's 1000^3) would take up 2 GB of space. That's a lot, but not more than my system can handle (I have 4 gigs of memory), and it still doesn't explain why the static version is so fast and has no problems??&lt;BR /&gt;</description>
      <pubDate>Mon, 30 Nov 2009 15:19:02 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/Large-Arrays-using-quot-allocatable-quot-versus-static/m-p/741765#M1150</guid>
      <dc:creator>etmeyer</dc:creator>
      <dc:date>2009-11-30T15:19:02Z</dc:date>
    </item>
    <item>
      <title>Re: Large Arrays: using "allocatable" versus static declaration</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/Large-Arrays-using-quot-allocatable-quot-versus-static/m-p/741766#M1151</link>
      <description>&lt;DIV style="margin:0px;"&gt;
&lt;DIV id="quote_reply" style="width: 100%; margin-top: 5px;"&gt;
&lt;DIV style="margin-left:2px;margin-right:2px;"&gt;Quoting - &lt;A href="https://community.intel.com/en-us/profile/444060"&gt;etmeyer&lt;/A&gt;&lt;/DIV&gt;
&lt;DIV style="background-color:#E5E5E5; padding:5px;border: 1px; border-style: inset;margin-left:2px;margin-right:2px;"&gt;&lt;EM&gt;I am just a scientific programmer, so I hope someone out there with CS knowledge can educate me on a problem I am having.&lt;BR /&gt;&lt;BR /&gt;I have a 1000 x 1000 x 1000 array (for this test - in other cases it may need to be different)&lt;BR /&gt;&lt;BR /&gt;If I use this code:&lt;BR /&gt;&lt;BR /&gt;program test&lt;BR /&gt; IMPLICIT NONE&lt;BR /&gt; DOUBLE PRECISION, allocatable :: stupid(:,:,:)&lt;BR /&gt; DOUBLE PRECISION, allocatable :: stupid2(:,:,:)&lt;BR /&gt; ALLOCATE(stupid(1000,1000,1000),stupid2(1000,1000,1000))&lt;BR /&gt;stupid = 0&lt;BR /&gt; print*, "shape is ", SHAPE(stupid)&lt;BR /&gt;end program test&lt;BR /&gt;&lt;BR /&gt;The program hangs and bogs my whole system ( a new 64 bit dell with plenty of memory)&lt;BR /&gt;&lt;BR /&gt;But if I do the same thing,with static declarations, it's super fast!&lt;BR /&gt;&lt;BR /&gt;program test&lt;BR /&gt; IMPLICIT NONE&lt;BR /&gt; DOUBLE PRECISION :: stupid(1000,1000,1000)&lt;BR /&gt; stupid = 0&lt;BR /&gt; print*, "shape is ", SHAPE(stupid)&lt;BR /&gt;end program test &lt;BR /&gt;&lt;BR /&gt;Why is that???&lt;BR /&gt;&lt;BR /&gt;I'm sure I've used large allocatable arrays before, though only 2 dimensions. &lt;BR /&gt;&lt;BR /&gt;I'm probably not understanding some difference in the amount of memory needed for one or the other. By my naive calculation, a 16 bit double, if the array was 1e9 (that's 1000^3) would take up 2 GB of space. That's a lot, but not more than my system can handle (I have 4 gigs of memory) - and still doesn't explain why the static is so fast and has no problems??&lt;BR /&gt;&lt;/EM&gt;&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;BR /&gt;A DOUBLE PRECISION value uses 8 bytes (64 bits), not 16 bits, so each 1000 x 1000 x 1000 array needs about 8 GB. &lt;BR /&gt;&lt;BR /&gt;And in the first example you have 2 arrays, whereas in the second you have just one.&lt;BR /&gt;&lt;BR /&gt;Make sure your OS is a 64-bit version by running uname -a. On my system:&lt;BR /&gt;&lt;BR /&gt;uname -a&lt;BR /&gt;Linux spdr65 2.6.16.60-0.21-smp #1 SMP Tue May 6 12:41:02 UTC 2008 x86_64 x86_64 x86_64 GNU/Linux&lt;BR /&gt;&lt;BR /&gt;Look for "x86_64" in the output, which indicates a 64-bit OS.&lt;BR /&gt;&lt;BR /&gt;Compile with:&lt;BR /&gt;&lt;BR /&gt;-mcmodel=medium -shared-intel&lt;BR /&gt;&lt;BR /&gt;To monitor your memory usage, open another window on the system and run:&lt;BR /&gt;&lt;BR /&gt;vmstat 2&lt;BR /&gt;&lt;BR /&gt;and watch the "free" column and the three swap-related columns: "swpd", "si" and "so".&lt;BR /&gt;&lt;BR /&gt;ron&lt;BR /&gt;</description>
      <pubDate>Mon, 30 Nov 2009 16:00:37 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/Large-Arrays-using-quot-allocatable-quot-versus-static/m-p/741766#M1151</guid>
      <dc:creator>Ron_Green</dc:creator>
      <dc:date>2009-11-30T16:00:37Z</dc:date>
    </item>
    <item>
      <title>Re: Large Arrays: using "allocatable" versus static declaration</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/Large-Arrays-using-quot-allocatable-quot-versus-static/m-p/741767#M1152</link>
      <description>Thank you for the correction. Either way, the total memory allocated (theoretically) should not be the problem unless there is a drastic difference between static and allocatable that I'm not appreciating.&lt;BR /&gt;&lt;BR /&gt;I installed 64-bit Ubuntu 9.10 on this computer myself, but here you are:&lt;BR /&gt;&lt;BR /&gt;[0802][meyer@inspiron17]$ uname -a&lt;BR /&gt;Linux inspiron17 2.6.31-15-generic #50-Ubuntu SMP Tue Nov 10 14:53:52 UTC 2009 x86_64 GNU/Linux&lt;BR /&gt;&lt;BR /&gt;I have 4 gigs of memory.&lt;BR /&gt;&lt;BR /&gt;I did the vmstat test for the first (allocatable) example. The static example is so fast I can't run vmstat after a.out, though perhaps I could start it beforehand.&lt;BR /&gt;&lt;BR /&gt;Output:&lt;BR /&gt;[1044][meyer@inspiron17]$ vmstat 2&lt;BR /&gt;procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----&lt;BR /&gt; r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa&lt;BR /&gt; 2 14 359272  26064   1560 339024    6   25    78   217  432  458 10  6 82  2&lt;BR /&gt; 0 14 406448  26552   1404 337480  900 23666  3516 23666 1764 1943  7  6 14 72&lt;BR /&gt; 4  9 452524  25404    344 328284  838 23140  2304 23140 2049 1962  8 12 31 49&lt;BR /&gt; 0 20 482300  26632    332 321184 1524 15044  2768 15044 2158 1933 11  7  0 82&lt;BR /&gt; 0 17 500800  26044    332 317976 1570 9450  2634  9450 1925 1750  4  7  0 88&lt;BR /&gt; 0 15 517424  26000    712 318500 1136 8532  2408  8532 2112 1917  9  9  0 81&lt;BR /&gt; 0 17 551240  25832    724 308024  792 17058  1304 17066 3851 1650  4 15  0 81&lt;BR /&gt; 1 20 600468  26192    856 303600 1476 24836  2716 24948 2458 1754  3 10  0 88&lt;BR /&gt; 0 15 607724  26016    856 305840 2046 3864  3428  3864 1966 1673  4  8  0 89&lt;BR /&gt; 0 14 644244  26204    812 303604 1536 18654  2454 18700 3794 1771  3 13  0 84&lt;BR /&gt; 0 13 700480  25232    604 302780 1026 28306  2162 28546 2239 1644  3 13  0 84&lt;BR /&gt; 0 13 731684  26236    596 298632  928 15766  1996 15862 2897 2121  3 11  0 86&lt;BR /&gt; 1 13 840248  25824    716 265092  204 54316   466 54328 5541 1839  3 24 22 51&lt;BR /&gt; 0 17 876140  30544    712 253800 1596 18092  2492 18128 2473 2960  6 12  0 82&lt;BR /&gt; 1 14 918636  25724    992 247460 1212 21532  1962 21588 2762 3414  8 16  2 74&lt;BR /&gt; 0  9 993676  26612    884 235944  650 37580   986 37580 2786 2563  6 15 16 64&lt;BR /&gt; 0  8 1056992  26028    884 199512  524 31720  1006 31720 3558 3178  7 16 25 52&lt;BR /&gt; 2  9 1128396  26276    876 194960  636 35816   790 35838 3171 2923  7 15 31 47&lt;BR /&gt; 0 12 1180856  27540    876 185420 1326 26440  1374 26454 2957 2327  5 12  1 83&lt;BR /&gt; 2 12 1233432  26552    776 168260 1010 26404  2250 26438 2542 3412  7 15  0 78&lt;BR /&gt; 1  9 1259012  25848    712 160884 1310 12920  2498 12920 2377 2577  4 10  0 86&lt;BR /&gt;procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----&lt;BR /&gt; r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa&lt;BR /&gt; 1 10 1290292  26088    876 149932 1284 15736  2152 15736 2366 2918  6 11  0 83&lt;BR /&gt; 0  9 1396524  26496    736 145704  262 53164   446 53288 5344 3064  8 21 16 54&lt;BR /&gt; 0 13 1436492  26872    732 147736 1606 20070  2044 20070 2674 2757  5 11  7 76&lt;BR /&gt; 0 14 1457844  25108    712 147028 1806 10802  2574 10802 1908 2058  4 10  0 86&lt;BR /&gt; 0 14 1465196  26428    684 145140 1648 3750  3368  3750 1944 2435  5  8  0 87&lt;BR /&gt; 0 13 1475860  40496    336 147772 2078 5482  3244  5564 2045 2817  5 11  7 77&lt;BR /&gt; 0  9 1476396  43908    720 152168 1648  388  3074   388 1667 3209  8  8 35 50 &amp;lt;&amp;lt;--- KILLED THE TEST HERE&lt;BR /&gt; 0  6 792224 3501452    720 155172 1898    0  3972     0 1522 3580  6 16 37 41 &lt;BR /&gt; 0 13 791636 3496500    896 154900 1838    0  3410    42 1477 3352  8  7 32 54&lt;BR /&gt; 0  9 791044 3489060   1300 158388 1898    0  3906     0 1501 2573  4  5 34 57&lt;BR /&gt; 0  7 790252 3479468   1308 160988 2020    0  3380     0 1770 2392  6  6 39 50&lt;BR /&gt; 1  8 789084 3462340   1460 176572 2446    0  2808    20 2121 2739  6  8 23 63&lt;BR /&gt; 0  6 787800 3461704   1460 171328 2868    0  3560     0 2173 3582  6  7 39 48&lt;BR /&gt; 0  6 786824 3454608   1464 174560 2118    0  3556     0 1972 2215  3  7 39 51&lt;BR /&gt; 2  2 784776 3448552   1480 175536 2376    0  2886    30 2244 4216  8  6 36 50&lt;BR /&gt;^C&lt;BR /&gt;[1047][meyer@inspiron17]$ &lt;BR /&gt;&lt;BR /&gt;</description>
      <pubDate>Tue, 01 Dec 2009 16:50:48 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/Large-Arrays-using-quot-allocatable-quot-versus-static/m-p/741767#M1152</guid>
      <dc:creator>etmeyer</dc:creator>
      <dc:date>2009-12-01T16:50:48Z</dc:date>
    </item>
    <item>
      <title>Re: Large Arrays: using "allocatable" versus static declaration</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/Large-Arrays-using-quot-allocatable-quot-versus-static/m-p/741768#M1153</link>
      <description>I ran vmstat for the static case as well, starting it before the execution:&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;[1047][meyer@inspiron17]$ vmstat 2&lt;BR /&gt;procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----&lt;BR /&gt; r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa&lt;BR /&gt; 1  0 606068 3117092  29384 269668   14   52    89   240  444  485 10  6 82  2&lt;BR /&gt; 5  0 605172 3117844  29384 268100  372    0   372     0 1726 4973 29  7 63  1&lt;BR /&gt; 0  0 605168 3107284  29392 277872    0    0     0    24 2042 4152 12  8 79  0&lt;BR /&gt; 0  0 605164 3111064  29400 273528    2    0     2    12 1936 4046  9  7 82  1&lt;BR /&gt; 2  0 605164 3113236  29400 271412    6    0     6     0 1737 3674  9  5 86  0&lt;BR /&gt; 0  1 605104 3098116  29556 285388   34    0  5866     0 1858 3651  9  7 62 22 &amp;lt;-- approximate start&lt;BR /&gt; 2  0 605100 3084252  29576 299360    0    0  8244    38 1821 3738 11  7 47 36&lt;BR /&gt; 0  0 605100 3079400  29576 304628   28    0   364     0 1754 4245  9 10 79  2 &amp;lt;-- approximate end&lt;BR /&gt; 0  0 605100 3077964  29576 306172    0    0     0     2 1582 3300  8  5 87  0&lt;BR /&gt; 0  0 605096 3081140  29584 301980    0    0     0    18 1294 3280  7  7 86  0&lt;BR /&gt; 3  0 605096 3086752  29592 295948    0    0     0    68 1530 3724 12  7 81  1&lt;BR /&gt;^C&lt;BR /&gt;[1052][meyer@inspiron17]$ &lt;BR /&gt;&lt;BR /&gt;Sorry for the formatting loss - not sure how to correct that. The free memory never dips like it does for the allocatable case. So it's clearly taking a ton more memory to use allocatable. I'm assuming these numbers are in kB. In the allocatable test, available memory jumps from a mere 40 MB to over 3 GB after killing the executable. The largest swap value I saw was only around 38 MB for allocatable, and only 8 MB for the static... I'm not sure how to interpret most of this. Where is the available memory going?&lt;BR /&gt;&lt;BR /&gt;Even with two 1000^3 arrays, total memory usage should be around 2 GB.&lt;BR /&gt;&lt;BR /&gt;Maybe the question is not why the allocatable version hogs so much, but why the static one does not?&lt;BR /&gt;&lt;BR /&gt;</description>
      <pubDate>Tue, 01 Dec 2009 17:02:42 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/Large-Arrays-using-quot-allocatable-quot-versus-static/m-p/741768#M1153</guid>
      <dc:creator>etmeyer</dc:creator>
      <dc:date>2009-12-01T17:02:42Z</dc:date>
    </item>
    <item>
      <title>Re: Large Arrays: using "allocatable" versus static declaration</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/Large-Arrays-using-quot-allocatable-quot-versus-static/m-p/741769#M1154</link>
      <description>&lt;DIV style="margin:0px;"&gt;&lt;/DIV&gt;
Static data is stored in the BSS segment (the static data segment), as you can see with the 'size' command:&lt;BR /&gt;&lt;BR /&gt;rwgreen@spdr65:~/quad/rwgreen/forums/70229&amp;gt; ifort -o static -mcmodel=medium -shared-intel static.f90 -O0&lt;BR /&gt;rwgreen@spdr65:~/quad/rwgreen/forums/70229&amp;gt; size static&lt;BR /&gt; text  data  bss  dec  hex filename&lt;BR /&gt; 2655  696 4000000032 4000003383 ee6b3537 static&lt;BR /&gt;&lt;BR /&gt;rwgreen@spdr65:~/quad/rwgreen/forums/70229&amp;gt; ifort -o alloc -mcmodel=medium -shared-intel alloc.f90 -O0&lt;BR /&gt;rwgreen@spdr65:~/quad/rwgreen/forums/70229&amp;gt; size alloc&lt;BR /&gt; text  data  bss  dec  hex filename&lt;BR /&gt; 3587  840  8  4435  1153 alloc&lt;BR /&gt;&lt;BR /&gt;We generally discourage using static data. Many operating systems, including Windows versions, limit the static segment to 2GB. Dynamically allocated data does not have that restriction.&lt;BR /&gt;&lt;BR /&gt;Now, given that your total data allocation is the same in both cases AT -O0, I cannot explain why one version would thrash your system and the other would not. Note that I said "at -O0". Optimization can play tricks, however. For example, in the static case:&lt;BR /&gt;&lt;BR /&gt;program test&lt;BR /&gt;IMPLICIT NONE&lt;BR /&gt;REAL :: stupid(1000,1000,1000)&lt;BR /&gt;stupid = 0.0_4&lt;BR /&gt;print*, "shape is ", SHAPE(stupid)&lt;BR /&gt;end program test &lt;BR /&gt;&lt;BR /&gt;(I had to switch to single precision since my system only has 6GB.) Also note the explicit kind on the constant: 0.0_4 is a REAL constant of KIND=4.&lt;BR /&gt;&lt;BR /&gt;Now, what is interesting is that at -O2 the compiler figures out that you never really use the data in 'stupid', so it doesn't actually allocate it in BSS at all! Observe:&lt;BR /&gt;&lt;BR /&gt;rwgreen@spdr65:~/quad/rwgreen/forums/70229&amp;gt; ifort -o static -mcmodel=medium -shared-intel static.f90 -O2&lt;BR /&gt;rwgreen@spdr65:~/quad/rwgreen/forums/70229&amp;gt; size static&lt;BR /&gt; text  data  bss  dec  hex filename&lt;BR /&gt; 2419  696  8  3123  c33 static&lt;BR /&gt;&lt;BR /&gt;Thus, this version of the code runs instantly because it never initializes the data in stupid. What I can't explain is why the compiler can't apply the same optimization to the allocatable version:&lt;BR /&gt;&lt;BR /&gt;program test&lt;BR /&gt;IMPLICIT NONE&lt;BR /&gt;integer i,j,k&lt;BR /&gt;REAL, allocatable :: stupid(:,:,:)&lt;BR /&gt;ALLOCATE(stupid(1000,1000,1000))&lt;BR /&gt;stupid = 0.0_4&lt;BR /&gt;print*, "shape is ", SHAPE(stupid)&lt;BR /&gt;end program test&lt;BR /&gt;&lt;BR /&gt;If you compile this at -O2 (or leave off the -O option, which gets you the default of -O2), it still does the allocation and initialization and hence runs considerably more slowly. I would guess the compiler sees the ALLOCATE and figures you REALLY REALLY do want to allocate the data, and thus does not optimize it away the way it did in the static case.&lt;BR /&gt;&lt;BR /&gt;The takeaway here: if you actually use the stupid array in the code, the 2 cases should run about equally fast. There will be some difference in runtime behaviour. The static data is 'allocated' at program load: the loader carves out a chunk of BSS for the array at load time. The allocatable case calls 'malloc()' after the process starts to allocate the data on the heap (a different region of the process address space).&lt;BR /&gt;&lt;BR /&gt;Other effects of optimization can also come into play. The compiler may choose to use a vectorized library call, intel_fast_memset(), to do the initialization, OR it may explicitly create a vectorized loop, OR it may decide to use a serial nested loop, the way you would do this the old F77 way.&lt;BR /&gt;&lt;BR /&gt;Since you are a real scientist trying to do work rather than a computer scientist interested in the internals of compiler optimization, let's step back and evaluate your goals: are you trying to determine whether it's better to use static arrays or allocatable arrays? I think that is the question you are trying to answer, yes? I've seen many a good physicist thrash around on simple tests like this that really don't mimic their code, and end up misled by obscure compiler optimizations causing anomalous behavior that the real code would never exhibit.&lt;BR /&gt;&lt;BR /&gt;I assume you have some real code that does real work. Forget this little testcase. There should be no performance disadvantage to using allocatable arrays versus static arrays in a real code. There also will not be any performance advantage. It should be a wash. The advantage of allocatable, again, is that your arrays are not subject to the static-segment size limit. Static, as I mentioned, is limited to 2GB by various Linux distros and by all Windows versions. For that reason alone you should choose dynamically allocated arrays.&lt;BR /&gt;&lt;BR /&gt;ron&lt;BR /&gt;</description>
      <pubDate>Wed, 02 Dec 2009 23:15:43 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/Large-Arrays-using-quot-allocatable-quot-versus-static/m-p/741769#M1154</guid>
      <dc:creator>Ron_Green</dc:creator>
      <dc:date>2009-12-02T23:15:43Z</dc:date>
    </item>
    <item>
      <title>Re: Large Arrays: using "allocatable" versus static declaration</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/Large-Arrays-using-quot-allocatable-quot-versus-static/m-p/741770#M1155</link>
      <description>&lt;DIV style="margin:0px;"&gt;&lt;/DIV&gt;
Wow, very interesting. That explains the fact that my frustrated move of turning everything into static arrays did not help at all. They don't call it the curse of many dimensions for nothing... time to turn this project over to the supercomputer, I think.&lt;BR /&gt;&lt;BR /&gt;One other question, if the answer is readily available:&lt;BR /&gt;&lt;BR /&gt; I frequently come up with questions regarding optimization for which I am probably not educated enough to design good tests, and I'm sure the answers are out there. For example, I'm running a subroutine which repeatedly loads into memory a large array from a text file. This seems not ideal, since Fortran passes everything through pointers, yet it also causes re-usability issues to have a pointer passed to a subroutine when it is not needed externally. In any case, I have usually assumed that loading from a text file is a much slower process than keeping something in memory, but I don't really know. This is really basic, I realize, but also outside the scope of my self-education. Are there any references for this kind of thing? &lt;BR /&gt;&lt;BR /&gt;In any case, thanks very much for the help.&lt;BR /&gt;</description>
      <pubDate>Thu, 03 Dec 2009 03:35:06 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/Large-Arrays-using-quot-allocatable-quot-versus-static/m-p/741770#M1155</guid>
      <dc:creator>etmeyer</dc:creator>
      <dc:date>2009-12-03T03:35:06Z</dc:date>
    </item>
    <item>
      <title>Re: Large Arrays: using "allocatable" versus static declaration</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/Large-Arrays-using-quot-allocatable-quot-versus-static/m-p/741771#M1156</link>
      <description>&lt;DIV style="margin:0px;"&gt;
&lt;DIV id="quote_reply" style="width: 100%; margin-top: 5px;"&gt;
&lt;DIV style="margin-left:2px;margin-right:2px;"&gt;Quoting - &lt;A href="https://community.intel.com/en-us/profile/444060"&gt;etmeyer&lt;/A&gt;&lt;/DIV&gt;
&lt;DIV style="background-color:#E5E5E5; padding:5px;border: 1px; border-style: inset;margin-left:2px;margin-right:2px;"&gt;&lt;EM&gt; Wow, very interesting. That explains the fact that my frustrated move of turning everything into static arrays did not help at all. They don't call it the curse of many dimensions for nothing... time to turn this project over to the supercomputer, I think.&lt;BR /&gt;&lt;BR /&gt;One other question, if the answer is readily available:&lt;BR /&gt;&lt;BR /&gt; I frequently come up with questions regarding optimization for which I am probably not educated enough to design good tests, and I'm sure the answers are out there. For example, I'm running a subroutine which repeatedly loads into memory a large array from a text file. This seems not ideal, since Fortran passes everything through pointers, yet it also causes re-usability issues to have a pointer passed to a subroutine when it is not needed externally. In any case, I have usually assumed that loading from a text file is a much slower process than keeping something in memory, but I don't really know. This is really basic, I realize, but also outside the scope of my self-education. Are there any references for this kind of thing? &lt;BR /&gt;&lt;BR /&gt;In any case, thanks very much for the help.&lt;BR /&gt;&lt;/EM&gt;&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;BR /&gt;Your assumptions are correct: reading text files uses Fortran "formatted" IO. This is considerably slower than reading the data from an "unformatted" file. You select the file format with the FORM= specifier in the OPEN statement:&lt;BR /&gt;OPEN( unit=42,file="whatever",form="formatted" ....) or some similar OPEN&lt;BR /&gt;&lt;BR /&gt;form="unformatted" is typically 1-2 orders of magnitude faster. It reads and writes data in the native binary format of the computer you are on. &lt;BR /&gt;&lt;BR /&gt;So if form="unformatted" is 1-2 orders of magnitude faster, why would anyone use form="formatted"? Well, a text (formatted) file is portable and can be read on any computer. Scientists need to be able to share data files with colleagues, and a text file guarantees that the file can be read no matter what computer someone has. Unfortunately, form="unformatted" is both system dependent and compiler dependent. The order in which the bytes of a 4 or 8 byte real are stored is a property called 'endianness': computers are either little-endian, meaning the least significant byte is stored first, or big-endian, meaning the most significant byte is stored first. PCs are little-endian but older SGI and other systems were big-endian. So form="unformatted" files could not be shared between these two kinds of systems without some conversion mechanism to swap byte orders. To see the extent of the problem, look up the CONVERT= specifier on the OPEN statement supported by Intel Fortran. You'll see we've had to set up conversion methods for a variety of other formats. And to make matters worse, even on a platform like a PC, compiler vendors do not agree on how to set up file and record markers within unformatted files. Thus, an unformatted file created by IFORT may not be able to be read by gfortran, g95 or PGI. &lt;BR /&gt;&lt;BR /&gt;But there is hope. You can continue to use formatted text files and just accept the slowness as a tradeoff for portability and sharing. OR you can use one of at least 2 efforts to define vendor- and platform-neutral data file formats, along with service routines to create and read them. If you are interested, take a look at the HDF and NetCDF projects. These 2 file IO layers are used extensively in the scientific community:&lt;BR /&gt;&lt;BR /&gt;NetCDF: &lt;A href="http://www.unidata.ucar.edu/software/netcdf/"&gt;http://www.unidata.ucar.edu/software/netcdf/&lt;/A&gt;&lt;BR /&gt;&lt;BR /&gt;HDF group: &lt;A href="http://www.hdfgroup.org/HDF5/"&gt;http://www.hdfgroup.org/HDF5/&lt;/A&gt;&lt;BR /&gt;</description>
      <pubDate>Thu, 03 Dec 2009 16:34:35 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/Large-Arrays-using-quot-allocatable-quot-versus-static/m-p/741771#M1156</guid>
      <dc:creator>Ron_Green</dc:creator>
      <dc:date>2009-12-03T16:34:35Z</dc:date>
    </item>
    <item>
      <title>Re: Large Arrays: using "allocatable" versus static declaration</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/Large-Arrays-using-quot-allocatable-quot-versus-static/m-p/741772#M1157</link>
      <description>&lt;DIV style="margin:0px;"&gt;
&lt;DIV id="quote_reply" style="width: 100%; margin-top: 5px;"&gt;
&lt;DIV style="margin-left:2px;margin-right:2px;"&gt;Quoting - &lt;A href="https://community.intel.com/en-us/profile/160574"&gt;Ronald W. Green (Intel)&lt;/A&gt;&lt;/DIV&gt;
&lt;DIV style="background-color:#E5E5E5; padding:5px;border: 1px; border-style: inset;margin-left:2px;margin-right:2px;"&gt;&lt;EM&gt;
&lt;DIV style="margin:0px;"&gt;&lt;/DIV&gt;
And to make matters worse, even on a platform like a PC, compiler vendors do not agree on how to set up file and record markers within unformatted files. Thus, an unformatted file created by IFORT may not be able to be read by gfortran, g95 or PGI. &lt;BR /&gt;&lt;/EM&gt;&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
ifort, PGI, and gfortran seem to have reached reasonable compatibility on Linux x86_64. Don't count on it for obsolete versions, however. gfortran for Windows is still trying to solve the problem of supporting files larger than 2GB, but I wouldn't count on HDF5 working there either.&lt;BR /&gt;</description>
      <pubDate>Thu, 03 Dec 2009 16:46:48 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/Large-Arrays-using-quot-allocatable-quot-versus-static/m-p/741772#M1157</guid>
      <dc:creator>TimP</dc:creator>
      <dc:date>2009-12-03T16:46:48Z</dc:date>
    </item>
    <item>
      <title>Re: Large Arrays: using "allocatable" versus static declaration</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/Large-Arrays-using-quot-allocatable-quot-versus-static/m-p/741773#M1158</link>
      <description>&lt;DIV style="margin:0px;"&gt;&lt;/DIV&gt;
And as for memory versus data files: reads from memory are around 3 orders of magnitude faster than reads from disk.&lt;BR /&gt;&lt;BR /&gt;So why not hard-code all the data in the program using DATA statements or other explicit initialization?&lt;BR /&gt;&lt;BR /&gt;For data that does not change over time, this is certainly something to consider. For example, phase tables of water (steam tables) will never change (hopefully). They are a good candidate for a table compiled into the program, with no reading from disk.&lt;BR /&gt;&lt;BR /&gt;At the other extreme is data that defines a particular problem and changes frequently to define new problems. You don't want to have to edit source files, recompile, etc. for each problem you run. This data should be read in from disk.&lt;BR /&gt;&lt;BR /&gt;Some data sets fall in between these two: materials tables for steel, composites, ceramics. These are updated regularly as manufacturers add new materials and older materials are removed from the market. I've seen these hardcoded and also read in from material property data files. If the code models a particular widget and that widget's material sets are fixed and unchanging, you could hardcode them. If the widgets get redesigned once a year and new materials come and go, use a data file.&lt;BR /&gt;</description>
      <pubDate>Thu, 03 Dec 2009 16:52:47 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/Large-Arrays-using-quot-allocatable-quot-versus-static/m-p/741773#M1158</guid>
      <dc:creator>Ron_Green</dc:creator>
      <dc:date>2009-12-03T16:52:47Z</dc:date>
    </item>
  </channel>
</rss>

