<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Also, If you have memory in Intel® Fortran Compiler</title>
    <link>https://community.intel.com/t5/Intel-Fortran-Compiler/Slow-deallocation-in-derived-type-data/m-p/1180620#M148609</link>
    <description>&lt;P&gt;Also, if you have memory deallocation debugging enabled (e.g. valgrind, or the Windows debug&amp;nbsp;C Runtime Library), this may affect the speed of free (deallocate).&lt;/P&gt;

&lt;P&gt;Jim Dempsey&lt;/P&gt;</description>
    <pubDate>Tue, 18 Sep 2018 14:50:36 GMT</pubDate>
    <dc:creator>jimdempseyatthecove</dc:creator>
    <dc:date>2018-09-18T14:50:36Z</dc:date>
    <item>
      <title>Slow deallocation in derived type data</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/Slow-deallocation-in-derived-type-data/m-p/1180615#M148604</link>
      <description>&lt;P&gt;Hello,&lt;/P&gt;

&lt;P&gt;I am seeing very slow deallocation of a derived data type, and I have not been able to fix it.&lt;/P&gt;

&lt;P&gt;I'm using IVFC 16.0.3.207 for Windows, with default release compiler options.&lt;/P&gt;

&lt;P&gt;Below is an example that reproduces my problem:&lt;/P&gt;

&lt;DIV&gt;program test_dealloc&lt;/DIV&gt;

&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;

&lt;DIV&gt;implicit none&lt;/DIV&gt;

&lt;DIV&gt;integer(kind=4) :: n1 = 5827396&lt;/DIV&gt;

&lt;DIV&gt;integer(kind=4) :: n2 = 75&lt;/DIV&gt;

&lt;DIV&gt;integer(kind=4) :: i&lt;/DIV&gt;

&lt;DIV&gt;type ddata&lt;/DIV&gt;

&lt;DIV&gt;complex(kind=4), dimension(:,:), allocatable :: elem&lt;/DIV&gt;

&lt;DIV&gt;end type ddata&lt;/DIV&gt;

&lt;DIV&gt;type(ddata), pointer, dimension(:)&amp;nbsp; &amp;nbsp;:: ddata_p&lt;/DIV&gt;

&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;

&lt;DIV&gt;real(4) time1,time0&lt;/DIV&gt;

&lt;DIV&gt;! start&lt;/DIV&gt;

&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;

&lt;DIV&gt;time0=secnds(0.0)&lt;/DIV&gt;

&lt;DIV&gt;allocate(ddata_p(n1))&lt;/DIV&gt;

&lt;DIV&gt;do i=1,n1&lt;/DIV&gt;

&lt;DIV&gt;allocate(ddata_p(i)%elem(n2,n2))&lt;/DIV&gt;

&lt;DIV&gt;enddo&lt;/DIV&gt;

&lt;DIV&gt;time1=secnds(time0)&lt;/DIV&gt;

&lt;DIV&gt;print*, ' '&lt;/DIV&gt;

&lt;DIV&gt;print*, 'Allocation time', time1&lt;/DIV&gt;

&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;

&lt;DIV&gt;! computation kernel&lt;/DIV&gt;

&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;

&lt;DIV&gt;time0=secnds(0.0)&lt;/DIV&gt;

&lt;DIV&gt;do i=1,n1&lt;/DIV&gt;

&lt;DIV&gt;deallocate(ddata_p(i)%elem)&lt;/DIV&gt;

&lt;DIV&gt;enddo&lt;/DIV&gt;

&lt;DIV&gt;deallocate (ddata_p)&lt;/DIV&gt;

&lt;DIV&gt;time1=secnds(time0)&lt;/DIV&gt;

&lt;DIV&gt;print*, ' '&lt;/DIV&gt;

&lt;DIV&gt;print*, 'Deallocation time', time1&lt;/DIV&gt;

&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;

&lt;DIV&gt;end&lt;/DIV&gt;

&lt;P&gt;&amp;nbsp;&lt;/P&gt;

&lt;P&gt;I used the secnds function for the timing, and the output on screen is:&lt;/P&gt;

&lt;P&gt;&amp;gt;test_dealloc.exe&lt;/P&gt;

&lt;P&gt;&amp;nbsp;Allocation time&amp;nbsp; &amp;nbsp;15.62891&lt;/P&gt;

&lt;P&gt;&amp;nbsp;Deallocation time&amp;nbsp; &amp;nbsp;4075.336&lt;/P&gt;

&lt;P&gt;&lt;SPAN style="font-size: 1em;"&gt;The machine is a Xeon E5-2690 x64 with 192 GB of RAM; the OS is Windows Server 2008 R2 Enterprise.&lt;/SPAN&gt;&lt;/P&gt;

&lt;P&gt;From the task manager the allocated memory is around 44GB. The deallocation time is very large: much longer than the allocation time, and also much longer than the computation kernel where the data structure is filled.&lt;/P&gt;

&lt;P&gt;Is there any way to reduce this deallocation time?&lt;/P&gt;

&lt;P&gt;Thank you very much. Regards&lt;/P&gt;

&lt;P&gt;Paolo&lt;/P&gt;

&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Mon, 03 Sep 2018 16:18:06 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/Slow-deallocation-in-derived-type-data/m-p/1180615#M148604</guid>
      <dc:creator>De_Vita__Paolo</dc:creator>
      <dc:date>2018-09-03T16:18:06Z</dc:date>
    </item>
    <item>
      <title>This issue sounds serious and</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/Slow-deallocation-in-derived-type-data/m-p/1180616#M148605</link>
      <description>&lt;P&gt;This issue sounds serious and deserves consideration by Intel.&lt;/P&gt;</description>
      <pubDate>Tue, 11 Sep 2018 18:32:53 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/Slow-deallocation-in-derived-type-data/m-p/1180616#M148605</guid>
      <dc:creator>Andrew_Smith</dc:creator>
      <dc:date>2018-09-11T18:32:53Z</dc:date>
    </item>
    <item>
      <title>"From the task manager the</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/Slow-deallocation-in-derived-type-data/m-p/1180617#M148606</link>
      <description>&lt;P&gt;"From the task manager the allocated memory is around 44GB"&lt;/P&gt;

&lt;P&gt;I calculated the size for n1 = 5827396 and n2 = 75 and get an estimate of 244 GB. I suggest you try n1 = 5827396/100, apply the following changes, and run with the task manager open.&lt;/P&gt;

&lt;PRE class="brush:fortran; class-name:dark;"&gt;&amp;nbsp; integer(kind=8) :: siza, sizi, sizet
&amp;nbsp; real*4 gb&amp;nbsp; 
&amp;nbsp; ! start
&amp;nbsp; 
&amp;nbsp; time0 = secnds(0.0)
&amp;nbsp; 
&amp;nbsp; allocate (ddata_p(n1))
&amp;nbsp; write (*,*) 'allocate ddata_p size =', sizeof (ddata_p)
&amp;nbsp; sizet = 0
&amp;nbsp; 
&amp;nbsp; do i=1,n1
&amp;nbsp; 
&amp;nbsp;&amp;nbsp;&amp;nbsp; allocate (ddata_p(i)%elem(n2,n2))
&amp;nbsp;&amp;nbsp;&amp;nbsp; sizet = sizet + sizeof ( ddata_p(i)%elem )
&amp;nbsp; 
&amp;nbsp; enddo
&amp;nbsp; 
&amp;nbsp; time1 = secnds(time0)
&amp;nbsp; gb = sizet / 1024.**3
&amp;nbsp; write (*,*) 'fill ddata_p size =', sizet, gb&amp;nbsp; ! c_sizeof (ddata_p)
&amp;nbsp; write (*,*) 'allocate ddata_p size =', sizeof (ddata_p)
&amp;nbsp; read (*,*) gb
&amp;nbsp;&lt;/PRE&gt;
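
&lt;P&gt;As a rough check of the 244 GB figure, the sketch below multiplies out the array data alone. It assumes 8 bytes per complex(kind=4) element and ignores array descriptors and heap overhead; the program name and the bytes_per_elem constant are only illustrative.&lt;/P&gt;

&lt;PRE class="brush:fortran; class-name:dark;"&gt;program size_estimate
  use iso_fortran_env, only: int64
  implicit none
  integer(int64), parameter :: n1 = 5827396_int64
  integer(int64), parameter :: n2 = 75_int64
  ! assumption: complex(kind=4) is two 4-byte reals, i.e. 8 bytes per element
  integer(int64), parameter :: bytes_per_elem = 8_int64
  integer(int64) :: total_bytes

  ! n1 matrices of n2 x n2 elements each
  total_bytes = n1 * n2 * n2 * bytes_per_elem
  print *, 'total bytes     =', total_bytes
  print *, 'GB (1024**3 B)  =', real(total_bytes) / 1024.0**3
end program size_estimate&lt;/PRE&gt;

&lt;P&gt;That is 262,232,820,000 bytes, or about 244 when divided by 1024**3.&lt;/P&gt;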

&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 18 Sep 2018 08:44:07 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/Slow-deallocation-in-derived-type-data/m-p/1180617#M148606</guid>
      <dc:creator>John_Campbell</dc:creator>
      <dc:date>2018-09-18T08:44:07Z</dc:date>
    </item>
    <item>
      <title>Thank you John,</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/Slow-deallocation-in-derived-type-data/m-p/1180618#M148607</link>
      <description>&lt;P&gt;Thank you John,&lt;/P&gt;

&lt;P&gt;You are right, the real size of the data structure is 244 GB. In fact, if I initialize the variable, the task manager shows the same allocated memory that you estimated.&lt;/P&gt;

&lt;P&gt;I ran this test on a bigger machine (Intel Xeon Gold 6144 with 684 GB RAM, OS Windows Server 2016 Datacenter), adding the following initialization:&lt;/P&gt;

&lt;PRE class="brush:fortran; class-name:dark;"&gt;...

time0=secnds(0.0)


do i=1,n1

ddata_p(i)%elem = 0

enddo

time1=secnds(time0)

print*, ' '

print*, 'Setting time', time1

...
&lt;/PRE&gt;

&lt;P&gt;&amp;nbsp;&lt;/P&gt;

&lt;P&gt;And I obtained these new values:&lt;/P&gt;

&lt;PRE class="brush:bash; class-name:dark;"&gt;test_dealloc_setting.exe


&amp;nbsp;Allocation time&amp;nbsp;&amp;nbsp; 14.67578

&amp;nbsp;Setting time&amp;nbsp;&amp;nbsp; 79.22656


&amp;nbsp;Deallocation time&amp;nbsp;&amp;nbsp; 3202.836
&lt;/PRE&gt;

&lt;P&gt;&amp;nbsp;&lt;/P&gt;

&lt;P&gt;The deallocation time is still very large.&lt;/P&gt;

&lt;P&gt;Paolo&lt;/P&gt;

&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 18 Sep 2018 12:30:39 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/Slow-deallocation-in-derived-type-data/m-p/1180618#M148607</guid>
      <dc:creator>De_Vita__Paolo</dc:creator>
      <dc:date>2018-09-18T12:30:39Z</dc:date>
    </item>
    <item>
      <title>This looks like an issue with</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/Slow-deallocation-in-derived-type-data/m-p/1180619#M148608</link>
      <description>&lt;P&gt;This looks like an issue with the C++ heap manager and/or virtual memory paging&amp;nbsp;due to fragmentation issues. Try deallocating in reverse order of allocation:&lt;/P&gt;

&lt;PRE class="brush:fortran; class-name:dark;"&gt;time0=secnds(0.0)
do i=n1,1,-1
&amp;nbsp; deallocate(ddata_p(i)%elem)
enddo
deallocate (ddata_p)
time1=secnds(time0)
print*, ' '
print*, 'Deallocation time', time1&lt;/PRE&gt;

&lt;P&gt;Jim Dempsey&lt;/P&gt;</description>
      <pubDate>Tue, 18 Sep 2018 14:44:47 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/Slow-deallocation-in-derived-type-data/m-p/1180619#M148608</guid>
      <dc:creator>jimdempseyatthecove</dc:creator>
      <dc:date>2018-09-18T14:44:47Z</dc:date>
    </item>
    <item>
      <title>Also, If you have memory</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/Slow-deallocation-in-derived-type-data/m-p/1180620#M148609</link>
      <description>&lt;P&gt;Also, if you have memory deallocation debugging enabled (e.g. valgrind, or the Windows debug&amp;nbsp;C Runtime Library), this may affect the speed of free (deallocate).&lt;/P&gt;

&lt;P&gt;Jim Dempsey&lt;/P&gt;</description>
      <pubDate>Tue, 18 Sep 2018 14:50:36 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/Slow-deallocation-in-derived-type-data/m-p/1180620#M148609</guid>
      <dc:creator>jimdempseyatthecove</dc:creator>
      <dc:date>2018-09-18T14:50:36Z</dc:date>
    </item>
    <item>
      <title>Hi Jim,</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/Slow-deallocation-in-derived-type-data/m-p/1180621#M148610</link>
      <description>&lt;P&gt;Hi Jim,&lt;/P&gt;

&lt;P&gt;I tried to deallocate in reverse order but nothing changed.&lt;/P&gt;

&lt;P&gt;Regarding the compiler options, I've left the defaults for the release configuration, which are (from the compiler's command-line window):&lt;/P&gt;

&lt;P&gt;/nologo /O2 /module:"x64\Release\\" /object:"x64\Release\\" /Fd"x64\Release\vc100.pdb" /libs:dll /threads /c&lt;/P&gt;

&lt;P&gt;Paolo&lt;/P&gt;</description>
      <pubDate>Tue, 18 Sep 2018 16:01:41 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/Slow-deallocation-in-derived-type-data/m-p/1180621#M148610</guid>
      <dc:creator>De_Vita__Paolo</dc:creator>
      <dc:date>2018-09-18T16:01:41Z</dc:date>
    </item>
    <item>
      <title>One possible solution is: </title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/Slow-deallocation-in-derived-type-data/m-p/1180622#M148611</link>
      <description>&lt;P&gt;One possible solution is:&amp;nbsp;&lt;/P&gt;

&lt;P&gt;1) Change elem into a pointer instead of an allocatable array.&lt;BR /&gt;
	&lt;SPAN style="font-size: 1em;"&gt;2) Allocate a single block of n1*n2*n2 elements.&lt;/SPAN&gt;&lt;BR /&gt;
	&lt;SPAN style="font-size: 1em;"&gt;3) Point ddata_p(i)%elem at the initial element of each n2*n2 matrix.&lt;/SPAN&gt;&lt;/P&gt;

&lt;P&gt;In this case you will have a single deallocation.&lt;/P&gt;
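
&lt;P&gt;A minimal sketch of these three steps might look like the following. It assumes a fixed n2, a compiler that supports Fortran 2003 pointer bounds remapping, and a reduced n1 so that it runs on a small machine; the names pool, first and nelem are hypothetical and not part of the original program.&lt;/P&gt;

&lt;PRE class="brush:fortran; class-name:dark;"&gt;program single_block_sketch
  implicit none
  type ddata
    ! step 1: elem is a pointer rather than an allocatable array
    complex(kind=4), dimension(:,:), pointer :: elem =&amp;gt; null()
  end type ddata
  integer, parameter :: n1 = 1000, n2 = 75   ! n1 reduced for illustration
  complex(kind=4), dimension(:), allocatable, target :: pool
  type(ddata), dimension(:), allocatable :: ddata_p
  integer(kind=8) :: first, nelem
  integer :: i

  nelem = int(n2, 8) * n2          ! elements per n2 x n2 matrix

  ! step 2: one allocation holds the storage for all matrices
  allocate (pool(n1 * nelem))
  allocate (ddata_p(n1))

  ! step 3: map each elem onto its own contiguous slice of the pool
  do i = 1, n1
    first = (i - 1) * nelem + 1
    ddata_p(i)%elem(1:n2, 1:n2) =&amp;gt; pool(first : first + nelem - 1)
  enddo

  ddata_p(n1)%elem = (0.0, 0.0)    ! use it like any other n2 x n2 array
  print *, 'shape of last matrix =', shape(ddata_p(n1)%elem)

  ! the pointers are only views, so they are not deallocated one by one
  deallocate (ddata_p)
  deallocate (pool)                ! releases all matrix storage at once
end program single_block_sketch&lt;/PRE&gt;

&lt;P&gt;The trade-off is that the individual matrices can no longer be resized or freed independently, since they all live in the same block.&lt;/P&gt;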

&lt;P&gt;Regards&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 18 Sep 2018 16:14:47 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/Slow-deallocation-in-derived-type-data/m-p/1180622#M148611</guid>
      <dc:creator>LRaim</dc:creator>
      <dc:date>2018-09-18T16:14:47Z</dc:date>
    </item>
    <item>
      <title>Hi Luigi,</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/Slow-deallocation-in-derived-type-data/m-p/1180623#M148612</link>
      <description>&lt;P&gt;Hi Luigi,&lt;/P&gt;

&lt;P&gt;Thank you for the suggestions. About the options:&lt;/P&gt;

&lt;P&gt;1) I tried with a pointer, but there was no real change.&lt;/P&gt;

&lt;P&gt;2) That would really be the worst solution for my case, because in the real code n2 depends on n1 and I don't know the final dimension of the data in advance, so the derived structure would be the best option for my application.&lt;/P&gt;

&lt;P&gt;3) I didn't understand this one. Could you explain it in more detail, or give an example?&lt;/P&gt;

&lt;P&gt;Regards&lt;/P&gt;

&lt;P&gt;Paolo&lt;/P&gt;

&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 19 Sep 2018 07:21:59 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/Slow-deallocation-in-derived-type-data/m-p/1180623#M148612</guid>
      <dc:creator>De_Vita__Paolo</dc:creator>
      <dc:date>2018-09-19T07:21:59Z</dc:date>
    </item>
    <item>
      <title>Sorry for the hurry in</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/Slow-deallocation-in-derived-type-data/m-p/1180624#M148613</link>
      <description>&lt;P&gt;Sorry for the hasty answer.&amp;nbsp;&lt;BR /&gt;
	&lt;SPAN style="font-size: 1em;"&gt;The three points I suggested must be implemented together to form a possible solution.&lt;BR /&gt;
	About your point 2): the possible solution can be applied if n2(i) for each i can be computed in a previous do-loop.&lt;/SPAN&gt;&lt;BR /&gt;
	&lt;SPAN style="font-size: 1em;"&gt;About your point 3): I do not have time to set up an example. This step is similar to setting N pointers, one to each column of an NxN matrix (after the matrix has been allocated).&lt;/SPAN&gt;&lt;/P&gt;

&lt;P&gt;&lt;SPAN style="font-size: 1em;"&gt;Regards&lt;/SPAN&gt;&lt;/P&gt;

&lt;P&gt;&amp;nbsp;&lt;/P&gt;

&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 19 Sep 2018 09:25:22 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/Slow-deallocation-in-derived-type-data/m-p/1180624#M148613</guid>
      <dc:creator>LRaim</dc:creator>
      <dc:date>2018-09-19T09:25:22Z</dc:date>
    </item>
    <item>
      <title>Do you have VTune available</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/Slow-deallocation-in-derived-type-data/m-p/1180625#M148614</link>
      <description>&lt;P&gt;Do you have VTune available to perform a performance test?&lt;/P&gt;

&lt;P&gt;About the only thing left to check: there used to be an Intel floating license check issue that caused long delays in a program, though I do not recall it being related to deallocation.&lt;/P&gt;

&lt;P&gt;Jim Dempsey&lt;/P&gt;</description>
      <pubDate>Wed, 19 Sep 2018 11:49:47 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/Slow-deallocation-in-derived-type-data/m-p/1180625#M148614</guid>
      <dc:creator>jimdempseyatthecove</dc:creator>
      <dc:date>2018-09-19T11:49:47Z</dc:date>
    </item>
    <item>
      <title>This is a very serious 300</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/Slow-deallocation-in-derived-type-data/m-p/1180626#M148615</link>
      <description>&lt;P&gt;This is a very serious 300-fold drop in performance. Why has it not been acknowledged by Intel as an issue?&lt;/P&gt;</description>
      <pubDate>Wed, 19 Sep 2018 11:58:01 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/Slow-deallocation-in-derived-type-data/m-p/1180626#M148615</guid>
      <dc:creator>Andrew_Smith</dc:creator>
      <dc:date>2018-09-19T11:58:01Z</dc:date>
    </item>
    <item>
      <title>I took the program in #1 and</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/Slow-deallocation-in-derived-type-data/m-p/1180627#M148616</link>
      <description>&lt;P&gt;I took the program in #1 and built it. I reduced n1 by a factor of 100 because the machine had nowhere near the memory you asked for.&lt;/P&gt;

&lt;P&gt;The allocation took 0.21s and the dealloc 0.08s. Our machines will have different speeds, but broadly speaking the alloc time is in proportion with your bigger n1 value; the dealloc time, however, is not, and is orders of magnitude out!&lt;/P&gt;

&lt;P&gt;I would suggest running your test at several increasing values of n1. I guess we will see some threshold value at which there is a step change in the dealloc time. This might show something interesting.&lt;/P&gt;

&lt;P&gt;The limit (if that is what we see) might be a function of your Windows setup and/or hardware, or it may be some limit within the compiler.&lt;/P&gt;
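
&lt;P&gt;One way to run such a sweep is sketched below. It is only an illustration: the starting value of n1, the number of doublings and the use of system_clock instead of secnds are my assumptions, and the sizes would need scaling up towards the real n1 to find any threshold.&lt;/P&gt;

&lt;PRE class="brush:fortran; class-name:dark;"&gt;program dealloc_sweep
  implicit none
  type ddata
    complex(kind=4), dimension(:,:), allocatable :: elem
  end type ddata
  type(ddata), dimension(:), allocatable :: ddata_p
  integer, parameter :: n2 = 75
  integer :: n1, i, step
  integer(kind=8) :: t0, t1, rate
  real(kind=8) :: t_alloc, t_dealloc

  call system_clock(count_rate=rate)
  n1 = 1000                        ! hypothetical starting size
  do step = 1, 6                   ! n1 doubles on each pass
    ! time allocation plus a first touch of every matrix
    call system_clock(t0)
    allocate (ddata_p(n1))
    do i = 1, n1
      allocate (ddata_p(i)%elem(n2,n2))
      ddata_p(i)%elem = 0
    enddo
    call system_clock(t1)
    t_alloc = real(t1 - t0, 8) / real(rate, 8)

    ! time deallocation in the same order as the original test
    call system_clock(t0)
    do i = 1, n1
      deallocate (ddata_p(i)%elem)
    enddo
    deallocate (ddata_p)
    call system_clock(t1)
    t_dealloc = real(t1 - t0, 8) / real(rate, 8)

    print *, 'n1 =', n1, ' alloc (s) =', t_alloc, ' dealloc (s) =', t_dealloc
    n1 = n1 * 2
  enddo
end program dealloc_sweep&lt;/PRE&gt;

&lt;P&gt;If the dealloc column stops growing roughly in step with n1 at some point, that n1 is the value worth reporting.&lt;/P&gt;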

&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 19 Sep 2018 15:13:44 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/Slow-deallocation-in-derived-type-data/m-p/1180627#M148616</guid>
      <dc:creator>andrew_4619</dc:creator>
      <dc:date>2018-09-19T15:13:44Z</dc:date>
    </item>
    <item>
      <title>My initial thoughts we the</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/Slow-deallocation-in-derived-type-data/m-p/1180628#M148617</link>
      <description>&lt;P&gt;My initial thought was that the large delay was due to a virtual memory event being initiated during deallocation. Physical memory is not used at allocation, but only when the array is being used, perhaps at deallocation. I notice the initialisation phase has now been added to the tester, which appears to indicate the deallocation delay is not associated with virtual memory usage. Given the memory sizes being tested, I am also not able to reproduce this problem. The luxury of having this problem!&lt;/P&gt;</description>
      <pubDate>Thu, 20 Sep 2018 01:14:53 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/Slow-deallocation-in-derived-type-data/m-p/1180628#M148617</guid>
      <dc:creator>John_Campbell</dc:creator>
      <dc:date>2018-09-20T01:14:53Z</dc:date>
    </item>
  </channel>
</rss>

