<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic The following should also in Intel® Fortran Compiler</title>
    <link>https://community.intel.com/t5/Intel-Fortran-Compiler/Stack-overflow-on-array-copy/m-p/1060832#M117499</link>
    <description>&lt;P&gt;The following should also work:&lt;/P&gt;

&lt;PRE class="brush:fortran;"&gt;      Assoc_Arr: ASSOCIATE ( tmp1 =&amp;gt; array(i0:j0), tmp2 =&amp;gt; array(i1:j1) )
         tmp1 = tmp2
      END ASSOCIATE Assoc_Arr
&lt;/PRE&gt;

</description>
    <pubDate>Fri, 30 Jan 2015 00:21:28 GMT</pubDate>
    <dc:creator>FortranFan</dc:creator>
    <dc:date>2015-01-30T00:21:28Z</dc:date>
    <item>
      <title>Stack overflow on array copy</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/Stack-overflow-on-array-copy/m-p/1060828#M117495</link>
      <description>&lt;P&gt;Hi&lt;/P&gt;

&lt;P&gt;I'm using "Intel(R) Visual Fortran Compiler XE for applications running on IA-32, Version 15.0.0.108 Build 20140726".&lt;/P&gt;

&lt;P&gt;My program crashes with a stack overflow. On standard error there's a stack trace pointing to a line of code where part of an array is copied:&lt;/P&gt;

&lt;PRE class="brush:fortran;"&gt;          arr(i0: j0)= arr(i1: j1)&lt;/PRE&gt;

&lt;P&gt;I read somewhere that the compiler has to create a copy of the copied portion because it cannot determine whether the source and target memory overlap (&lt;A href="https://software.intel.com/en-us/node/524873" target="_blank"&gt;https://software.intel.com/en-us/node/524873&lt;/A&gt;).&lt;/P&gt;
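&lt;P&gt;To illustrate why a temporary can be needed, here is a minimal sketch (the program and variable names are made up for illustration). Fortran array assignment behaves as if the whole right-hand side were evaluated before anything is stored, which a naive element-by-element loop over overlapping sections does not reproduce:&lt;/P&gt;

&lt;PRE class="brush:fortran;"&gt;program temp_demo
   implicit none
   integer :: a(5), b(5), i

   a = [1, 2, 3, 4, 5]
   b = a

   ! Array assignment: the right-hand side is evaluated as if in full
   ! before any element of the left-hand side is stored.
   a(2:5) = a(1:4)
   print '(*(i0,1x))', a    ! 1 1 2 3 4

   ! Naive forward loop without a temporary: each store feeds the next load.
   do i = 2, 5
      b(i) = b(i-1)
   end do
   print '(*(i0,1x))', b    ! 1 1 1 1 1
end program temp_demo
&lt;/PRE&gt;

&lt;P&gt;When the compiler cannot prove at compile time that the two sections are disjoint, copying the right-hand side into a temporary first is the safe way to get the first result.&lt;/P&gt;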

&lt;P&gt;Actually, I &lt;STRONG&gt;do know&lt;/STRONG&gt; that they do not overlap. Is there a way to give this "promise" to the compiler, to force it to generate code that does not make a copy?&lt;/P&gt;

&lt;P&gt;Benedikt&lt;/P&gt;</description>
      <pubDate>Thu, 29 Jan 2015 14:10:47 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/Stack-overflow-on-array-copy/m-p/1060828#M117495</guid>
      <dc:creator>Benedikt_R_</dc:creator>
      <dc:date>2015-01-29T14:10:47Z</dc:date>
    </item>
    <item>
      <title>I am not aware of ways to</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/Stack-overflow-on-array-copy/m-p/1060829#M117496</link>
      <description>&lt;P&gt;I am not aware of ways to tell the compiler to avoid the copy in this case. I suggest compiling with /heap-arrays (Fortran &amp;gt; Optimization &amp;gt; Heap Arrays &amp;gt; 0), so that the temporary is allocated on the heap rather than on the stack.&lt;/P&gt;</description>
      <pubDate>Thu, 29 Jan 2015 15:11:19 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/Stack-overflow-on-array-copy/m-p/1060829#M117496</guid>
      <dc:creator>Steven_L_Intel1</dc:creator>
      <dc:date>2015-01-29T15:11:19Z</dc:date>
    </item>
    <item>
      <title>DO CONCURRENT (I = i0:j0)</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/Stack-overflow-on-array-copy/m-p/1060830#M117497</link>
      <description>&lt;PRE class="brush:fortran;"&gt;DO CONCURRENT (I = i0:j0)
   arr(I) = arr(i1-i0+I)
END DO&lt;/PRE&gt;</description>
      <pubDate>Thu, 29 Jan 2015 15:59:00 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/Stack-overflow-on-array-copy/m-p/1060830#M117497</guid>
      <dc:creator>JVanB</dc:creator>
      <dc:date>2015-01-29T15:59:00Z</dc:date>
    </item>
    <item>
      <title>The DO CONCURRENT helps the</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/Stack-overflow-on-array-copy/m-p/1060831#M117498</link>
      <description>&lt;P&gt;The DO CONCURRENT helps the compiler decide that the loop is safe to vectorize.&lt;/P&gt;</description>
      <pubDate>Thu, 29 Jan 2015 20:45:53 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/Stack-overflow-on-array-copy/m-p/1060831#M117498</guid>
      <dc:creator>Steven_L_Intel1</dc:creator>
      <dc:date>2015-01-29T20:45:53Z</dc:date>
    </item>
    <item>
      <title>The following should also</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/Stack-overflow-on-array-copy/m-p/1060832#M117499</link>
      <description>&lt;P&gt;The following should also work:&lt;/P&gt;

&lt;PRE class="brush:fortran;"&gt;      Assoc_Arr: ASSOCIATE ( tmp1 =&amp;gt; array(i0:j0), tmp2 =&amp;gt; array(i1:j1) )
         tmp1 = tmp2
      END ASSOCIATE Assoc_Arr
&lt;/PRE&gt;

</description>
      <pubDate>Fri, 30 Jan 2015 00:21:28 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/Stack-overflow-on-array-copy/m-p/1060832#M117499</guid>
      <dc:creator>FortranFan</dc:creator>
      <dc:date>2015-01-30T00:21:28Z</dc:date>
    </item>
    <item>
      <title>Hmmm... I didn't think that</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/Stack-overflow-on-array-copy/m-p/1060833#M117500</link>
      <description>&lt;P&gt;Hmmm... I didn't think that ASSOCIATE had rules about aliasing like procedures do. Let's try an experiment:&lt;/P&gt;

&lt;PRE class="brush:fortran;"&gt;program P
   implicit none
   integer arr(10)
   integer i
   integer, parameter :: arr0(size(arr)) = [(i,i=1,size(arr))]
   integer i0,j0,i1,j1

   ! Each method is run twice on overlapping sections:
   !   set1: target arr(3:5) lies below source arr(4:6)
   !   set2: target arr(5:7) lies above source arr(4:6)
   write(*,'(a)') 'Array assignment'
   arr = arr0
   call set1(i0,j0,i1,j1)
   arr(i0:j0) = arr(i1:j1)
   write(*,5) arr

   arr = arr0
   call set2(i0,j0,i1,j1)
   arr(i0:j0) = arr(i1:j1)
   write(*,5) arr

   write(*,'(a)') 'Associate'
   arr = arr0
   call set1(i0,j0,i1,j1)
   associate(T0 =&amp;gt; arr(i0:j0), T1 =&amp;gt; arr(i1:j1))
      T0 = T1
   end associate
   write(*,5) arr

   arr = arr0
   call set2(i0,j0,i1,j1)
   associate(T0 =&amp;gt; arr(i0:j0), T1 =&amp;gt; arr(i1:j1))
      T0 = T1
   end associate
   write(*,5) arr

   write(*,'(a)') 'Subroutine'
   arr = arr0
   call set1(i0,j0,i1,j1)
   call copy(arr(i0:j0),arr(i1:j1),j0-i0+1)
   write(*,5) arr

   arr = arr0
   call set2(i0,j0,i1,j1)
   call copy(arr(i0:j0),arr(i1:j1),j0-i0+1)
   write(*,5) arr

   write(*,'(a)') 'Forall'
   arr = arr0
   call set1(i0,j0,i1,j1)
   forall(i=i0:j0) arr(i) = arr(i1-i0+i)
   write(*,5) arr

   arr = arr0
   call set2(i0,j0,i1,j1)
   forall(i=i0:j0) arr(i) = arr(i1-i0+i)
   write(*,5) arr

   write(*,'(a)') 'Do concurrent'
   arr = arr0
   call set1(i0,j0,i1,j1)
   do concurrent(i=i0:j0)
      arr(i) = arr(i1-i0+i)
   end do
   write(*,5) arr

   arr = arr0
   call set2(i0,j0,i1,j1)
   do concurrent(i=i0:j0)
      arr(i) = arr(i1-i0+i)
   end do
   write(*,5) arr

5 format(*(i0:1x))
   contains
      ! The dummy arguments are deliberately aliased here; this is
      ! non-conforming when the actual sections overlap, which is
      ! exactly what the experiment probes.
      subroutine copy(arr0,arr1,n)
         integer n
         integer arr0(n),arr1(n)
         arr0 = arr1
      end subroutine copy
end program P

! Bounds for the case where the target lies below the source.
subroutine set1(i0,j0,i1,j1)
   implicit none
   integer, intent(out) :: i0,j0,i1,j1
   i0 = 3
   j0 = 5
   i1 = 4
   j1 = 6
end subroutine set1

! Bounds for the case where the target lies above the source.
subroutine set2(i0,j0,i1,j1)
   implicit none
   integer, intent(out) :: i0,j0,i1,j1
   i0 = 5
   j0 = 7
   i1 = 4
   j1 = 6
end subroutine set2
&lt;/PRE&gt;

&lt;P&gt;Output with ifort:&lt;/P&gt;

&lt;PRE class="brush:plain;"&gt;Array assignment
1 2 4 5 6 6 7 8 9 10
1 2 3 4 4 5 6 8 9 10
Associate
1 2 4 5 6 6 7 8 9 10
1 2 3 4 4 4 4 8 9 10
Subroutine
1 2 4 5 6 6 7 8 9 10
1 2 3 4 4 4 4 8 9 10
Forall
1 2 4 5 6 6 7 8 9 10
1 2 3 4 4 5 6 8 9 10
Do concurrent
1 2 4 5 6 6 7 8 9 10
1 2 3 4 4 4 4 8 9 10&lt;/PRE&gt;

&lt;P&gt;So ASSOCIATE gives the same result as the subroutine and DO CONCURRENT, both of which do have aliasing rules. I get similar results with gfortran. Where does the standard talk about aliasing rules for the ASSOCIATE construct?&lt;/P&gt;
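&lt;P&gt;Reading the two rows printed for each method: in the first case the target section lies below the source, so a plain forward element-by-element copy happens to match array-assignment semantics. In the second case the target lies above the source, and a forward copy smears the value 4 across the target. As a minimal sketch of how an in-place copy could stay correct without a temporary even when the sections overlap, the loop direction can be chosen from the sign of the offset. The helper name shift_copy is hypothetical and not something used elsewhere in this thread:&lt;/P&gt;

&lt;PRE class="brush:fortran;"&gt;program overlap_demo
   implicit none
   integer :: arr(10), i

   ! Target below source (the set1 bounds): forward order is safe.
   arr = [(i, i = 1, 10)]
   call shift_copy(arr, 3, 5, 4)
   print '(*(i0,1x))', arr      ! 1 2 4 5 6 6 7 8 9 10

   ! Target above source (the set2 bounds): backward order is needed.
   arr = [(i, i = 1, 10)]
   call shift_copy(arr, 5, 7, 4)
   print '(*(i0,1x))', arr      ! 1 2 3 4 4 5 6 8 9 10

contains

   ! Hypothetical helper: copies a(i1:i1+(j0-i0)) onto a(i0:j0) in place,
   ! picking the loop direction so that no source element is overwritten
   ! before it has been read.
   subroutine shift_copy(a, i0, j0, i1)
      integer, intent(inout) :: a(:)
      integer, intent(in) :: i0, j0, i1
      integer :: i, off
      off = i1 - i0
      if (off &amp;gt;= 0) then
         do i = i0, j0
            a(i) = a(i + off)
         end do
      else
         do i = j0, i0, -1
            a(i) = a(i + off)
         end do
      end if
   end subroutine shift_copy
end program overlap_demo
&lt;/PRE&gt;

&lt;P&gt;Both printed rows match the "Array assignment" rows in the output above, which is the reference behaviour.&lt;/P&gt;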

</description>
      <pubDate>Fri, 30 Jan 2015 02:03:44 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/Stack-overflow-on-array-copy/m-p/1060833#M117500</guid>
      <dc:creator>JVanB</dc:creator>
      <dc:date>2015-01-30T02:03:44Z</dc:date>
    </item>
    <item>
      <title>Quote:Repeat Offender wrote:</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/Stack-overflow-on-array-copy/m-p/1060834#M117501</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;P&gt;Repeat Offender wrote:&lt;/P&gt;

&lt;P&gt;...&lt;/P&gt;

&lt;P&gt;So ASSOCIATE has the same result as the subroutine and DO CONCURRENT which do have aliasing rules. I get similar results for gfortran. Where does it talk about aliasing rules for the ASSOCIATE construct in the standard?&lt;/P&gt;&lt;/BLOCKQUOTE&gt;

&lt;P&gt;I assume you've referred to J3/04-007, the latest working draft of the Fortran 2003 standard, dated May 10, 2004: sections 8.1.4 and 16.4.1.5. I have a tough time reading these standards documents, so it's The Fortran 2003 Handbook by Adams et al. to the rescue: section 8.2.2 of this book says about the association during the execution of the ASSOCIATE construct, "This process is somewhat similar to what happens in a procedure call with the associate name taking the role of the dummy argument." In the context of this thread, I personally would prefer ASSOCIATE over DO CONCURRENT.&lt;/P&gt;</description>
      <pubDate>Fri, 30 Jan 2015 05:13:08 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/Stack-overflow-on-array-copy/m-p/1060834#M117501</guid>
      <dc:creator>FortranFan</dc:creator>
      <dc:date>2015-01-30T05:13:08Z</dc:date>
    </item>
    <item>
      <title>Should you compile with</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/Stack-overflow-on-array-copy/m-p/1060835#M117502</link>
      <description>&lt;P&gt;Should you compile with /Qparallel, or is that implicit in /Qmkl? If you're not enabling auto-parallelization, there is no benefit to DO CONCURRENT, I believe.&lt;/P&gt;</description>
      <pubDate>Sat, 31 Jan 2015 12:56:00 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/Stack-overflow-on-array-copy/m-p/1060835#M117502</guid>
      <dc:creator>andrew_4619</dc:creator>
      <dc:date>2015-01-31T12:56:00Z</dc:date>
    </item>
    <item>
      <title>Here's another test using</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/Stack-overflow-on-array-copy/m-p/1060836#M117503</link>
      <description>&lt;P&gt;Here's another test using floating point variables you can consider:&lt;/P&gt;

&lt;PRE class="brush:fortran;"&gt;   PROGRAM p

      USE, INTRINSIC :: ISO_FORTRAN_ENV, ONLY : I4 =&amp;gt; INT32, DP =&amp;gt; REAL64

      !..
      IMPLICIT NONE

      !..
      INTEGER(I4), PARAMETER :: MAXARR = 2**28
      INTEGER(I4), PARAMETER :: MAXREPEAT = 5
      INTEGER(I4) :: i0
      INTEGER(I4) :: j0
      INTEGER(I4) :: i1
      INTEGER(I4) :: j1
      INTEGER(I4) :: Istat
      INTEGER(I4) :: I
      INTEGER(I4) :: Counter
      INTEGER(I4) :: Overlap
      REAL(DP), PARAMETER :: EPSILON_DP = EPSILON(1.0_dp)
      REAL(DP), ALLOCATABLE :: array(:)
      REAL(DP) :: Start_Time = 0.0_dp
      REAL(DP) :: End_Time = 0.0_dp
      REAL(DP) :: CpuTimes_Assign(MAXREPEAT)
      REAL(DP) :: CpuTimes_DoConcurrent(MAXREPEAT)
      REAL(DP) :: CpuTimes_Associate(MAXREPEAT)
      CHARACTER(LEN=*), PARAMETER :: FMT_CPU = "(A, T40, F8.3, A)"
      CHARACTER(LEN=2048) :: ErrorAlloc

      !..
      ALLOCATE(array(MAXARR), SOURCE=0.0_dp, STAT=Istat, ERRMSG=ErrorAlloc)
      IF (Istat /= 0) THEN
         PRINT *, " Allocation of array failed: ", ErrorAlloc(1:LEN_TRIM(ErrorAlloc))
         STOP
      END IF

      !..
      Overlap = 0
      PRINT *, " ** With Overlap of ", Overlap
      i0 = 1
      j0 = MAXARR/2
      i1 = j0 + 1 - Overlap
      j1 = i1 + MAXARR/2 - 1

      CpuTimes_Assign = 0.0_dp
      CpuTimes_DoConcurrent = 0.0_dp
      CpuTimes_Associate = 0.0_dp

      PRINT *, "Array assignment:"
      Loop_Repeat_Assign: DO Counter = 1, MAXREPEAT

         PRINT *, "   Trial ", Counter

         !.. Initialize the array using random numbers
         CALL RANDOM_NUMBER(array)

         !..
         CALL CPU_TIME(Start_Time)

         !..
         array(i0:j0) = array(i1:j1)

         CALL CPU_TIME(End_Time)

         !..
         CpuTimes_Assign(Counter) = (End_Time - Start_Time)

         !..
         IF (ABS(array(j0)-array(j1)) &amp;gt; EPSILON_DP) THEN
            PRINT *, " Copy failed."
            CYCLE Loop_Repeat_Assign
         END IF

         IF (Counter == 1) THEN
            PRINT *, " array(i0) = ", array(i0)
            PRINT *, " array(j0) = ", array(j0)
            PRINT *, " array(i1) = ", array(i1)
            PRINT *, " array(j1) = ", array(j1)
         END IF
         
         WRITE(*, FMT=FMT_CPU) "   CPU Time: ", CpuTimes_Assign(Counter), " seconds."

      END DO Loop_Repeat_Assign

      PRINT *, "DO CONCURRENT:"
      Loop_Repeat_DO: DO Counter = 1, MAXREPEAT

         PRINT *, "   Trial ", Counter

         !.. Initialize the array using random numbers
         CALL RANDOM_NUMBER(array)

         !..
         CALL CPU_TIME(Start_Time)

         !..
         DO CONCURRENT ( I = i0:j0 )
            array(I) = array(i1 - i0 + I)
         END DO

         CALL CPU_TIME(End_Time)

         !..
         CpuTimes_DoConcurrent(Counter) = (End_Time - Start_Time)

         !..
         IF (ABS(array(j0)-array(j1)) &amp;gt; EPSILON_DP) THEN
            PRINT *, " Copy failed."
            CYCLE Loop_Repeat_DO
         END IF

         IF (Counter == 1) THEN
            PRINT *, " array(i0) = ", array(i0)
            PRINT *, " array(j0) = ", array(j0)
            PRINT *, " array(i1) = ", array(i1)
            PRINT *, " array(j1) = ", array(j1)
         END IF
         
         WRITE(*, FMT=FMT_CPU) "   CPU Time: ", CpuTimes_DoConcurrent(Counter), " seconds."

      END DO Loop_Repeat_DO

      PRINT *, "ASSOCIATE:"
      Loop_Repeat_Assoc: DO Counter = 1, MAXREPEAT

         PRINT *, "   Trial ", Counter

         !.. Initialize the array using random numbers
         CALL RANDOM_NUMBER(array)

         !..
         CALL CPU_TIME(Start_Time)

         Assoc_Arr: ASSOCIATE ( tmp1 =&amp;gt; array(i0:j0), tmp2 =&amp;gt; array(i1:j1) )
            tmp1 = tmp2
         END ASSOCIATE Assoc_Arr

         CALL CPU_TIME(End_Time)

         !..
         CpuTimes_Associate(Counter) = (End_Time - Start_Time)

         !..
         IF (ABS(array(j0)-array(j1)) &amp;gt; EPSILON_DP) THEN
            PRINT *, " Copy failed."
            CYCLE Loop_Repeat_Assoc
         END IF

         IF (Counter == 1) THEN
            PRINT *, " array(i0) = ", array(i0)
            PRINT *, " array(j0) = ", array(j0)
            PRINT *, " array(i1) = ", array(i1)
            PRINT *, " array(j1) = ", array(j1)
         END IF
         
         WRITE(*, FMT=FMT_CPU) "   CPU Time: ", CpuTimes_Associate(Counter), " seconds."

      END DO Loop_Repeat_Assoc

      !..
      WRITE(*, FMT=FMT_CPU) "Array Assignment: Average CPU Time ",                                  &amp;amp;
                            SUM(CpuTimes_Assign)/REAL(MAXREPEAT, KIND=DP),  " seconds."

      !..
      WRITE(*, FMT=FMT_CPU) "DO CONCURRENT:    Average CPU Time ",                                  &amp;amp;
                           SUM(CpuTimes_DoConcurrent)/REAL(MAXREPEAT, KIND=DP),                     &amp;amp;
                           " seconds."

      !..
      WRITE(*, FMT=FMT_CPU) "ASSOCIATE:        Average CPU Time ",                                  &amp;amp;
                           SUM(CpuTimes_Associate)/REAL(MAXREPEAT, KIND=DP),                        &amp;amp;
                           " seconds."

      !..
      DEALLOCATE(array, STAT=Istat, ERRMSG=ErrorAlloc)
      IF (Istat /= 0) THEN
         PRINT *, " Deallocation of array failed. ", ErrorAlloc(1:LEN_TRIM(ErrorAlloc))
         STOP
      END IF

      !..
      STOP

   END PROGRAM p
&lt;/PRE&gt;

&lt;P&gt;The results I observe:&lt;/P&gt;

&lt;PRE class="brush:plain;"&gt;  ** With Overlap of  0
 Array assignment:
    Trial  1
  array(i0) =  2.649255667434409E-003
  array(j0) =  0.829067121479056
  array(i1) =  2.649255667434409E-003
  array(j1) =  0.829067121479056
   CPU Time:                              0.499 seconds.
    Trial  2
   CPU Time:                              0.499 seconds.
    Trial  3
   CPU Time:                              0.499 seconds.
    Trial  4
   CPU Time:                              0.499 seconds.
    Trial  5
   CPU Time:                              0.484 seconds.
 DO CONCURRENT:
    Trial  1
  array(i0) =  0.657694300591952
  array(j0) =  0.556066471741371
  array(i1) =  0.657694300591952
  array(j1) =  0.556066471741371
   CPU Time:                              0.640 seconds.
    Trial  2
   CPU Time:                              0.624 seconds.
    Trial  3
   CPU Time:                              0.640 seconds.
    Trial  4
   CPU Time:                              0.655 seconds.
    Trial  5
   CPU Time:                              0.593 seconds.
 ASSOCIATE:
    Trial  1
  array(i0) =  0.865167717234815
  array(j0) =  0.555618009170298
  array(i1) =  0.865167717234815
  array(j1) =  0.555618009170298
   CPU Time:                              0.203 seconds.
    Trial  2
   CPU Time:                              0.203 seconds.
    Trial  3
   CPU Time:                              0.218 seconds.
    Trial  4
   CPU Time:                              0.203 seconds.
    Trial  5
   CPU Time:                              0.203 seconds.
Array Assignment: Average CPU Time        0.496 seconds.
DO CONCURRENT:    Average CPU Time        0.630 seconds.
ASSOCIATE:        Average CPU Time        0.206 seconds.

&lt;/PRE&gt;

&lt;P&gt;Compiled with:&lt;/P&gt;

&lt;PRE class="brush:plain;"&gt;ifort /c /nologo /O3 /Qparallel /heap-arrays0 /standard-semantics /stand:f08
/traceback /libs:static /threads&lt;/PRE&gt;
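&lt;P&gt;One caveat on the timings above, offered as a sketch rather than a correction: CPU_TIME reports processor time, and how that is accumulated across threads is processor-dependent, so in a build with /Qparallel it may not match elapsed time. Measuring with SYSTEM_CLOCK alongside CPU_TIME shows whether the two diverge. The program below is a hypothetical standalone example with a smaller array, not the benchmark itself:&lt;/P&gt;

&lt;PRE class="brush:fortran;"&gt;program wall_clock_demo
   use, intrinsic :: iso_fortran_env, only : int64, dp =&amp;gt; real64
   implicit none
   integer, parameter :: n = 2**20
   real(dp), allocatable :: array(:)
   integer(int64) :: t0, t1, rate
   real(dp) :: cpu0, cpu1
   integer :: i

   allocate(array(n))
   call random_number(array)

   ! Take both clocks around the same copy of the upper half onto the lower half.
   call system_clock(t0, rate)
   call cpu_time(cpu0)
   do i = 1, n/2
      array(i) = array(i + n/2)
   end do
   call cpu_time(cpu1)
   call system_clock(t1)

   print '(a, f10.6, a)', 'Wall time: ', real(t1 - t0, dp) / real(rate, dp), ' s'
   print '(a, f10.6, a)', 'CPU time:  ', cpu1 - cpu0, ' s'
end program wall_clock_demo
&lt;/PRE&gt;

&lt;P&gt;If the two numbers agree in a serial build but differ in a parallel one, the CPU times are likely being summed over threads.&lt;/P&gt;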

&lt;P&gt;My observations generally have been that in order for DO CONCURRENT to be effective, the computational intensity or the array sizes need to be above a certain threshold; otherwise, the overhead of "setting up" the parallel operations can overwhelm the benefits.&lt;/P&gt;</description>
      <pubDate>Sun, 01 Feb 2015 01:34:00 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/Stack-overflow-on-array-copy/m-p/1060836#M117503</guid>
      <dc:creator>FortranFan</dc:creator>
      <dc:date>2015-02-01T01:34:00Z</dc:date>
    </item>
    <item>
      <title>To post #13 program I added</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/Stack-overflow-on-array-copy/m-p/1060837#M117504</link>
      <description>&lt;P&gt;To the program from post #13 I added:&lt;/P&gt;

&lt;PRE class="brush:fortran;"&gt;...
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 
      SUBROUTINE COPY8(b0,e0,b1,e1)
      INTEGER b0,e0,b1,e1
      INTEGER i,b
      b = b1-b0
!DIR$ IVDEP
!DIR$ VECTOR NONTEMPORAL
      DO I = b0,e0
        Arr(i) = Arr(i+b)
      END DO
      END SUBROUTINE
...
        DO K=1,rep
          CALL Copy8(1,e0,b1,i)
          CALL Copy8(b1,i,1,e0)
        END DO
        Call STOPWATCH('LOOP WITH !DIR$ IVDEP and !DIR$ VECTOR NONTEMPORAL')
&lt;/PRE&gt;

&lt;P&gt;Results:&lt;/P&gt;

&lt;PRE class="brush:plain;"&gt;&amp;nbsp;Size:&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 8388608
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.00 INIT
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.00 Array Operator
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.12 DO CONCURRENT
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.62 Classical Loop
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.41 Array Operator: Different Arrays
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.16 BLAS
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.12 LOOP WITH !DIR$ IVDEP
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.13 associate
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.08 LOOP WITH !DIR$ IVDEP and !DIR$ VECTOR NONTEMPORAL
&amp;nbsp;Size:&amp;nbsp;&amp;nbsp;&amp;nbsp; 16777216
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.00 INIT
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.00 Array Operator
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.24 DO CONCURRENT
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 1.21 Classical Loop
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.78 Array Operator: Different Arrays
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.23 BLAS
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.24 LOOP WITH !DIR$ IVDEP
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.24 associate
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.16 LOOP WITH !DIR$ IVDEP and !DIR$ VECTOR NONTEMPORAL
&amp;nbsp;Size:&amp;nbsp;&amp;nbsp;&amp;nbsp; 33554432
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.01 INIT
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.00 Array Operator
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.49 DO CONCURRENT
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 2.43 Classical Loop
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 1.55 Array Operator: Different Arrays
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.46 BLAS
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.47 LOOP WITH !DIR$ IVDEP
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.49 associate
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.31 LOOP WITH !DIR$ IVDEP and !DIR$ VECTOR NONTEMPORAL
&amp;nbsp;Size:&amp;nbsp;&amp;nbsp;&amp;nbsp; 67108864
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.01 INIT
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.00 Array Operator
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.96 DO CONCURRENT
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 4.85 Classical Loop
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 3.07 Array Operator: Different Arrays
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.93 BLAS
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.92 LOOP WITH !DIR$ IVDEP
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.97 associate
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.63 LOOP WITH !DIR$ IVDEP and !DIR$ VECTOR NONTEMPORAL
&lt;/PRE&gt;

&lt;P&gt;Clearly a winner&lt;/P&gt;

&lt;P&gt;Jim Dempsey&lt;/P&gt;</description>
      <pubDate>Sun, 01 Feb 2015 15:54:55 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/Stack-overflow-on-array-copy/m-p/1060837#M117504</guid>
      <dc:creator>jimdempseyatthecove</dc:creator>
      <dc:date>2015-02-01T15:54:55Z</dc:date>
    </item>
    <item>
      <title>The above timings was without</title>
      <link>https://community.intel.com/t5/Intel-Fortran-Compiler/Stack-overflow-on-array-copy/m-p/1060838#M117505</link>
      <description>&lt;P&gt;The above timings were without /Qparallel and with the sequential MKL.&lt;/P&gt;

&lt;P&gt;The following is with /Qparallel and the parallel MKL&lt;/P&gt;

&lt;PRE class="brush:plain;"&gt;&amp;nbsp;Size:&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 8388608
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.00 INIT
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.00 Array Operator
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.12 DO CONCURRENT
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.64 Classical Loop
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.40 Array Operator: Different Arrays
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.16 BLAS
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.12 LOOP WITH !DIR$ IVDEP
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.12 associate
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.08 LOOP WITH !DIR$ IVDEP and !DIR$ VECTOR NONTEMPORAL
&amp;nbsp;Size:&amp;nbsp;&amp;nbsp;&amp;nbsp; 16777216
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.00 INIT
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.00 Array Operator
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.24 DO CONCURRENT
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 1.31 Classical Loop
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.77 Array Operator: Different Arrays
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.24 BLAS
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.22 LOOP WITH !DIR$ IVDEP
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.24 associate
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.16 LOOP WITH !DIR$ IVDEP and !DIR$ VECTOR NONTEMPORAL
&amp;nbsp;Size:&amp;nbsp;&amp;nbsp;&amp;nbsp; 33554432
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.01 INIT
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.00 Array Operator
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.48 DO CONCURRENT
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 2.45 Classical Loop
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 1.54 Array Operator: Different Arrays
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.46 BLAS
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.47 LOOP WITH !DIR$ IVDEP
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.48 associate
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.31 LOOP WITH !DIR$ IVDEP and !DIR$ VECTOR NONTEMPORAL
&amp;nbsp;Size:&amp;nbsp;&amp;nbsp;&amp;nbsp; 67108864
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.01 INIT
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.00 Array Operator
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.97 DO CONCURRENT
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 4.93 Classical Loop
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 3.07 Array Operator: Different Arrays
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.91 BLAS
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.92 LOOP WITH !DIR$ IVDEP
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.97 associate
&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.63 LOOP WITH !DIR$ IVDEP and !DIR$ VECTOR NONTEMPORAL
&lt;/PRE&gt;

&lt;P&gt;BLAS marginally improved, DO CONCURRENT is inconclusive, IVDEP with NONTEMPORAL is the winner at 1.44x faster than BLAS.&lt;/P&gt;

&lt;P&gt;Note that the above result should not be taken as a generalization; it applies only to the specific conditions of the test program.&lt;/P&gt;
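&lt;P&gt;For anyone who wants to try the directive pair outside the benchmark, here is a minimal standalone sketch. The program and the helper copy_no_overlap are hypothetical names, not the benchmark code. The !DIR$ lines are Intel-specific; other compilers treat them as ordinary comments. IVDEP asks the compiler to ignore assumed loop-carried dependences, so it is the caller's promise that the two sections do not overlap:&lt;/P&gt;

&lt;PRE class="brush:fortran;"&gt;program ivdep_demo
   use, intrinsic :: iso_fortran_env, only : dp =&amp;gt; real64
   implicit none
   integer, parameter :: n = 1024
   real(dp) :: arr(2*n)
   integer :: i

   arr = [(real(i, dp), i = 1, 2*n)]

   ! Copy the upper half onto the lower half; the halves are disjoint.
   call copy_no_overlap(arr, 1, n, n + 1)

   if (any(arr(1:n) /= arr(n+1:2*n))) error stop 'copy failed'
   print *, 'ok'

contains

   ! Hypothetical helper: copies a(b1:b1+e0-b0) onto a(b0:e0).
   ! Only valid when the caller knows the two sections are disjoint.
   subroutine copy_no_overlap(a, b0, e0, b1)
      real(dp), intent(inout) :: a(:)
      integer, intent(in) :: b0, e0, b1
      integer :: i, off
      off = b1 - b0
!DIR$ IVDEP
!DIR$ VECTOR NONTEMPORAL
      do i = b0, e0
         a(i) = a(i + off)
      end do
   end subroutine copy_no_overlap
end program ivdep_demo
&lt;/PRE&gt;

&lt;P&gt;Because this is an explicit loop rather than an array assignment, no array temporary is involved, which is what the original question was after.&lt;/P&gt;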

&lt;P&gt;Jim Dempsey&lt;/P&gt;</description>
      <pubDate>Sun, 01 Feb 2015 16:05:56 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Fortran-Compiler/Stack-overflow-on-array-copy/m-p/1060838#M117505</guid>
      <dc:creator>jimdempseyatthecove</dc:creator>
      <dc:date>2015-02-01T16:05:56Z</dc:date>
    </item>
  </channel>
</rss>

