Intel® Fortran Compiler
Build applications that can scale for the future with optimized code designed for Intel® Xeon® and compatible processors.

StackOverflow wtih function pack() for big arrays

Denis_Bannikov
Beginner
747 Views

Hi @all,

Could you please comment on the following issue?

I need to convert 2D (or 3D) array to 1D array.

If I use simple do-loop construction everything works fine.

If I use function pack() then for big array size I get StackOverflow. One workaround is to increase stack size in VisualStudio. But this is not the best solution for general case.

 

I like using standard functions due to their brevity, but I got some examples when they could not work due to stack size, and I have to switch back to do-loops.

 

My Example (see attached files):

Main programs call subroutine Array_2D_to_1D for different array sizes (2**k, 2**k)

If I set to use function pack() by setting IsPackUse = .true.

I got

Start calc for k=           8

Start calc for k=           9

forrtl: severe (170): Program Exception - stack overflow

If I set to use do-loops (IsPackUse = .false.)  it works Fine

I got

 Start calc for k=           8

 Start calc for k=           9

 Start calc for k=          10

 Start calc for k=          11

 Start calc for k=          12

 Start calc for k=          13

 Start calc for k=          14

 Program end

Press any key to continue . . .

 

0 Kudos
14 Replies
andrew_4619
Honored Contributor II
747 Views

I would guess that pack creates a temporary mask array filled with .true. at uis equal in size to the array being packed.

You could not use stack with the "heap arrays" compopiler option or maybe the reshape function would work better than pack in your case.

 

0 Kudos
jimdempseyatthecove
Honored Contributor III
747 Views

consider using TRANSFER

loc1D(:) = TRANSFER(b2D,loc1D) ! untested, or
loc1D(:) = TRANSFER(b2D,0.0_8,size(b2D)) ! untested

Jim Dempsey

0 Kudos
jimdempseyatthecove
Honored Contributor III
747 Views

Note, transfer may do the unnecessary stack temporary thing.

Jim Dempsey

0 Kudos
FortranFan
Honored Contributor II
747 Views

@Denis Bannikov,

Does the example code you show in the original post closely reflect your actual case?  Meaning the source data array of rank 2 is allocated by an ALLOCATE statement and where you need to convert it to a rank one array is in 'read' mode in a procedure with an explicit interface, as in a module procedure?  If so, have you considered making use of a local pointer with contiguous attribute pointing to the dummy argument with contiguous and target attribute?  Then no copying will be involved?

0 Kudos
Denis_Bannikov
Beginner
747 Views

andrew_4619 wrote:

I would guess that pack creates a temporary mask array filled with .true. at uis equal in size to the array being packed.

You could not use stack with the "heap arrays" compopiler option or maybe the reshape function would work better than pack in your case.

I think so about creating the temporary mask  array on stack. It a pity.

I tried to use RESHAPE() (this was my first try). Results is the same as with pack()

loc1D(:) = reshape (b2D, (/nx*ny/))

 Start calc for k=           8 (256)
 Start calc for k=           9 (512)
forrtl: severe (170): Program Exception - stack overflow

0 Kudos
Denis_Bannikov
Beginner
747 Views

jimdempseyatthecove wrote:

consider using TRANSFER

loc1D(:) = TRANSFER(b2D,loc1D) ! untested, or
loc1D(:) = TRANSFER(b2D,0.0_8,size(b2D)) ! untested

Jim Dempsey

Jim, I've tried both variants. It dies even earlier than pack() or reshape(),

 Start calc for k=           8 (256)
forrtl: severe (170): Program Exception - stack overflow

0 Kudos
Denis_Bannikov
Beginner
747 Views

FortranFan wrote:

@Denis Bannikov,

Does the example code you show in the original post closely reflect your actual case?  Meaning the source data array of rank 2 is allocated by an ALLOCATE statement and where you need to convert it to a rank one array is in 'read' mode in a procedure with an explicit interface, as in a module procedure?  If so, have you considered making use of a local pointer with contiguous attribute pointing to the dummy argument with contiguous and target attribute?  Then no copying will be involved?

FortranFan, The current code is close to my post, but much more complicated ( ~10 different 1D, 2D, 3D arrays; sections; arrays are defined in structures in another modules, pointers, etc).

"the source data array of rank 2 is allocated by an ALLOCATE statement and where you need to convert it to a rank one array is in 'read' mode in a procedure with an explicit interface, as in a module procedure" - Yes.

The old code was without explicit interface.

Passing of 2D array into 1D in old code was simple: one defined 1D array in a subroutine and conversion from b2D to loc1D occurred automatically (it worked for any array size). But without explicit interfaces passing array c2D (with sections) created a temporary copy of this array (and, of course, I was getting StackOverflow). To eliminate temporary arrays I made a module and both issues had gone (StackOverflow and Temporary arrays creation). Then I faced with the issue that I was not able to pass 2D to 1D in the same manner as in the old code without explicit interface.

error #6634: The shape matching rules of actual arguments and dummy arguments have been violated. [B2D]

//The only one goal of getting this array is to copy it to another 1D array (which is a pointer)

I avoid using pointers so I have not considered options with local pointers. But as this code already contains some pointers, Can you please elaborate on it?

old program
// b2D(1:nx,1:ny), c2D(1:nx,1:ny)
     call SubA( nx-2, ny-2, b2D( 1:nx, 1:ny ), c2D ( 2:(nx-1), 2:(nx-1) ), data)
end old program

another file:
subroutine SubA (nx, ny, loc1D, loc_c2D, data)
    real*8, intent(in) :: loc1D ( nx*ny )
    real*8, intent(in) :: loc_c2D ( nx, ny )  ! temporary copy is created !
    data -- a structure of pointers, intent(out)
    ...
    allocate( data(L) % p(nx*nx) )
    data(L)%p(1:nx*ny) = loc1D(1:nx*ny) ! loc1D is only required to pass data to 1D array p() in structure data
    ...
ens subroutine SubA 

0 Kudos
jimdempseyatthecove
Honored Contributor III
747 Views
    program TwoDto1D
        implicit none
        real, allocatable :: array2D(:,:)
        integer :: i,j

        allocate(array2D(5,6))
        do j=1,6
            do i=1,5
                array2D(i,j) = i*100+j
            enddo
        enddo
        print *,"array2D"
        print *, array2D
        call as1D(array2D,size(array2D))
    end program TwoDto1D

    subroutine as1D(array1D, n)
        implicit none
        real :: array1D(n)
        integer :: n
        print *,"array1D"
        print *,array1D
    end subroutine as1D
 array2D
   101.0000       201.0000       301.0000       401.0000       501.0000
   102.0000       202.0000       302.0000       402.0000       502.0000
   103.0000       203.0000       303.0000       403.0000       503.0000
   104.0000       204.0000       304.0000       404.0000       504.0000
   105.0000       205.0000       305.0000       405.0000       505.0000
   106.0000       206.0000       306.0000       406.0000       506.0000
 array1D
   101.0000       201.0000       301.0000       401.0000       501.0000
   102.0000       202.0000       302.0000       402.0000       502.0000
   103.0000       203.0000       303.0000       403.0000       503.0000
   104.0000       204.0000       304.0000       404.0000       504.0000
   105.0000       205.0000       305.0000       405.0000       505.0000
   106.0000       206.0000       306.0000       406.0000       506.0000
Press any key to continue . . .

Jim Dempsey

0 Kudos
FortranFan
Honored Contributor II
747 Views

Denis Bannikov wrote:

.. FortranFan, ..

I avoid using pointers so I have not considered options with local pointers. But as this code already contains some pointers, Can you please elaborate on it? ..

@Denis Bannikov,

  • See Jim's response in Message #9 and see if argument association rules in standard Fortran toward an automatic-explicit-shape array as shown by JIm can meet your needs.
  • Re: the snippet from 'old code' you shown in Message #8, can you explain what is the 4th dummy argument of loc_c2D for?  That's where you mark the temporary copy being created but it's unclear how this argument is used in the code.  Is it handled the same way as loc_c1D, meaning contents are copied to a pointer in data?
  • Separately on the 5th argument of data, the structure of pointers, does the pointer really need to be allocated and hold copies of data from b2D (and perhaps c2D) actual argument?  Or can they simply be pointers and point to b2D (and C2D) target?
0 Kudos
Denis_Bannikov
Beginner
747 Views

jimdempseyatthecove wrote:

    program TwoDto1D
        call as1D(array2D,size(array2D))
    end program TwoDto1D

    subroutine as1D(array1D, n)
        real :: array1D(n)
        integer :: n

Jim, thank you!

It works. It is the same as in my post above, and how it was in the original code. I do not know why I did not check this way of passing the array from the beginning. The issue is solved.

 

But I'm still confuse that functions like pack(), transfer(), where-else ... depend on the stack size, so one can not rely on them while building robust code.

0 Kudos
Denis_Bannikov
Beginner
747 Views

FortranFan wrote:

  • See Jim's response in Message #9 and see if argument association rules in standard Fortran toward an automatic-explicit-shape array as shown by JIm can meet your needs.
  • Re: the snippet from 'old code' you shown in Message #8, can you explain what is the 4th dummy argument of loc_c2D for?  That's where you mark the temporary copy being created but it's unclear how this argument is used in the code.  Is it handled the same way as loc_c1D, meaning contents are copied to a pointer in data?
  • Separately on the 5th argument of data, the structure of pointers, does the pointer really need to be allocated and hold copies of data from b2D (and perhaps c2D) actual argument?  Or can they simply be pointers and point to b2D (and C2D) target?

  • Thank, you! Jim's advise helped me.
  • This is an initalization subroutine. All input arrays are copied into multiple 1D arrays and then proceed to a black box (it seems to me the solver works with vectors).
  • It would be the best options. But this is a multigrid solver. The subroutine initialize initial conditions for all levels by copying input arrays to each level of data structure; + computing auxiliary information.


 

0 Kudos
Denis_Bannikov
Beginner
747 Views

H-m-m-m
Everything works fine on a single call 2D to 1D.

I checked on the original code and found that combination of "call # 1": Pass a section of 2D array to subA; "call # 2": from subA pass 2D to 1D arrays; creates a temporary array at call#2 and produces a stack overflow for array = 512x512 (when creating this copy)
// do not forget to enable: Yes (/check:arg_temp_created)

program Pack_StackOverflow
    use M_ArraySection
    implicit none
    real*8, allocatable :: b2D ( :, : )
        do k = 8, 14 !works up to 15
            print*, "Start calc for k=", k, 2**k
            call ArraySection ( b2D (2:nx-1, 2:ny-1) )
            print*, "End calc for k=", k
        enddo
end program Pack_StackOverflow
module M_ArraySection
    use M_Array_2D_to_1D
contains
    subroutine ArraySection ( b2D )
        real*8, intent(in) :: b2D ( : , : )
        print*, "-call ArraySection start"
        call Array_2D_to_1D ( size(b2D), b2D )
        print*, "-call ArraySection finish"
    end subroutine ArraySection
end module M_ArraySection
module M_Array_2D_to_1D
contains
    subroutine Array_2D_to_1D ( asize, b1D )
        integer, intent(in) :: asize
        real*8, intent(in) :: b1D (asize)
        real*8, allocatable :: loc1D ( : )
        print*, "--call Array_2D_to_1D start"
        allocate( loc1D ( asize ), stat = ios)
        loc1D(:) = b1D(:)
        deallocate( loc1D )
        print*, "--call Array_2D_to_1D finish"
    end subroutine Array_2D_to_1D
end module M_Array_2D_to_1D

Start calc for k=           8         256
 -call ArraySection start
forrtl: warning (406): fort: (1): In call to ARRAY_2D_TO_1D, an array temporary was created for argument #2
 --call Array_2D_to_1D start
 --call Array_2D_to_1D finish
 -call ArraySection finish
 Start calc for k=           9         512
 -call ArraySection start
forrtl: severe (170): Program Exception - stack overflow

Is there any way to avoid temporary array creation in this combination?

AND. If I substitute the last function by an explicit interface and use simple do-cycle (see below) - there is neither temporary arryas, nor stack overflow issues.

module M_Array_2D_to_1D
contains
    subroutine Array_2D_to_1D ( asize, b2D )
        integer, intent(in) :: asize
        real*8, intent(in) :: b2D (:,:)
        real*8, allocatable :: loc1D ( : )
        print*, "--call Array_2D_to_1D start"
        allocate( loc1D ( asize ), stat = ios)
            do j = 1, ny
                do i = 1, nx
                    ind = ( j - 1 ) * nx + i
                    loc1D ( ind ) = b2D ( i, j )
                enddo
            enddo
        !loc1D(:) = pack(b2D, .true.) ! stack overflow  
        deallocate( loc1D )
        print*, "--call Array_2D_to_1D finish"
    end subroutine Array_2D_to_1D
end module M_Array_2D_to_1D

...
 Start calc for k=          14       16384
 -call ArraySection start
 --call Array_2D_to_1D start
 --call Array_2D_to_1D finish
 -call ArraySection finish
 End calc for k=          14
 Program end

0 Kudos
andrew_4619
Honored Contributor II
747 Views

2:nx-1, 2:ny-1 does not represent a continuous chunk of memory so it is not surprising that a temp is created. What is the problem with the do loop versions? It does the job and i can't imagine is slows things in any significant way. Lets face it the compiler must do something very similar when it extract the temp copy but with some extra memory operations added in..... On another note the optimiser could chose to ignore a big chunk of M_Array_2D_to_1D (the loop version) as the loops have no output, you allocate populate and dealocate the array without "using" it, if I was benchmarking such a loop I would print a or save a random element from the array to be 100% sure.

 

0 Kudos
jimdempseyatthecove
Honored Contributor III
747 Views

>> Pass a section of 2D array to subA

This wasn't stated in your original problem statement.

When using this technique, the sections you pass must be contiguous:

call YourSubroutine(Array2D(:, iRowBegin:iRowEnd), size(Array2D(:, iRowBegin:iRowEnd)))

or

call YourSubroutine(Array3D(:, :, iZBegin:iZEnd), size(Array3D(:, :, iZBegin:iZEnd)))

Any other way will require a copy of the data (as it will not be contiguous).

** structure your data and calls such that you can compute contiguous sections of the array **

Jim Dempsey

0 Kudos
Reply