Solved: Thank you very very much Jim!

Heo__Jun-Yeong · ‎03-08-2017

Hello, I'm a student studying geophysical exploration simulation (for example, modeling the elastic wave equation with FDM, FEM, etc.).

My understanding is that in a simple situation like the one below,

real(8) :: a(100)
call test(a)
     :
subroutine test(b)
real(8) :: b(:)

in Fortran, new memory the size of 'b' is allocated and the values of 'a' are copied to 'b'. After the subroutine finishes, the values of 'b' are copied back to 'a' and the memory for 'b' is deallocated.
But what if array 'a' is very large? Allocating new memory the size of 'a' could consume a lot of memory.

I recently discovered pointers in Fortran, and I am wondering about their memory usage. For example,

integer, pointer :: p1
real(8), pointer :: p2

Are the memory requirements of 'p1' and 'p2' different, or the same?
Is the memory requirement of a pointer smaller than that of other types of variables (like real, complex, etc.)?
Would pointers help in subroutines that are passed large arrays?

Thank you

jimdempseyatthecove · ‎03-10-2017

>>But, you mean in that case, fortran does not allocate new memory for 'b', just passing the reference of the 'a' array descriptor.

Whether memory is allocated, whether an array descriptor (new or reused) is used, or whether a reference to the first cell is passed all depends on how you write the program and on the code the compiler is required to generate.

When the caller of the subroutine is compiled .AND. the interface is supplied (e.g. in a module or by direct specification), the compiler knows the requirements of the called code and can tailor (optimize) the code generated to make the call. When the compiler does not know the calling convention, it must assume the worst case (IOW the F77 calling convention). Note, if the subroutine dummy arguments use (:), or variations thereof, then the interface is required to be specified at the time the compiler generates the caller's code.

Consider this:

! Interface .NOT. known
! caller
real :: array(100)
...
call UsingF77convention(50, array(1:100:2)) ! pass every other element to subroutine

In the above case, the compiler knows nothing about the called subroutine. Therefore a temporary contiguous copy of the selected cells of the array is created, and its first cell is passed by reference. Upon return, the potentially modified data is distributed back into the original array at every other cell location.

call UsingF77convention(50, array(1:50)) ! 1st 50 elements to subroutine
call UsingF77convention(50, array(51:100)) ! last 50 elements to subroutine

In both cases (where the array is known to be contiguous, as it is declared above), the compiler passes a reference to the 1st or 51st cell, as the case may be. IOW, no temporary copy.
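A minimal sketch of the calls described above, for experimenting. The subroutine name scale_f77 and the doubling it performs are made up for illustration; the EXTERNAL statement is there so the caller sees only an implicit (F77-style) interface. Whether a temporary is actually created is up to the compiler, so the program only checks the results, which are the same either way. As far as I know, gfortran's -Warray-temporaries and Intel's -check arg_temp_created (/check:arg_temp_created on Windows) will report when a temporary is made.

```fortran
program f77_call_demo
    implicit none
    real :: array(100)
    integer :: i
    external :: scale_f77   ! implicit interface: the caller knows nothing about the dummies

    array = [(real(i), i = 1, 100)]

    ! Strided section: the compiler may build a contiguous temporary,
    ! pass its first cell, and copy the results back on return.
    call scale_f77(50, array(1:100:2))

    ! Contiguous section: the address of array(51) can be passed directly.
    call scale_f77(50, array(51:100))

    print *, array(1), array(2), array(3)
    print *, array(51), array(100)

    ! Odd cells were doubled by the first call, even cells in 1..50 untouched.
    if (array(1) /= 2.0 .or. array(2) /= 2.0 .or. array(3) /= 6.0) error stop "first call"
    ! Cell 51 was doubled by both calls, cell 100 only by the second.
    if (array(51) /= 204.0 .or. array(100) /= 200.0) error stop "second call"
end program f77_call_demo

! Hypothetical F77-style routine: explicit-shape dummy, size passed separately.
subroutine scale_f77(n, x)
    implicit none
    integer, intent(in) :: n
    real, intent(inout) :: x(n)
    x = 2.0 * x
end subroutine scale_f77
```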

When the interface is known, your code may not need the size argument (i.e. when the subroutine uses (:) on the dummy). Whether a copy gets made or not is a lot more complicated, with too many cases to state here (and it is subject to the implementation and to the requirements of the standard).

When the interface uses : for the dummy, then generally, when the caller specifies a known contiguous array or a slice thereof, a reference to either the original array descriptor (whole array) or a new temporary array descriptor (describing the contiguous slice) can be passed without making a temporary copy of the data.
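A small sketch of the (:) case, assuming a Fortran 2008 compiler (for the IS_CONTIGUOUS intrinsic). The module name slice_demo and subroutine add_one are made up for illustration. The subroutine takes no size argument and reports what it was handed; if the strided call reports a non-contiguous dummy, that suggests a descriptor with a stride was passed instead of a copy. What gets printed is compiler dependent, so only the final values are checked.

```fortran
module slice_demo
    implicit none
contains
    ! Hypothetical routine with an assumed-shape dummy: the extent (and stride)
    ! arrive with the argument, so no separate size argument is needed.
    subroutine add_one(b)
        real, intent(inout) :: b(:)
        print *, "size =", size(b), " contiguous =", is_contiguous(b)
        b = b + 1.0
    end subroutine add_one
end module slice_demo

program assumed_shape_demo
    use slice_demo, only: add_one   ! USE makes the interface known to the caller
    implicit none
    real :: array(100)

    array = 0.0
    call add_one(array)             ! whole array
    call add_one(array(1:50))       ! contiguous slice
    call add_one(array(1:100:2))    ! strided slice

    ! array(1): all three calls; array(2): first two; array(51): first and third;
    ! array(52): first call only.
    if (array(1) /= 3.0 .or. array(2) /= 2.0) error stop "lower half"
    if (array(51) /= 2.0 .or. array(52) /= 1.0) error stop "upper half"
end program assumed_shape_demo
```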

When the interface uses either fixed dimension(s) .OR. integer dimension(s) passed as dummy arguments on the call, the array passed is required to be contiguous (*note). So when the caller specifies a contiguous array or array slice, no copy is made; just the reference to the base cell of the array/array slice is passed. When it is not contiguous, then depending on whether you declared the array dummy as INTENT(IN), INTENT(OUT), INTENT(INOUT), or with no intent at all, the copy operation may be made on call (IN), on return (OUT), both ways (INOUT), or is not specified. Note, leaving the intent unspecified permits passing in undefined data.

*Note Inter-procedural Optimizations may inline the subroutine and thus (may or may not) eliminate the need for temporary copy and/or temporary array descriptor.

If you are concerned about performance (as this thread indicates), then it would be beneficial for you to experiment by generating test code with various permutations. Then, using the debugger with full optimizations, display the disassembly code. To aid in this, place a PRINT * in front of the CALL. With full optimization enabled, the debugger can usually locate the PRINT lines. An alternative is to place a call to a subroutine of your own that the compiler cannot see for optimization (use a separate .obj).

Jim Dempsey


jimdempseyatthecove · ‎03-09-2017

wrong>>in fortran, they allocate new memories in size of 'b' and the values of 'a' are copied to 'b'. After the subroutine is over, the values of 'b' are copied to 'a' and the memories of 'b' are deallocated.

In your code sample #1, your subroutine test is required to be called from a source that contains an interface to test due to b(:). The interface will be implicitly created should subroutine test reside in a module (and the calling source USEs the module).

If the subroutine test is NOT in a module, then either the calling source needs to specify the interface .OR. USE a module that contains just the interface (but not necessarily the code).

When subroutine test is called (assuming the caller has access to the interface), a reference to the array descriptor of a(100) is passed (i.e. a pointer to the array descriptor). In C++ parlance this is like a reference to a container (the container being the array descriptor).

*** If your calling program does NOT have access to the proper interface of subroutine test, then the SOP is to assume F77 calling conventions, in which case the base address of the array (the location of a(1) in this case) is passed.
*** Subroutine test will then crash, produce trash results or no results, or contaminate data, because it assumes (requires) that the address passed is that of an array descriptor.
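A minimal sketch of the interface-block option for code sample #1, assuming subroutine test is an external procedure (not in a module) and, for illustration only, that it adds 1 to every element. With the interface visible, the caller knows that b(:) needs an array descriptor.

```fortran
program interface_demo
    implicit none

    ! Explicit interface for the external subroutine; it must match the
    ! actual definition below.
    interface
        subroutine test(b)
            real(8) :: b(:)
        end subroutine test
    end interface

    real(8) :: a(100)

    a = 0.d0
    call test(a)
    print *, a(1), a(100)

    ! The changes made through 'b' are visible in 'a' after the call.
    if (any(a /= 1.d0)) error stop "a was not updated"
end program interface_demo

! External subroutine with an assumed-shape dummy (the body is illustrative).
subroutine test(b)
    implicit none
    real(8) :: b(:)
    b = b + 1.d0
end subroutine test
```

Placing test in a module and USEing that module from the caller gives the same result without having to repeat the declaration by hand.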

Don't use pointers unless you absolutely must. Use the array descriptors.

Last note, in some cases the data will be copied because it has to be copied. For example, when you pass a row of a 2D array (Fortran stores arrays column by column, so a row is not contiguous) to an F77 calling convention subroutine that requires the data to be contiguous, or when you pass any other non-stride-1 slice of the array.
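A small sketch of the 2D case, with made-up names (grid, fill_f77). Fortran stores arrays in column-major order, so a column section such as grid(:, 2) is contiguous, while a row section such as grid(2, :) has stride 3 here and is the kind of argument that typically has to be copied in and out for an F77-style routine. The program only checks the results; whether a temporary is created is up to the compiler.

```fortran
program row_col_demo
    implicit none
    real :: grid(3, 4)
    external :: fill_f77   ! implicit (F77-style) interface

    grid = 0.0
    call fill_f77(3, grid(:, 2), 1.0)   ! column: contiguous in memory
    call fill_f77(4, grid(2, :), 2.0)   ! row: stride 3, likely copied in and out

    print *, grid(1, 2), grid(2, 2), grid(2, 1), grid(1, 1)

    ! Column 2 holds 1.0 except where row 2 later overwrote it with 2.0.
    if (grid(1, 2) /= 1.0 .or. grid(3, 2) /= 1.0) error stop "column"
    if (any(grid(2, :) /= 2.0)) error stop "row"
    if (grid(1, 1) /= 0.0) error stop "untouched cell"
end program row_col_demo

! Hypothetical F77-style routine that expects n contiguous cells.
subroutine fill_f77(n, x, v)
    implicit none
    integer, intent(in) :: n
    real, intent(inout) :: x(n)
    real, intent(in) :: v
    x = v
end subroutine fill_f77
```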

Jim Dempsey

Heo__Jun-Yeong · ‎03-09-2017

Thanks for the great answer.
And sorry for skipping most part of the code.
I normally write code like the example below (I mainly use Fortran 90):

program Main

    implicit none
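    integer, parameter :: n = 10   ! must be a constant to dimension 'a' in a main program
    real(8) :: a(n)

    a = 0.d0                       ! define every element before 'test' reads it
    a(3) = 1.d0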
    call test(n, a)

end program Main

subroutine test(n, b)
  
    implicit none
    integer, intent(in) :: n
    real(8) :: b(n)

    b(:) = b(:) + 1

end subroutine test

When I first studied Fortran and its subroutines, I understood that Fortran allocates new memory for 'b' and the values of 'a' are copied to 'b'.
And when the subroutine is over, the values of 'b' are copied to 'a' and the memory of 'b' is deallocated.
But, you mean in that case, fortran does not allocate new memory for 'b', just passing the reference of the 'a' array descriptor.
Then does that mean the memory locations of 'a' and 'b' are the same, and that changes to the values of 'b' in the subroutine are immediately reflected in 'a'?

Jun-yeong Heo

jimdempseyatthecove · ‎03-10-2017