question regarding passing arrays to subroutines and optimization

burnesrr · ‎10-06-2011

I have been exploring ways to speed up a Fortran 90 fluid mechanics code I have written and have found that the way that a subset of a three-dimensional array is passed to a subroutine can have a large effect on how long it takes to run the code.

The level of optimization consists of using the "-fast" option during compile.

One of the subroutine calls I make is to compute a derivative using a 6th-order explicit scheme. Originally, the subroutine call looked like this:

SUBROUTINE compute_deriv
.
.
.
DO i=1,nx
CALL deriv(f=fs(i,:,k),df=dfs(i,:,k))
ENDDO
.
.
.
END SUBROUTINE compute_deriv

Using some calls to collect timing information, I find it takes 474 milliseconds to complete the subroutine called "COMPUTE_DERIV".

However, when I change the code to look like:

DO i=1,nx
ftmp = fs(i,:,k)
CALL deriv(f=ftmp,df=dftmp)
dfs(i,:,k) = dftmp
ENDDO

I find that the subroutine "COMPUTE_DERIV" completes in 231 milliseconds.

Note that fs and dfs are arrays of size (nx,ny,nz) and f, df, ftmp and dftmp are arrays of size (ny)

I am hoping someone out there can help me understand why specifying a subset of the array in the subroutine call results in much longer run-times when using the "-fast" level of optimization?

Steven_L_Intel1 · ‎10-06-2011

In the first case, the compiler needs to do copy-in/copy-out of both arguments since you're passing a non-contiguous slice and the compiler does not know if either argument is read before writing. In the second case you're doing one copy in and one copy out, saving a set of two copies.

You might improve things for the first case by providing an explicit interface that has INTENT(IN) for the first argument and INTENT(OUT) for the second. It might even be better to declare the corresponding arguments in "deriv" as DIMENSION(:,:,:) which, along with an explicit interface, avoids the copies altogether, though it might slow down the subroutine - you'd have to test this.

burnesrr · ‎10-06-2011

Hi Steve,

Thankyou for the thoughts regarding this issue. I do in-fact specify intent for all the parameters going into and out of the subroutines. Here are the additional details:

SUBROUTINE d_new(fs,dfs)
IMPLICIT NONE
REAL(dp),DIMENSION(:,:,:),INTENT(IN) :: fs
REAL(dp),DIMENSION(:,:,:),INTENT(INOUT) :: dfs

INTEGER :: i,j,k
REAL(dp),DIMENSION(ny) :: ftmp,dftmp

DO k=1,nz
DO i=1,nx
! ftmp = fs(i,:,k)
! CALL deriv_new(f=ftmp,df=dftmp)
! dfs(i,:,k) = dftmp
call deriv_new(f=fs(i,:,k),df=dfs(i,:,k))
ENDDO
ENDDO

RETURN
END SUBROUTINE d_new
!********************************************************
SUBROUTINE deriv_new(f,df)
IMPLICIT NONE
REAL(dp),DIMENSION(nx),INTENT(IN) :: f
REAL(dp),DIMENSION(nx),INTENT(INOUT) :: df
.
.
.

Does this change your view of what is happening?

Sincerely,
Rick

mecej4 · ‎10-06-2011

In Fortran, declarations have a scope that is limited to the subprogram (or module, but you don't have any in what you posted) in which they occur. Therefore, INTENT declarations in subroutine deriv_new are not visible in the caller, d_new.

Either provide an explicit interface for deriv_new to the caller, or put deriv_new in a module and USE the module in the caller.

burnesrr · ‎10-08-2011

Thankyou for the responses. I have included these routines within a module and the timing information provided in my initial post is what I am seeing when this is the case.

I have a module called params, which contains all the relevant variable definitions. Then I have a separate module called functions, which contains the subroutine calls posted above and there is a "USE params" statement at the top of the module called functions.

I believe this is what you are stating - any more thoughts on what is happening?

Sincerely,
Rick

mecej4 · ‎10-08-2011

If the called function bodies are CONTAINed in a module and the routine with the caller contains a USE for that module, then things will be OK.

It is not clear from your description whether this is indeed so.

burnesrr · ‎10-08-2011

Here is the module params in the file params.f90:

MODULE params
USE kinds
IMPLICIT NONE

REAL(dp) :: pi

INTEGER,PARAMETER :: nx = 128
INTEGER,PARAMETER :: ny = 128
INTEGER,PARAMETER :: nz = 128

REAL(dp) :: lengthscale

INTEGER,PARAMETER :: debugunit = 200
CHARACTER(LEN=80) :: debugfile='debug.txt'

REAL(dp),DIMENSION(:),ALLOCATABLE :: xold,yold,zold
REAL(dp),DIMENSION(:),ALLOCATABLE :: xnew,ynew,znew
INTEGER,DIMENSION(:),ALLOCATABLE :: nstart,nend
REAL(dp),DIMENSION(:,:),ALLOCATABLE :: f_old, df_old
REAL(dp),DIMENSION(:,:,:),ALLOCATABLE :: f_new, df_new

!******************************************************************

TYPE :: centered_scheme
REAL(dp) :: a,b,c,d
END TYPE centered_scheme

END MODULE params

AND here is a portion of the module functions in the file functions.f90:

MODULE functions
USE params
IMPLICIT NONE

! DATE_AND_TIME variables. DATE_AND_TIME is a standard fortran90 routine.
INTEGER,DIMENSION(8) :: last_time = 0 ! last wall clock time
LOGICAL :: first_time_call = .TRUE.

TYPE (centered_scheme), PRIVATE :: &
exp6_1 = centered_scheme( 3._dp/4._dp, -3._dp/20._dp, 1._dp/60._dp, 0._dp ), &
exp6_2 = centered_scheme( -49._dp/18._dp, 3._dp/2._dp, -3._dp/20._dp, 1._dp/90._dp )

CONTAINS
!**********************************************************
SUBROUTINE d_new(fs,dfs)
IMPLICIT NONE
REAL(dp),DIMENSION(:,:,:),INTENT(IN) :: fs
REAL(dp),DIMENSION(:,:,:),INTENT(INOUT) :: dfs

INTEGER :: i,j,k
REAL(dp),DIMENSION(ny) :: ftmp,dftmp

DO k=1,nz
DO i=1,nx
! ftmp = fs(i,:,k)
! CALL deriv_new(f=ftmp,df=dftmp)
! dfs(i,:,k) = dftmp
call deriv_new(f=fs(i,:,k),df=dfs(i,:,k))
ENDDO
ENDDO

RETURN
END SUBROUTINE d_new
!**********************************************************
SUBROUTINE deriv_new(f,df)
IMPLICIT NONE
REAL(dp),DIMENSION(ny),INTENT(IN) :: f
REAL(dp),DIMENSION(ny),INTENT(INOUT) :: df
.
.
.
END SUBROUTINE deriv_new
.
.
.
END MODULE functions

So my question is, how would placing the "USE params" at the top of the module functions be different than placing a separate "USE params" within each subroutine?

Sincerely,
Rick

mecej4 · ‎10-08-2011

> So my question is, how would placing the "USE params" at the top of the 
module functions be different than placing a separate "USE params" 
within each subroutine?

If the variable declarations in Module functions  use some of the constants 
declared in Module params, you definitely need a "USE params" at the module 
level. It is probably a good idea to restrict what is used by adding an 
ONLY : ... clause.

The bad outcome of putting in "USE params" at the top of Module functions is 
that you expose a number of variables to mistaken overwriting in the contained 
routines of that module. Suppose you had a third routine in addition to the two 
that you displayed, and it did not need any of the module variables. If you had 
a local variable in the routine for which you forgot to give a declaration in the 
routine, the module variable would be used and could be written to, possibly
an unwanted side-effect.

burnesrr · ‎10-08-2011

Hi mecej4,

Thankyou for the input. I am still interested in understanding how this organization of the modules and the way I am calling the subroutines results in the time discrepancies. Please refer to the original post - I am very interested in understanding how specifying the subset of an array in a subroutine call results in such a large increase in the time required to complete the call.

Sincerely,
Rick