Intel® Fortran Compiler
Build applications that can scale for the future with optimized code designed for Intel® Xeon® and compatible processors.
Announcements
FPGA community forums and blogs on community.intel.com are migrating to the new Altera Community and are read-only. For urgent support needs during this transition, please visit the FPGA Design Resources page or contact an Altera Authorized Distributor.
29283 Discussions

question regarding passing arrays to subroutines and optimization

burnesrr
Beginner
698 Views
I have been exploring ways to speed up a Fortran 90 fluid mechanics code I have written and have found that the way that a subset of a three-dimensional array is passed to a subroutine can have a large effect on how long it takes to run the code.

The level of optimization consists of using the "-fast" option during compile.

One of the subroutine calls I make is to compute a derivative using a 6th-order explicit scheme. Originally, the subroutine call looked like this:

SUBROUTINE compute_deriv
.
.
.
DO i=1,nx
CALL deriv(f=fs(i,:,k),df=dfs(i,:,k))
ENDDO
.
.
.
END SUBROUTINE compute_deriv

Using some calls to collect timing information, I find it takes 474 milliseconds to complete the subroutine called "COMPUTE_DERIV".

However, when I change the code to look like:

DO i=1,nx
ftmp = fs(i,:,k)
CALL deriv(f=ftmp,df=dftmp)
dfs(i,:,k) = dftmp
ENDDO

I find that the subroutine "COMPUTE_DERIV" completes in 231 milliseconds.

Note that fs and dfs are arrays of size (nx,ny,nz) and f, df, ftmp and dftmp are arrays of size (ny)

I am hoping someone out there can help me understand why specifying a subset of the array in the subroutine call results in much longer run-times when using the "-fast" level of optimization?
0 Kudos
8 Replies
Steven_L_Intel1
Employee
698 Views
In the first case, the compiler needs to do copy-in/copy-out of both arguments since you're passing a non-contiguous slice and the compiler does not know if either argument is read before writing. In the second case you're doing one copy in and one copy out, saving a set of two copies.

You might improve things for the first case by providing an explicit interface that has INTENT(IN) for the first argument and INTENT(OUT) for the second. It might even be better to declare the corresponding arguments in "deriv" as DIMENSION(:,:,:) which, along with an explicit interface, avoids the copies altogether, though it might slow down the subroutine - you'd have to test this.
0 Kudos
burnesrr
Beginner
698 Views
Hi Steve,

Thankyou for the thoughts regarding this issue. I do in-fact specify intent for all the parameters going into and out of the subroutines. Here are the additional details:

SUBROUTINE d_new(fs,dfs)
IMPLICIT NONE
REAL(dp),DIMENSION(:,:,:),INTENT(IN) :: fs
REAL(dp),DIMENSION(:,:,:),INTENT(INOUT) :: dfs

INTEGER :: i,j,k
REAL(dp),DIMENSION(ny) :: ftmp,dftmp

DO k=1,nz
DO i=1,nx
! ftmp = fs(i,:,k)
! CALL deriv_new(f=ftmp,df=dftmp)
! dfs(i,:,k) = dftmp
call deriv_new(f=fs(i,:,k),df=dfs(i,:,k))
ENDDO
ENDDO

RETURN
END SUBROUTINE d_new
!********************************************************
SUBROUTINE deriv_new(f,df)
IMPLICIT NONE
REAL(dp),DIMENSION(nx),INTENT(IN) :: f
REAL(dp),DIMENSION(nx),INTENT(INOUT) :: df
.
.
.

Does this change your view of what is happening?

Sincerely,
Rick
0 Kudos
mecej4
Honored Contributor III
698 Views
In Fortran, declarations have a scope that is limited to the subprogram (or module, but you don't have any in what you posted) in which they occur. Therefore, INTENT declarations in subroutine deriv_new are not visible in the caller, d_new.

Either provide an explicit interface for deriv_new to the caller, or put deriv_new in a module and USE the module in the caller.
0 Kudos
burnesrr
Beginner
698 Views
Thankyou for the responses. I have included these routines within a module and the timing information provided in my initial post is what I am seeing when this is the case.

I have a module called params, which contains all the relevant variable definitions. Then I have a separate module called functions, which contains the subroutine calls posted above and there is a "USE params" statement at the top of the module called functions.

I believe this is what you are stating - any more thoughts on what is happening?

Sincerely,
Rick
0 Kudos
mecej4
Honored Contributor III
698 Views
If the called function bodies are CONTAINed in a module and the routine with the caller contains a USE for that module, then things will be OK.

It is not clear from your description whether this is indeed so.
0 Kudos
burnesrr
Beginner
698 Views
Here is the module params in the file params.f90:

MODULE params
USE kinds
IMPLICIT NONE

REAL(dp) :: pi

INTEGER,PARAMETER :: nx = 128
INTEGER,PARAMETER :: ny = 128
INTEGER,PARAMETER :: nz = 128

REAL(dp) :: lengthscale

INTEGER,PARAMETER :: debugunit = 200
CHARACTER(LEN=80) :: debugfile='debug.txt'

REAL(dp),DIMENSION(:),ALLOCATABLE :: xold,yold,zold
REAL(dp),DIMENSION(:),ALLOCATABLE :: xnew,ynew,znew
INTEGER,DIMENSION(:),ALLOCATABLE :: nstart,nend
REAL(dp),DIMENSION(:,:),ALLOCATABLE :: f_old, df_old
REAL(dp),DIMENSION(:,:,:),ALLOCATABLE :: f_new, df_new

!******************************************************************

TYPE :: centered_scheme
REAL(dp) :: a,b,c,d
END TYPE centered_scheme

END MODULE params


AND here is a portion of the module functions in the file functions.f90:

MODULE functions
USE params
IMPLICIT NONE

! DATE_AND_TIME variables. DATE_AND_TIME is a standard fortran90 routine.
INTEGER,DIMENSION(8) :: last_time = 0 ! last wall clock time
LOGICAL :: first_time_call = .TRUE.

TYPE (centered_scheme), PRIVATE :: &
exp6_1 = centered_scheme( 3._dp/4._dp, -3._dp/20._dp, 1._dp/60._dp, 0._dp ), &
exp6_2 = centered_scheme( -49._dp/18._dp, 3._dp/2._dp, -3._dp/20._dp, 1._dp/90._dp )

CONTAINS
!**********************************************************
SUBROUTINE d_new(fs,dfs)
IMPLICIT NONE
REAL(dp),DIMENSION(:,:,:),INTENT(IN) :: fs
REAL(dp),DIMENSION(:,:,:),INTENT(INOUT) :: dfs

INTEGER :: i,j,k
REAL(dp),DIMENSION(ny) :: ftmp,dftmp

DO k=1,nz
DO i=1,nx
! ftmp = fs(i,:,k)
! CALL deriv_new(f=ftmp,df=dftmp)
! dfs(i,:,k) = dftmp
call deriv_new(f=fs(i,:,k),df=dfs(i,:,k))
ENDDO
ENDDO

RETURN
END SUBROUTINE d_new
!**********************************************************
SUBROUTINE deriv_new(f,df)
IMPLICIT NONE
REAL(dp),DIMENSION(ny),INTENT(IN) :: f
REAL(dp),DIMENSION(ny),INTENT(INOUT) :: df
.
.
.
END SUBROUTINE deriv_new
.
.
.
END MODULE functions

So my question is, how would placing the "USE params" at the top of the module functions be different than placing a separate "USE params" within each subroutine?

Sincerely,
Rick
0 Kudos
mecej4
Honored Contributor III
698 Views
> So my question is, how would placing the "USE params" at the top of the 
module functions be different than placing a separate "USE params" within each subroutine?


If the variable declarations in Module functions use some of the constants
declared in Module params, you definitely need a "USE params" at the module
level. It is probably a good idea to restrict what is used by adding an
ONLY : ... clause.

The bad outcome of putting in "USE params" at the top of Module functions is
that you expose a number of variables to mistaken overwriting in the contained
routines of that module. Suppose you had a third routine in addition to the two
that you displayed, and it did not need any of the module variables. If you had
a local variable in the routine for which you forgot to give a declaration in the
routine, the module variable would be used and could be written to, possibly
an unwanted side-effect.
0 Kudos
burnesrr
Beginner
698 Views
Hi mecej4,

Thankyou for the input. I am still interested in understanding how this organization of the modules and the way I am calling the subroutines results in the time discrepancies. Please refer to the original post - I am very interested in understanding how specifying the subset of an array in a subroutine call results in such a large increase in the time required to complete the call.

Sincerely,
Rick
0 Kudos
Reply