- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
I have been exploring ways to speed up a Fortran 90 fluid mechanics code I have written and have found that the way that a subset of a three-dimensional array is passed to a subroutine can have a large effect on how long it takes to run the code.
The level of optimization consists of using the "-fast" option during compile.
One of the subroutine calls I make is to compute a derivative using a 6th-order explicit scheme. Originally, the subroutine call looked like this:
SUBROUTINE compute_deriv
.
.
.
DO i=1,nx
CALL deriv(f=fs(i,:,k),df=dfs(i,:,k))
ENDDO
.
.
.
END SUBROUTINE compute_deriv
Using some calls to collect timing information, I find it takes 474 milliseconds to complete the subroutine called "COMPUTE_DERIV".
However, when I change the code to look like:
DO i=1,nx
ftmp = fs(i,:,k)
CALL deriv(f=ftmp,df=dftmp)
dfs(i,:,k) = dftmp
ENDDO
I find that the subroutine "COMPUTE_DERIV" completes in 231 milliseconds.
Note that fs and dfs are arrays of size (nx,ny,nz) and f, df, ftmp and dftmp are arrays of size (ny)
I am hoping someone out there can help me understand why specifying a subset of the array in the subroutine call results in much longer run-times when using the "-fast" level of optimization?
The level of optimization consists of using the "-fast" option during compile.
One of the subroutine calls I make is to compute a derivative using a 6th-order explicit scheme. Originally, the subroutine call looked like this:
SUBROUTINE compute_deriv
.
.
.
DO i=1,nx
CALL deriv(f=fs(i,:,k),df=dfs(i,:,k))
ENDDO
.
.
.
END SUBROUTINE compute_deriv
Using some calls to collect timing information, I find it takes 474 milliseconds to complete the subroutine called "COMPUTE_DERIV".
However, when I change the code to look like:
DO i=1,nx
ftmp = fs(i,:,k)
CALL deriv(f=ftmp,df=dftmp)
dfs(i,:,k) = dftmp
ENDDO
I find that the subroutine "COMPUTE_DERIV" completes in 231 milliseconds.
Note that fs and dfs are arrays of size (nx,ny,nz) and f, df, ftmp and dftmp are arrays of size (ny)
I am hoping someone out there can help me understand why specifying a subset of the array in the subroutine call results in much longer run-times when using the "-fast" level of optimization?
Link Copied
8 Replies
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
In the first case, the compiler needs to do copy-in/copy-out of both arguments since you're passing a non-contiguous slice and the compiler does not know if either argument is read before writing. In the second case you're doing one copy in and one copy out, saving a set of two copies.
You might improve things for the first case by providing an explicit interface that has INTENT(IN) for the first argument and INTENT(OUT) for the second. It might even be better to declare the corresponding arguments in "deriv" as DIMENSION(:,:,:) which, along with an explicit interface, avoids the copies altogether, though it might slow down the subroutine - you'd have to test this.
You might improve things for the first case by providing an explicit interface that has INTENT(IN) for the first argument and INTENT(OUT) for the second. It might even be better to declare the corresponding arguments in "deriv" as DIMENSION(:,:,:) which, along with an explicit interface, avoids the copies altogether, though it might slow down the subroutine - you'd have to test this.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Hi Steve,
Thankyou for the thoughts regarding this issue. I do in-fact specify intent for all the parameters going into and out of the subroutines. Here are the additional details:
SUBROUTINE d_new(fs,dfs)
IMPLICIT NONE
REAL(dp),DIMENSION(:,:,:),INTENT(IN) :: fs
REAL(dp),DIMENSION(:,:,:),INTENT(INOUT) :: dfs
INTEGER :: i,j,k
REAL(dp),DIMENSION(ny) :: ftmp,dftmp
DO k=1,nz
DO i=1,nx
! ftmp = fs(i,:,k)
! CALL deriv_new(f=ftmp,df=dftmp)
! dfs(i,:,k) = dftmp
call deriv_new(f=fs(i,:,k),df=dfs(i,:,k))
ENDDO
ENDDO
RETURN
END SUBROUTINE d_new
!********************************************************
SUBROUTINE deriv_new(f,df)
IMPLICIT NONE
REAL(dp),DIMENSION(nx),INTENT(IN) :: f
REAL(dp),DIMENSION(nx),INTENT(INOUT) :: df
.
.
.
Does this change your view of what is happening?
Sincerely,
Rick
Thankyou for the thoughts regarding this issue. I do in-fact specify intent for all the parameters going into and out of the subroutines. Here are the additional details:
SUBROUTINE d_new(fs,dfs)
IMPLICIT NONE
REAL(dp),DIMENSION(:,:,:),INTENT(IN) :: fs
REAL(dp),DIMENSION(:,:,:),INTENT(INOUT) :: dfs
INTEGER :: i,j,k
REAL(dp),DIMENSION(ny) :: ftmp,dftmp
DO k=1,nz
DO i=1,nx
! ftmp = fs(i,:,k)
! CALL deriv_new(f=ftmp,df=dftmp)
! dfs(i,:,k) = dftmp
call deriv_new(f=fs(i,:,k),df=dfs(i,:,k))
ENDDO
ENDDO
RETURN
END SUBROUTINE d_new
!********************************************************
SUBROUTINE deriv_new(f,df)
IMPLICIT NONE
REAL(dp),DIMENSION(nx),INTENT(IN) :: f
REAL(dp),DIMENSION(nx),INTENT(INOUT) :: df
.
.
.
Does this change your view of what is happening?
Sincerely,
Rick
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
In Fortran, declarations have a scope that is limited to the subprogram (or module, but you don't have any in what you posted) in which they occur. Therefore, INTENT declarations in subroutine deriv_new are not visible in the caller, d_new.
Either provide an explicit interface for deriv_new to the caller, or put deriv_new in a module and USE the module in the caller.
Either provide an explicit interface for deriv_new to the caller, or put deriv_new in a module and USE the module in the caller.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Thankyou for the responses. I have included these routines within a module and the timing information provided in my initial post is what I am seeing when this is the case.
I have a module called params, which contains all the relevant variable definitions. Then I have a separate module called functions, which contains the subroutine calls posted above and there is a "USE params" statement at the top of the module called functions.
I believe this is what you are stating - any more thoughts on what is happening?
Sincerely,
Rick
I have a module called params, which contains all the relevant variable definitions. Then I have a separate module called functions, which contains the subroutine calls posted above and there is a "USE params" statement at the top of the module called functions.
I believe this is what you are stating - any more thoughts on what is happening?
Sincerely,
Rick
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
If the called function bodies are CONTAINed in a module and the routine with the caller contains a USE for that module, then things will be OK.
It is not clear from your description whether this is indeed so.
It is not clear from your description whether this is indeed so.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Here is the module params in the file params.f90:
MODULE params
USE kinds
IMPLICIT NONE
REAL(dp) :: pi
INTEGER,PARAMETER :: nx = 128
INTEGER,PARAMETER :: ny = 128
INTEGER,PARAMETER :: nz = 128
REAL(dp) :: lengthscale
INTEGER,PARAMETER :: debugunit = 200
CHARACTER(LEN=80) :: debugfile='debug.txt'
REAL(dp),DIMENSION(:),ALLOCATABLE :: xold,yold,zold
REAL(dp),DIMENSION(:),ALLOCATABLE :: xnew,ynew,znew
INTEGER,DIMENSION(:),ALLOCATABLE :: nstart,nend
REAL(dp),DIMENSION(:,:),ALLOCATABLE :: f_old, df_old
REAL(dp),DIMENSION(:,:,:),ALLOCATABLE :: f_new, df_new
!******************************************************************
TYPE :: centered_scheme
REAL(dp) :: a,b,c,d
END TYPE centered_scheme
END MODULE params
AND here is a portion of the module functions in the file functions.f90:
MODULE functions
USE params
IMPLICIT NONE
! DATE_AND_TIME variables. DATE_AND_TIME is a standard fortran90 routine.
INTEGER,DIMENSION(8) :: last_time = 0 ! last wall clock time
LOGICAL :: first_time_call = .TRUE.
TYPE (centered_scheme), PRIVATE :: &
exp6_1 = centered_scheme( 3._dp/4._dp, -3._dp/20._dp, 1._dp/60._dp, 0._dp ), &
exp6_2 = centered_scheme( -49._dp/18._dp, 3._dp/2._dp, -3._dp/20._dp, 1._dp/90._dp )
CONTAINS
!**********************************************************
SUBROUTINE d_new(fs,dfs)
IMPLICIT NONE
REAL(dp),DIMENSION(:,:,:),INTENT(IN) :: fs
REAL(dp),DIMENSION(:,:,:),INTENT(INOUT) :: dfs
INTEGER :: i,j,k
REAL(dp),DIMENSION(ny) :: ftmp,dftmp
DO k=1,nz
DO i=1,nx
! ftmp = fs(i,:,k)
! CALL deriv_new(f=ftmp,df=dftmp)
! dfs(i,:,k) = dftmp
call deriv_new(f=fs(i,:,k),df=dfs(i,:,k))
ENDDO
ENDDO
RETURN
END SUBROUTINE d_new
!**********************************************************
SUBROUTINE deriv_new(f,df)
IMPLICIT NONE
REAL(dp),DIMENSION(ny),INTENT(IN) :: f
REAL(dp),DIMENSION(ny),INTENT(INOUT) :: df
.
.
.
END SUBROUTINE deriv_new
.
.
.
END MODULE functions
So my question is, how would placing the "USE params" at the top of the module functions be different than placing a separate "USE params" within each subroutine?
Sincerely,
Rick
MODULE params
USE kinds
IMPLICIT NONE
REAL(dp) :: pi
INTEGER,PARAMETER :: nx = 128
INTEGER,PARAMETER :: ny = 128
INTEGER,PARAMETER :: nz = 128
REAL(dp) :: lengthscale
INTEGER,PARAMETER :: debugunit = 200
CHARACTER(LEN=80) :: debugfile='debug.txt'
REAL(dp),DIMENSION(:),ALLOCATABLE :: xold,yold,zold
REAL(dp),DIMENSION(:),ALLOCATABLE :: xnew,ynew,znew
INTEGER,DIMENSION(:),ALLOCATABLE :: nstart,nend
REAL(dp),DIMENSION(:,:),ALLOCATABLE :: f_old, df_old
REAL(dp),DIMENSION(:,:,:),ALLOCATABLE :: f_new, df_new
!******************************************************************
TYPE :: centered_scheme
REAL(dp) :: a,b,c,d
END TYPE centered_scheme
END MODULE params
AND here is a portion of the module functions in the file functions.f90:
MODULE functions
USE params
IMPLICIT NONE
! DATE_AND_TIME variables. DATE_AND_TIME is a standard fortran90 routine.
INTEGER,DIMENSION(8) :: last_time = 0 ! last wall clock time
LOGICAL :: first_time_call = .TRUE.
TYPE (centered_scheme), PRIVATE :: &
exp6_1 = centered_scheme( 3._dp/4._dp, -3._dp/20._dp, 1._dp/60._dp, 0._dp ), &
exp6_2 = centered_scheme( -49._dp/18._dp, 3._dp/2._dp, -3._dp/20._dp, 1._dp/90._dp )
CONTAINS
!**********************************************************
SUBROUTINE d_new(fs,dfs)
IMPLICIT NONE
REAL(dp),DIMENSION(:,:,:),INTENT(IN) :: fs
REAL(dp),DIMENSION(:,:,:),INTENT(INOUT) :: dfs
INTEGER :: i,j,k
REAL(dp),DIMENSION(ny) :: ftmp,dftmp
DO k=1,nz
DO i=1,nx
! ftmp = fs(i,:,k)
! CALL deriv_new(f=ftmp,df=dftmp)
! dfs(i,:,k) = dftmp
call deriv_new(f=fs(i,:,k),df=dfs(i,:,k))
ENDDO
ENDDO
RETURN
END SUBROUTINE d_new
!**********************************************************
SUBROUTINE deriv_new(f,df)
IMPLICIT NONE
REAL(dp),DIMENSION(ny),INTENT(IN) :: f
REAL(dp),DIMENSION(ny),INTENT(INOUT) :: df
.
.
.
END SUBROUTINE deriv_new
.
.
.
END MODULE functions
So my question is, how would placing the "USE params" at the top of the module functions be different than placing a separate "USE params" within each subroutine?
Sincerely,
Rick
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
> So my question is, how would placing the "USE params" at the top of the
module functions be different than placing a separate "USE params" within each subroutine?
If the variable declarations in Module functions use some of the constants
declared in Module params, you definitely need a "USE params" at the module
level. It is probably a good idea to restrict what is used by adding an
ONLY : ... clause.
The bad outcome of putting in "USE params" at the top of Module functions is
that you expose a number of variables to mistaken overwriting in the contained
routines of that module. Suppose you had a third routine in addition to the two
that you displayed, and it did not need any of the module variables. If you had
a local variable in the routine for which you forgot to give a declaration in the
routine, the module variable would be used and could be written to, possibly
an unwanted side-effect.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Hi mecej4,
Thankyou for the input. I am still interested in understanding how this organization of the modules and the way I am calling the subroutines results in the time discrepancies. Please refer to the original post - I am very interested in understanding how specifying the subset of an array in a subroutine call results in such a large increase in the time required to complete the call.
Sincerely,
Rick
Thankyou for the input. I am still interested in understanding how this organization of the modules and the way I am calling the subroutines results in the time discrepancies. Please refer to the original post - I am very interested in understanding how specifying the subset of an array in a subroutine call results in such a large increase in the time required to complete the call.
Sincerely,
Rick

Reply
Topic Options
- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page