Copy small arrays many times speed difference

In my program I need to copy small arrays (e.g. 20 elements) back and forth many times (changing the values in between, of course), and I've found that this takes a substantial share of the run time. To understand what is happening I wrote a small test program. Compiled with both ifort (18.0.1) and gfortran (7.3.0), the ifort build takes significantly longer (9.3 s versus 1.8 s). With static arrays I get 3.8 s with ifort and 0.0 s with gfortran; the latter probably figured out that I'm doing nothing. I ran the test with the script given below. (I'm on a Linux cluster, hence the module load and purge commands, which give equal testing conditions and load the newest compilers.)

Is there any way of improving the performance under ifort? (The CPUs are Intel Haswell, so this should be in Intel's favour.) Please let me know if any specific output is required.

program main
implicit none
    integer, parameter              :: nvar=20
    integer                         :: clock_count, clock_rate, k1
    double precision, allocatable   :: var_old(:), var_new(:)
   !double precision                :: var_old(nvar), var_new(nvar)
    double precision                :: time
    
    allocate(var_old(nvar), var_new(nvar))
    
    var_old = 1.d0
    var_new = 0.d0
    
    ! Convert in double precision: a default real cannot hold a large clock count exactly
    call system_clock(clock_count, clock_rate)
    time = dble(clock_count)/dble(clock_rate)

    do k1=1,int(5e8)
        var_new=var_old
        var_old=var_new
    enddo

    call system_clock(clock_count, clock_rate)
    time = dble(clock_count)/dble(clock_rate) - time
    
    write(*,*) time
    
end program

Test script:

module purge
module load foss # Loading gnu toolchain
gfortran -O3 speedtest.f90 -o speed_gfortran
echo "Running speed test with gfortran:"
./speed_gfortran
module purge
module load intel #Loading intel toolchain
ifort -O3 speedtest.f90 -o speed_ifort
echo "Running speed test with ifort:"
./speed_ifort
echo "Test complete"

 

Accepted Solutions

As you note, a compiler can decide to remove the copying since you never use the result. I note that ifort doesn't do so, which skews the results. You'll get better ifort results by adding -xHost, so that it can use more advanced copying instructions, but your comparison is likely not valid for a real-world program.

Steve (aka "Doctor Fortran") - https://stevelionel.com/drfortran
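
One way to keep a compiler from discarding the copy loop is to make its result observable. The following is only a minimal sketch of the original test under that assumption: it changes one element between the two copies (as the real program is said to do) and prints a checksum afterwards. The names copy_checksum, niter, t0, t1, rate and the checksum output are introduced here for illustration and are not from the original post, and an aggressive optimizer may still simplify parts of the loop.

```fortran
program copy_checksum
    use, intrinsic :: iso_fortran_env, only: int64, real64
    implicit none
    integer, parameter        :: nvar  = 20
    integer, parameter        :: niter = 500000000   ! same count as int(5e8)
    integer                   :: k1
    integer(int64)            :: t0, t1, rate         ! 64-bit counts avoid early wrap-around
    real(real64), allocatable :: var_old(:), var_new(:)

    allocate(var_old(nvar), var_new(nvar))
    var_old = 1.0_real64
    var_new = 0.0_real64

    call system_clock(t0, rate)
    do k1 = 1, niter
        var_new    = var_old
        ! Change one value between the copies so each iteration depends on the previous one.
        var_new(1) = var_new(1) + 1.0_real64
        var_old    = var_new
    end do
    call system_clock(t1)

    write(*,*) 'elapsed [s]:', real(t1 - t0, real64) / real(rate, real64)
    ! Printing a value derived from the arrays makes the loop's result observable.
    ! Mathematically the sum equals nvar + niter.
    write(*,*) 'checksum   :', sum(var_old)
end program copy_checksum
```

It can be built with the same commands as in the script above, with and without -xHost for ifort, to see whether the gap between the two compilers persists once the copies can no longer be dropped.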

Thank you for your reply, Steve. Adding -xHost did improve the speed, and it seems gfortran was optimizing away the very copies I wanted to time. I updated the test by adding a call to a separately compiled subroutine, which gives 15 s for gfortran and 9.1 s / 7.0 s for ifort (without and with -xHost).

New loop:

    do k1=1,int(5e8)
        call dummy(var_old, var_new)
        var_old=var_new
    enddo

New module containing dummy:

module test
implicit none

contains

subroutine dummy(var_old, var_new)
implicit none
double precision, intent(inout):: var_old(:), var_new(:)

var_new = var_old + 1.d0

end subroutine

end module
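
For readers assembling the two pieces: because dummy takes assumed-shape arguments, the main program needs an explicit interface, which it gets by using the module. The post does not show that line, so the `use test` statement and the final write below are assumptions of this sketch, and timing is left out for brevity. The listing is shown as one piece; to reproduce the separate-compilation setup, the module would go in dummy.f90 and the program in speedtest.f90, as in the script.

```fortran
! --- dummy.f90: module from the post ---
module test
    implicit none
contains
    subroutine dummy(var_old, var_new)
        double precision, intent(inout) :: var_old(:), var_new(:)
        var_new = var_old + 1.d0
    end subroutine dummy
end module test

! --- speedtest.f90: driver (assumed layout) ---
program main
    use test, only: dummy   ! explicit interface, required for assumed-shape arguments
    implicit none
    integer, parameter            :: nvar = 20
    integer                       :: k1
    double precision, allocatable :: var_old(:), var_new(:)

    allocate(var_old(nvar), var_new(nvar))
    var_old = 1.d0
    var_new = 0.d0

    do k1 = 1, int(5e8)
        call dummy(var_old, var_new)
        var_old = var_new
    end do

    ! Assumption for this sketch: print one element so the result is used.
    write(*,*) var_old(1)
end program main
```

Keeping the two parts in separate files matters for the measurement: in a single file the compiler is free to inline dummy, which is what the separate compilation was meant to prevent.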

And new test script:

module purge
module load foss
gfortran -O3 -c dummy.f90
gfortran -O3 -c speedtest.f90
gfortran dummy.o speedtest.o -o speed_gfortran
#gfortran -O3 -g speedtest.f90 -o speed_gfortran
echo "Running speed test with gfortran:"
./speed_gfortran
module purge
module load intel
ifort -O3 -c -xHost dummy.f90
ifort -O3 -c -xHost speedtest.f90
ifort dummy.o speedtest.o -o speed_ifort
#ifort -O3 -g speedtest.f90 -o speed_ifort
echo "Running speed test with ifort:"
./speed_ifort
echo "Test complete"

 
