Intel® Fortran Compiler
Build applications that can scale for the future with optimized code designed for Intel® Xeon® and compatible processors.

Copy small arrays many times speed difference


In my program I need to copy small arrays (e.g. 20 elements) back and forth many times (changing values in between, obviously), and I've found that this takes a substantial share of the program's run time. To understand what is happening I made a small test program. I compiled it with both ifort (18.0.1) and gfortran (7.3.0), and ifort takes significantly more time (9.3 s versus 1.8 s). Running with static arrays gives 3.8 s with ifort and 0.0 s with gfortran; the latter probably figured out that I'm doing nothing... The test was run with the script given below. I'm running on a Linux cluster, hence the module load and purge commands, which give equal testing conditions and load the newest compilers.

Is there any way of improving the performance under ifort? (The CPUs are Intel Haswell, so this should be in Intel's favour.) Please let me know if any specific output information is required.

program main
implicit none
    integer, parameter              :: nvar=20
    integer                         :: clock_count, clock_rate, k1
    double precision, allocatable   :: var_old(:), var_new(:)
   !double precision                :: var_old(nvar), var_new(nvar)   ! static-array variant
    double precision                :: time
    allocate(var_old(nvar), var_new(nvar))   ! omit this line for the static-array variant
    var_old = 1.d0
    var_new = 0.d0
    call system_clock ( clock_count, clock_rate)
    ! dble() keeps the full tick count; single-precision real() can round large counts
    time = dble(clock_count)/dble(clock_rate)
    do k1=1,int(5e8)
        ! Loop body is missing from the post as captured; a plain array copy is assumed
        var_new = var_old
    end do
    call system_clock(clock_count, clock_rate)
    time = dble(clock_count)/dble(clock_rate) - time
    write(*,*) time
end program

Test script:

module purge
module load foss # Loading gnu toolchain
gfortran -O3 speedtest.f90 -o speed_gfortran
echo "Running speed test with gfortran:"
./speed_gfortran
module purge
module load intel #Loading intel toolchain
ifort -O3 speedtest.f90 -o speed_ifort
echo "Running speed test with ifort:"
./speed_ifort
echo "Test complete"


1 Solution

As you note, a compiler can decide to remove the copying, since the result is never used. I note that ifort doesn't do so here, which skews the results. You'll get better ifort results by adding -xHost, so that it can use more advanced copying instructions, but your comparison is likely not valid for a real-world program.
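
One way to make such a test harder for an optimizer to discard is to let every copy feed into a value that is printed at the end. The following is only a sketch under assumptions, not the original poster's code: the program name, the iteration count niter and the checksum variable are made up for illustration, and a sufficiently clever compiler could still simplify the loop.

```fortran
program copy_live
    use, intrinsic :: iso_fortran_env, only: int64, real64
    implicit none
    integer, parameter :: nvar  = 20          ! same array size as in the question
    integer, parameter :: niter = 10000000    ! hypothetical iteration count
    integer(int64)     :: t0, t1, rate
    integer            :: k, idx
    real(real64), allocatable :: var_old(:), var_new(:)
    real(real64)       :: checksum

    allocate(var_old(nvar), var_new(nvar))
    var_old = 1.0_real64
    var_new = 0.0_real64

    call system_clock(t0, rate)               ! 64-bit counters give a finer tick
    do k = 1, niter
        var_new = var_old                     ! copy forth
        idx = mod(k, nvar) + 1                ! touch a different element each pass
        var_new(idx) = var_new(idx) + 1.0_real64
        var_old = var_new                     ! copy back
    end do
    call system_clock(t1)

    ! Using the final array contents keeps the copies observable.
    checksum = sum(var_old)

    write(*,*) 'elapsed (s):', real(t1 - t0, real64) / real(rate, real64)
    write(*,*) 'checksum   :', checksum
end program copy_live
```

Each pass adds exactly 1 to one element, so the printed checksum should equal nvar + niter; a different value would mean the loop did not do the work being timed.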


Thank you for your reply, Steve. The -xHost flag did improve the speed, and it seems gfortran was doing fancy stuff to avoid the very work I wanted to test. I updated the test by adding a call to a separately compiled subroutine, which gives 15 s for gfortran and 9.1 s / 7.0 s for ifort (without and with -xHost).

New loop:

    do k1=1,int(5e8)
        call dummy(var_old, var_new)
    end do

New module containing dummy:

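module test
implicit none

contains

! dummy takes assumed-shape arrays, so the caller needs "use test" for an explicit interface
subroutine dummy(var_old, var_new)
implicit none
double precision, intent(inout):: var_old(:), var_new(:)

var_new = var_old + 1.d0

end subroutine

end module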

And new test script:

module purge
module load foss
gfortran -O3 -c dummy.f90
gfortran -O3 -c speedtest.f90
gfortran dummy.o speedtest.o -o speed_gfortran
#gfortran -O3 -g speedtest.f90 -o speed_gfortran
echo "Running speed test with gfortran:"
./speed_gfortran
module purge
module load intel
ifort -O3 -c -xHost dummy.f90
ifort -O3 -c -xHost speedtest.f90
ifort dummy.o speedtest.o -o speed_ifort
#ifort -O3 -g speedtest.f90 -o speed_ifort
echo "Running speed test with ifort:"
./speed_ifort
echo "Test complete"

