Intel® Fortran Compiler
Build applications that can scale for the future with optimized code designed for Intel® Xeon® and compatible processors.

Why is dynamically allocated storage slower than static storage with ifort?

ajboeckmannyahoo_com

My application uses arrays whose sizes vary widely. I would like
to use dynamically allocated storage, but the run times are always at
least 20% worse than with static arrays. I have observed this with
ifort on Windows and on Unix/Linux/Mac OS X, with and without
optimization.

Is there any ifort compiler option that will enable ALLOCATABLE arrays to
be accessed as fast as static arrays?

Is there any way to allocate the storage dynamically that will not
affect run time?

The following program is designed to run on Unix/Linux/MacOSX. The
constant SIZE can be changed to a larger or smaller value if the run
times are too small or too large for a given processor.


c255b% cat staticdynamic.f90
PROGRAM STATICDYN
REAL, ALLOCATABLE:: DYN(:)
INTEGER, PARAMETER:: SIZE=10000000
REAL STATIC(SIZE)
ALLOCATE (DYN(SIZE))
DYN=0.; STATIC=0;
CALL SYSTEM('/bin/date')
DO I=1,100; STATIC=STATIC+1.; ENDDO
CALL SYSTEM('/bin/date')
DO I=1,100; DYN=DYN+1.; ENDDO
CALL SYSTEM('/bin/date')
END

Ron_Green
Moderator

Read my response #4 in this thread:

http://software.intel.com/en-us/forums/showthread.php?t=70229

The gist of it: in the static case, the optimizer determines that the data is never used and removes the dead code. With allocatable data, it assumes that because you allocated the storage you do not want the apparently dead code removed, so it still executes the do-nothing loop.

To answer the question: either data allocation method is fine; there is no difference in real-world code execution times. Use whichever method is most straightforward. Keep in mind that static data is limited to 2 GB on many operating systems, even when the OS is 64-bit. For that reason, I recommend allocatable data.
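One way to check the dead-code explanation is to make both results observable and time the loops inside the program. The following is only a sketch under assumptions: the program name `staticdyn_timed`, the variable names, and the use of `system_clock` in place of `CALL SYSTEM('/bin/date')` are illustrative choices and do not come from the original test case.

```fortran
program staticdyn_timed
  implicit none
  integer, parameter :: i8 = selected_int_kind(18)  ! wide integers for the clock counts
  integer, parameter :: n = 10000000                ! same value as SIZE in the original post
  integer, parameter :: reps = 100
  real :: static_arr(n)                             ! size fixed at compile time
  real, allocatable :: dyn(:)                       ! size chosen at run time
  integer(i8) :: t0, t1, t2, rate
  integer :: i

  allocate (dyn(n))
  static_arr = 0.0
  dyn = 0.0

  call system_clock(t0, rate)
  do i = 1, reps
     static_arr = static_arr + 1.0
  end do
  call system_clock(t1)
  do i = 1, reps
     dyn = dyn + 1.0
  end do
  call system_clock(t2)

  ! Printing an element of each array makes both results observable,
  ! so neither loop is trivially dead code.
  print '(a,f9.3,a,f8.1)', 'static : ', real(t1 - t0) / real(rate), &
       ' s, last element = ', static_arr(n)
  print '(a,f9.3,a,f8.1)', 'dynamic: ', real(t2 - t1) / real(rate), &
       ' s, last element = ', dyn(n)

  deallocate (dyn)
end program staticdyn_timed
```

Build it with the same options for both measurements. Using the results does not guarantee that the compiler generates identical code for the two loops, so any remaining gap still needs to be examined (for example, through the vectorization remarks).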

ron

ajboeckmannyahoo_com
Thanks, Ron.
But I do see a difference in real-world execution time.
With default optimization:
static: 2 secs
dynamic: 16 secs
With -g (no optimization):
static: 31 secs
dynamic: 49 secs
How can this be?
-------

venice% ifort staticdyn.f90
staticdyn.f90(6): (col. 7) remark: LOOP WAS VECTORIZED.
staticdyn.f90(8): (col. 37) remark: PERMUTED LOOP WAS VECTORIZED.
staticdyn.f90(10): (col. 19) remark: LOOP WAS VECTORIZED.
venice% a.out
Fri Jan 15 13:35:28 PST 2010
Fri Jan 15 13:35:30 PST 2010
Fri Jan 15 13:35:46 PST 2010
venice% ifort -g staticdyn.f90
venice% a.out
Fri Jan 15 13:36:09 PST 2010
Fri Jan 15 13:36:40 PST 2010
Fri Jan 15 13:37:29 PST 2010

venice% ifort -V
Intel Fortran Compiler Professional for applications running on Intel 64, Version 11.0 Build 20090318 Package ID: m_cprof_p_11.0.064
Copyright (C) 1985-2009 Intel Corporation. All rights reserved.

venice% uname -a
Darwin ariel2.ucsf.edu 9.8.0 Darwin Kernel Version 9.8.0: Wed Jul 15 16:55:01 PDT 2009; root:xnu-1228.15.4~1/RELEASE_I386 i386

venice% cat staticdyn.f90
program staticdyn
real, allocatable:: dyn(:)
integer, parameter:: k=100000000
real static(k)
allocate (dyn(k))
dyn=0.; static=0;
call system('/bin/date')
do i=1,100; static=static+1.; enddo
call system('/bin/date')
do i=1,100; dyn=dyn+1.; enddo
call system('/bin/date')
end
jimdempseyatthecove
Honored Contributor III

Using dynamically allocated storage requires the compiler to generate and use the information contained within an array descriptor.

When using statically allocated storage the compiler can generate code that bypasses the array descriptor.

The problem with statically allocated arrays is the 2GB limitation. Depending on your application, you may need to weigh access speed against the size limit on static data.

With dynamically allocated arrays, you can still choose how the arrays are passed into a subroutine:

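[bash]! Option 1: assumed-shape dummy argument (array descriptor is used)
subroutine foo_assumed_shape(a)
  real :: a(:)
  ! slower access to a(i)
end subroutine foo_assumed_shape

! Option 2: explicit-shape dummy argument (size passed separately)
subroutine foo_explicit_shape(a, s)
  integer :: s
  real :: a(s)
  ! faster access to a(i)
end subroutine foo_explicit_shape
[/bash]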

The IVF documentation addresses this issue in greater detail.
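To make the two calling conventions concrete, here is a minimal, self-contained sketch that passes the same allocatable array to an assumed-shape dummy and to an explicit-shape dummy. The module name `add_kernels`, the routine names, and the array size are hypothetical and chosen only for illustration. The sketch shows the interfaces and makes no claim about how large the speed difference is.

```fortran
module add_kernels
  implicit none
contains
  ! Assumed-shape dummy: the extent comes from the array descriptor.
  subroutine add_one_assumed(a)
    real, intent(inout) :: a(:)
    integer :: i
    do i = 1, size(a)
       a(i) = a(i) + 1.0
    end do
  end subroutine add_one_assumed

  ! Explicit-shape dummy: the extent is passed as a separate argument.
  subroutine add_one_explicit(a, s)
    integer, intent(in) :: s
    real, intent(inout) :: a(s)
    integer :: i
    do i = 1, s
       a(i) = a(i) + 1.0
    end do
  end subroutine add_one_explicit
end module add_kernels

program pass_allocatable
  use add_kernels
  implicit none
  integer, parameter :: n = 1000     ! illustrative size only
  real, allocatable :: dyn(:)

  allocate (dyn(n))
  dyn = 0.0

  call add_one_assumed(dyn)          ! explicit interface comes from the module
  call add_one_explicit(dyn, n)      ! allocatable array is contiguous, so this is safe

  print *, dyn(1), dyn(n)            ! each element was incremented twice

  deallocate (dyn)
end program pass_allocatable
```

Putting the routines in a module gives the caller the explicit interface that an assumed-shape dummy requires. The explicit-shape version relies on the caller passing a size that does not exceed the actual array.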

Jim Dempsey
