Intel® Fortran Compiler

More efficient: passing vars to a subroutine or having them in modules?

rreis
New Contributor I

Well, that's it: what's more efficient?
TimP
Honored Contributor III
I guess you're expecting us to infer a question from your title.
I will suppose that you wish to compare assumed-size arrays and module arrays.
For vectorizable code, assumed-size arrays require the compiler to allow for all combinations of alignments. When vectorizing a loop with up to 3 assumed-size arrays, the Intel compiler will make multiple code versions so as to use more aligned operations where that is possible, producing extreme code bloat. With a larger number of assumed-size arrays, the compiler won't attempt to align more than one of them. Run time could then be up to double the optimum on a Core 2 processor, or possibly on the future AVX; on the current Core i7, this penalty is normally small. On any CPU, the overhead of these adjustments may destroy the advantage of vectorization on short loops, less than length 16 or so.
If the arrays are correctly aligned so that no alignment adjustment is needed, you need the VECTOR ALIGNED directive to take advantage of that for assumed-size arrays.
The compiler should attempt to align module arrays itself, so it can take advantage of their alignment automatically.
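
To make the directive concrete, here is a minimal sketch (the routine name is invented): three assumed-size dummies, which is the multiple-version case described above. The directive asserts that all three arrays are aligned, so a single loop version suffices; if that promise is broken, the program may fault at run time.

subroutine add3(n, a, b, c)
   implicit none
   integer :: n, i
   real :: a(*), b(*), c(*)   ! assumed-size: alignment unknown to the compiler
!DIR$ VECTOR ALIGNED
   do i = 1, n
      c(i) = a(i) + b(i)
   end do
end subroutine add3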

On the NetBurst CPUs, with their high sensitivity to write-combine buffering, pushing more than 7 arguments onto the stack for a subroutine call could incur a significant additional penalty from partial buffer flushes, much more than the time it takes to push the arguments when no stalls occur.
These possible performance problems with dummy arguments in subroutine calls are one of the major reasons compilers offer interprocedural optimization. When that is successful, it can eliminate these performance differences.

Libraries such as netlib BLAS (or proprietary versions such as Intel MKL) have no reasonable choice other than to use assumed-size arrays.
The decision between module and assumed-size arrays would normally be made on grounds of program design and maintainability.
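
For illustration, a simplified sketch of the netlib SAXPY interface just mentioned (the real routine also handles non-unit and negative increments):

subroutine saxpy(n, sa, sx, incx, sy, incy)
   implicit none
   integer :: n, incx, incy, i
   real :: sa, sx(*), sy(*)   ! assumed-size: the callee sees only a base address
   if (incx == 1 .and. incy == 1) then
      do i = 1, n
         sy(i) = sy(i) + sa*sx(i)
      end do
   end if
end subroutine saxpy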
rreis
New Contributor I
Thanks for the insight. I was wondering about the penalty of passing data through the subroutine interface versus keeping it in a module. I think I'll go with the module for big arrays, then.
Steven_L_Intel1
Employee
I'll comment that the compiler has to be more conservative about module variables if your procedure calls other procedures, as it must assume that module variables, like COMMON, may change across the call. With arguments, it can assume they won't change if the argument isn't directly passed on to the other procedure.
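
A small sketch of that point (module and routine names are invented): because SCALE lives in a module, the compiler must assume the external call can modify it.

module globals
   implicit none
   real :: scale = 2.0
end module globals

subroutine other_sub()      ! empty stub for the sketch; real code might USE globals
end subroutine other_sub

subroutine demo(x, n)
   use globals, only: scale
   implicit none
   integer :: n, i
   real :: x(n)
   do i = 1, n
      call other_sub()      ! may change scale as far as the compiler knows...
      x(i) = x(i)*scale     ! ...so scale must be reloaded from memory here
   end do
end subroutine demo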

I'm a big proponent of readable code and of eliminating unnecessary use of global variables. Globals can also hurt you if you add threading to your application. Adding to what Tim said, merely passing a variable as an argument isn't significant for performance, but the declaration of that argument can make a difference. The more compile-time information is known, the better the code. Deferred bounds and unknown array sizes can complicate code.
rreis
New Contributor I
OK. I see now that what I should have asked is: what are the different mechanisms subroutines use for accessing passed arguments versus variables in modules?

So,

subroutine foo(a, b, nx, ny, nz)

implicit none

integer :: nx, ny, nz
real, dimension(nx,ny,nz) :: a, b

is preferable to

subroutine foo(a, b)

implicit none

real, dimension(:,:,:) :: a, b


and the first version preferable to putting a and b in a module...?
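
For completeness, the third option, a and b living in a module, might look like this sketch (names invented; allocatable so the grid size is set at run time):

module fields
   implicit none
   real, allocatable :: a(:,:,:), b(:,:,:)
end module fields

subroutine work()
   use fields, only: a, b   ! nothing passed through the argument list
   implicit none
   a = 2.0*b                ! assumes a and b were allocated at startup
end subroutine work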
Steven_L_Intel1
Employee
You're still comparing apples and oranges. What would the declaration of A and B be in the module: constant dimensions or assumed-shape? If constant dimensions, then you could define the dimensions as PARAMETER constants in a module USEd by the subroutine, and not have to pass them or make A and B into adjustable arrays.

Do what makes sense for the application. You're micro-managing things here. If the application is simpler using assumed-shape arrays, then use them.
rreis
New Contributor I
Quoting - Steven_L_Intel1
You're still comparing apples and oranges. What would the declaration of A and B be in the module: constant dimensions or assumed-shape? If constant dimensions, then you could define the dimensions as PARAMETER constants in a module USEd by the subroutine, and not have to pass them or make A and B into adjustable arrays.

Do what makes sense for the application. You're micro-managing things here. If the application is simpler using assumed-shape arrays, then use them.

Sorry for not being clear: they would be allocatable arrays, so one can use different grid sizes without needing to recompile. But I see that for this case I should think more about code readability and modularity. This is a DNS code, originally written in F77, that I want to port to F95/2003. I have already parallelized it with MPI, but I feel restrained (fixed-size arrays, difficulty integrating with libraries (the FFTs are subroutines)...). I want to be more free, so I'm rewriting it, and of course I'm thinking about how to structure it better for a parallel environment and how to make it more readable for someone else trying to extend it or just add new numerical stuff.
clabra
New Contributor I
Maybe I'm wrong, but could it be better (or clearer) to pass the arrays A and B as arguments while the sizes are defined as variables in a module?
I ask because I'm doing the same kind of re-engineering with my code...

subroutine foo(A,B)
use sizes, only: nx, ny, nz
real, dimension(nx,ny,nz) :: A, B
.....

It can be more readable and avoids an excess of arguments.
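
For reference, the SIZES module implied by that snippet could be as simple as this sketch (only the names come from the post):

module sizes
   implicit none
   integer :: nx, ny, nz   ! set once at startup, e.g. after reading the grid file
end module sizes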

clabra
rreis
New Contributor I
Quoting - clabra
Maybe I'm wrong, but could it be better (or clearer) to pass the arrays A and B as arguments while the sizes are defined as variables in a module?
I ask because I'm doing the same kind of re-engineering with my code...

subroutine foo(A,B)
use sizes, only: nx, ny, nz
real, dimension(nx,ny,nz) :: A, B
.....

It can be more readable and avoids an excess of arguments.

clabra

I know that if interfaces are generated for the subroutines (by having them in modules), you don't need to explicitly declare their sizes. I mean, this would work:

subroutine foo(a, b)

real, dimension(:,:,:) :: a, b

I wonder, though, if there is any penalty from this (or if it even matters). Using this approach I can retrieve the dimensions (for loop limits, for instance) with

nx = size(a, 1)

I wonder what's the best approach for performance, readability, modularity...
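
A sketch of that assumed-shape style in full (module and routine names invented): putting the subroutine in a module provides the explicit interface, and SIZE supplies the loop bounds.

module ops
   implicit none
contains
   subroutine double_field(a, b)
      real, dimension(:,:,:) :: a, b
      integer :: i, j, k
      do k = 1, size(a, 3)
         do j = 1, size(a, 2)
            do i = 1, size(a, 1)
               b(i,j,k) = 2.0*a(i,j,k)
            end do
         end do
      end do
   end subroutine double_field
end module ops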

clabra
New Contributor I
From a performance point of view, I understand that the explicit size declaration is better.
When you pass an assumed-shape array, what is actually passed is the array descriptor, not the array itself. At least, that's what I understood after reading the compiler documentation.
Maybe Steve can clarify this topic.

clabra
Steven_L_Intel1
Employee
It's when the called routine declares the dummy argument as assumed-shape (explicit interface required) that a descriptor is passed. Otherwise, just the address of the array is passed. (Exception: If the passed array is not contiguous, the address of a copy may be passed).

On the receiving side, there may not be a lot of difference between an assumed-shape array (descriptor) and an adjustable array (bounds passed as arguments or in COMMON or module variables), as the bounds need to be determined at run time.
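
A sketch contrasting the two conventions (names invented): an explicit-shape dummy receives a bare address, so a strided section forces a copy, while an assumed-shape dummy receives a descriptor that records the stride.

module passing_demo
   implicit none
contains
   subroutine by_address(n, a)     ! explicit-shape: bare address received
      integer :: n
      real :: a(n)
      a = 0.0
   end subroutine by_address
   subroutine by_descriptor(a)     ! assumed-shape: descriptor received
      real :: a(:)
      a = 0.0
   end subroutine by_descriptor
end module passing_demo

program demo
   use passing_demo
   implicit none
   real :: x(100)
   x = 1.0
   call by_address(50, x(1:100:2))   ! non-contiguous actual: copy-in/copy-out
   call by_descriptor(x(1:100:2))    ! stride goes into the descriptor, no copy
end program demo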

Again, I think this effort is better spent making the program maintainable and correct. If you feel there is a performance problem, use a performance analyzer such as Intel VTune and look for "hot spots". If you try to apply intuition as to where performance may be suffering, you'll usually be wrong.