Intel® Fortran Compiler

Overhead using allocate..deallocate

eos_pengwern
Beginner
1,139 Views

Hello,

This is a fairly general question about code efficiency. In my application, there is a loop that reads in some vectors and matrices, works on them, and then moves on to the next iteration. For each iteration of the loop, the vector and matrix sizes are different, but they vary from tens of elements to, at most, a few hundred elements.

If I want to write elegant code which is easy to follow, then the best way to handle this is to find the number of elements at the beginning of the loop, then allocate some local arrays and matrices, then do all the manipulation on these copies, and finally deallocate them at the end of the loop ready to be re-allocated in the next iteration.
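The allocate-per-iteration pattern described above might look roughly like the following. This is only a sketch: the array names `v` and `a`, the `sizes` list, and the arithmetic inside the loop are hypothetical placeholders, not taken from the actual application.

```fortran
program alloc_per_iteration
  implicit none
  ! Hypothetical per-iteration problem sizes (assumption for illustration).
  integer, parameter :: sizes(3) = [10, 250, 40]
  real, allocatable :: v(:), a(:,:)
  integer :: iter, n

  do iter = 1, size(sizes)
     n = sizes(iter)             ! size is only known at the top of the iteration
     allocate(v(n), a(n, n))     ! work arrays sized exactly for this pass
     v = 1.0                     ! placeholder for reading in the real data
     a = 2.0
     ! Whole-array expressions: no subscript ranges needed.
     print *, n, sum(matmul(a, v))
     deallocate(v, a)            ! release before the next iteration
  end do
end program alloc_per_iteration
```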

However, it occurs to me that "allocate" and "deallocate" probably involve the operating system in some way and therefore, no doubt, involve a fair amount of overhead. I have an alternative strategy, which is just to dimension my local arrays as big as they're ever likely to need to be (say, 500 elements in each dimension) and then work with subranges of them, using the colon notation to pick out the elements of interest. The problem with this is that the resulting code is clogged up with array index notation, and looks very ugly and hard to follow.
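For comparison, the fixed-size alternative could be sketched as below. The bound of 500 comes from the description above; everything else (names, sizes, the arithmetic) is a hypothetical placeholder. Note how every reference has to carry its `1:n` range.

```fortran
program fixed_work_arrays
  implicit none
  integer, parameter :: nmax = 500                 ! upper bound on any dimension
  integer, parameter :: sizes(3) = [10, 250, 40]   ! hypothetical sizes
  real :: v(nmax), a(nmax, nmax)                   ! dimensioned once, reused every pass
  integer :: iter, n

  do iter = 1, size(sizes)
     n = sizes(iter)
     v(1:n) = 1.0                ! placeholder for reading in the real data
     a(1:n, 1:n) = 2.0
     ! Each use must repeat the subrange explicitly.
     print *, n, sum(matmul(a(1:n, 1:n), v(1:n)))
  end do
end program fixed_work_arrays
```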

I could of course just code my application both ways, then run it and see, but it's a fairly big project as it is and I don't really want to double the work! Is there some rule-of-thumb already known that could help me to decide which way to go?

Thanks,

Stephen.

4 Replies
Steven_L_Intel1
Employee
The ALLOCATE and DEALLOCATE statements involve library calls, but rarely any call into the OS. My first suggestion would be to test the performance both ways and see if the overhead of the allocate/deallocate is noticeable. My guess is that it is not.
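A quick way to try that comparison without recoding the whole project is a small stand-alone timing loop along these lines. It is a rough sketch only: the iteration count, the size formula, and the trivial fill standing in for real work are all assumptions, and the numbers it prints will depend on the compiler, options, and machine.

```fortran
program time_allocate
  use, intrinsic :: iso_fortran_env, only: int64
  implicit none
  integer, parameter :: nmax = 500, niter = 20000   ! hypothetical test parameters
  real, allocatable :: a(:,:)
  real :: fixed(nmax, nmax)
  real :: acc
  integer :: i, n
  integer(int64) :: t0, t1, t2, rate

  acc = 0.0
  call system_clock(t0, rate)

  ! Variant 1: allocate and deallocate on every iteration.
  do i = 1, niter
     n = 10 + mod(i, 300)        ! sizes from tens to a few hundred
     allocate(a(n, n))
     a = real(i)                 ! stand-in for the real work
     acc = acc + a(n, n)
     deallocate(a)
  end do
  call system_clock(t1)

  ! Variant 2: one fixed work array, addressed by subrange.
  do i = 1, niter
     n = 10 + mod(i, 300)
     fixed(1:n, 1:n) = real(i)
     acc = acc + fixed(n, n)
  end do
  call system_clock(t2)

  print *, 'allocate/deallocate (s):', real(t1 - t0) / real(rate)
  print *, 'fixed work array (s):   ', real(t2 - t1) / real(rate)
  print *, 'checksum (so the loops are not optimized away):', acc
end program time_allocate
```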

An alternative is to use a pointer to an array, and do a single pointer assignment at the beginning of your loop to the array section of your work array. Now the pointer array has a size and you don't have to add indexing.
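A minimal sketch of that pointer approach, assuming hypothetical work arrays `v_work` and `a_work` and made-up sizes. The work arrays need the TARGET attribute so that pointers can be associated with sections of them.

```fortran
program pointer_to_section
  implicit none
  integer, parameter :: nmax = 500
  integer, parameter :: sizes(3) = [10, 250, 40]      ! hypothetical sizes
  real, target  :: v_work(nmax), a_work(nmax, nmax)   ! TARGET allows pointer association
  real, pointer :: v(:), a(:,:)
  integer :: iter, n

  do iter = 1, size(sizes)
     n = sizes(iter)
     ! One pointer assignment per array at the top of the loop;
     ! no memory is allocated or copied here.
     v => v_work(1:n)
     a => a_work(1:n, 1:n)

     ! From here on, v and a behave like arrays of exactly the right size.
     v = 1.0
     a = 2.0
     print *, size(v), sum(matmul(a, v))
  end do
end program pointer_to_section
```

The loop body reads the same as in an allocate-per-iteration version, while the storage is dimensioned only once.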
eos_pengwern
Beginner

The pointer approach sounds attractive, since it would be elegant and easy-to-follow from a coding point of view whilst avoiding needless copying of arrays and manipulation of memory. I think I'll make this my default choice, and do some performance comparisons with the other approaches if I have the opportunity.

Thanks very much for the suggestion.

jimdempseyatthecove
Honored Contributor III

Steve,

Isn't the pointer to an array (slice) functionally equivalent to an assumed-shape dummy argument? That is, the compiler will not know what the base, limit and stride are. As a consequence, the optimizer cannot effectively produce optimal code (e.g. vectorization).

On this subject, why is there not an attribute for dummy array arguments and pointers to arrays that restrict the stride to 1 (or a fixed number)? Typically when stride is known to be 1, vectorization improves.

Jim Dempsey

Steven_L_Intel1
Employee
Correct - a pointer to a slice is equivalent to an assumed-shape array.

You are also correct that knowing the stride is 1 would help vectorization. Allocatable arrays do have this advantage.
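To illustrate the stride point: the Fortran 2008 standard defines a CONTIGUOUS attribute for assumed-shape dummy arguments and array pointers, along with an IS_CONTIGUOUS intrinsic. The sketch below assumes a compiler that supports both; the module and routine names are hypothetical, and whether the attribute actually improves vectorization depends on the compiler and options, so treat it as something to measure rather than a guarantee.

```fortran
module stride_demo
  implicit none
contains
  ! Plain assumed-shape dummy: the actual argument may be a strided
  ! section, so unit stride cannot be assumed inside the routine.
  subroutine scale_any(x, s)
    real, intent(inout) :: x(:)
    real, intent(in)    :: s
    x = s * x
  end subroutine scale_any

  ! CONTIGUOUS dummy: the routine may assume x occupies contiguous storage.
  subroutine scale_contig(x, s)
    real, contiguous, intent(inout) :: x(:)
    real, intent(in)                :: s
    x = s * x
  end subroutine scale_contig
end module stride_demo

program contiguous_sketch
  use stride_demo
  implicit none
  real, target              :: work(500)
  real, pointer, contiguous :: p(:)     ! pointer restricted to contiguous targets
  real, allocatable         :: v(:)     ! allocatable arrays are always contiguous
  integer :: n

  n = 100
  work = 1.0
  p => work(1:n)                        ! unit-stride section: a valid target for p
  call scale_contig(p, 2.0)
  call scale_any(work(1:n:2), 3.0)      ! stride-2 section, fine for scale_any

  allocate(v(n))
  v = 1.0
  call scale_contig(v, 2.0)

  ! Query contiguity of each object at run time.
  print *, is_contiguous(p), is_contiguous(v), is_contiguous(work(1:n:2))
end program contiguous_sketch
```

Only a 1-D section is used for the contiguous pointer here because a 2-D section such as `a_work(1:n, 1:n)` of a larger array is generally not contiguous.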