I have a Fortran question about multi-dimensional allocatable arrays.
I'd like to declare an array of dynamic length, where each entry is itself an array of length 2. It should look like
REAL, Allocatable:: XY(2,:)
Unfortunately, if I do so, I get an error:
error #6644: The dimension specifications are incompatible. [XY]
I could solve this by declaring both dimensions dynamic:
REAL, Allocatable:: XY(:,:)
But if I do so, I expect the compiler to generate less efficient code, since it knows less about the structure of my array at compile time.
In particular, the expression "XY(1,i)" should generate code like "multiply i by 2*sizeof(REAL)", which is an efficient power-of-2 multiplication.
Fortran doesn't have the feature you want. A deferred-shape array must have all of its bounds deferred; you can't pick and choose. Why not make it a single-dimensioned array of a two-component derived type?
Thanks for the clarification.
> Why not make it a single-dimensioned array of a two-component derived type?
I have to refactor some code, and this refactoring seemed easier with the other approach. I will use an array of a two-component type.
>> I will use an array of a two-component type.
This may require extensive modification of your program:
XY(i,j) becomes XY(j)%v(i)
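To make that rewrite concrete, here is a minimal sketch of the derived-type variant. The type name pair_t and the test values are made up for illustration; only the component name v comes from the line above.

```fortran
program coord_v_demo
  implicit none

  ! Hypothetical type name; v holds what used to be XY(1,j) and XY(2,j).
  type :: pair_t
    real :: v(2)
  end type pair_t

  type(pair_t), allocatable :: XY(:)
  integer :: j, n

  n = 4                 ! illustrative size only
  allocate(XY(n))

  do j = 1, n
    XY(j)%v(1) = real(j)          ! was XY(1,j)
    XY(j)%v(2) = 10.0 * real(j)   ! was XY(2,j)
  end do

  print *, XY(3)%v      ! both values of entry 3
end program coord_v_demo
```

Note that the index order flips: the fast (first) index of the 2-D array becomes the index inside the component.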
I suggest you allocate XY(:,:), but then code your program as
allocate(XY(2,N))
...
call foo(XY, N) ! array XY, extent
...
subroutine foo(QQ, extentQQ)
  integer :: extentQQ
  real :: QQ(2, extentQQ)
  ..
  do i=1, extentQQ
    QQ(1,i) = ...
    QQ(2,i) = ...
    ...
In this manner you have all the benefits of any potential compiler hints you can offer.
And more importantly, fewer changes to your code.
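For completeness, a self-contained sketch of the explicit-shape dummy idea described above. The module name foo_mod, the intent attributes, and the values stored are my additions for illustration; the rest follows the outline in the post.

```fortran
module foo_mod
  implicit none
contains
  subroutine foo(QQ, extentQQ)
    integer, intent(in)    :: extentQQ
    ! Explicit-shape dummy: the first extent (2) is visible at compile time.
    real,    intent(inout) :: QQ(2, extentQQ)
    integer :: i
    do i = 1, extentQQ
      QQ(1,i) = real(i)     ! placeholder values
      QQ(2,i) = -real(i)
    end do
  end subroutine foo
end module foo_mod

program explicit_shape_demo
  use foo_mod
  implicit none
  real, allocatable :: XY(:,:)
  integer :: n

  n = 5                     ! illustrative size only
  allocate(XY(2,n))         ! both bounds deferred in the declaration
  call foo(XY, n)           ! array XY, extent
  print *, XY(:,n)
end program explicit_shape_demo
```

The caller keeps the fully deferred-shape declaration, so existing XY(i,j) references do not need to change.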
> This may require extensive modification of your program
You are right. But I wrote some Python code to do this rewriting, so I'm fine with it.
Does anybody have an opinion, whether my expectation "'2. Option Type' generates faster code since the compiler knows about the size of the type" is true?
1. Option Array:
REAL, Allocatable:: XY(:,:)
allocate(XY(2,N))
XY(1,I) = 123
2. Option Type:
type Coord
  REAL X
  REAL Y
end type Coord
type(Coord), Allocatable:: XY(:)
XY(I)%x = 123
The more the compiler can see at compile time, the better, though the two cases you show should generate very similar code since the offset of memory location is known either way. The first case does require the multiplication for the index to look in the descriptor for the extent size, but I doubt that is measurable.
I assume option 2, line 5 has a typo: (:,:) should be (:).
Option 1) Steve is correct about the multiplication. I might add, though:
When the compilation unit cannot see that the first extent is 2, the generated code has to fetch the size of the first dimension from the array descriptor (as Steve says). In addition, if the register pressure of your code is low, the extent will likely remain in a register, in which case the performance difference would be negligible.
Option 2) Because X and Y in type Coord are both REAL (4 bytes each, assuming the usual default REAL), sizeof(Coord) is 8. The x86 instruction encoding has an addressing byte called SIB (Scale-Index-Base). Its scale field can multiply the index by 1, 2, 4, or 8. Because sizeof(Coord) is 8, the compiler can fold the multiplication into the address calculation and eliminate the separate multiply instruction. This can be significant if the array is in L1 cache, less so if it is in L2, less still in L3, and possibly negligible when the data is in RAM. Using SIB for the multiplication can also free up a general-purpose register, which may help performance too.
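If you want to check the element size on your own compiler rather than assume it, something like the following should do. This is only a sketch: it uses the Fortran 2008 STORAGE_SIZE intrinsic, which reports bits, and the 32/64 figures in the comments are what I would expect with a 4-byte default REAL, not a guarantee.

```fortran
program coord_size_demo
  implicit none

  type :: Coord
    real :: X
    real :: Y
  end type Coord

  type(Coord) :: c

  ! STORAGE_SIZE returns the size in bits of an array element of this type.
  print *, 'bits per REAL :', storage_size(1.0)   ! typically 32
  print *, 'bits per Coord:', storage_size(c)     ! typically 64, i.e. 8 bytes
end program coord_size_demo
```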
Note, using the DUMMY argument method, where the first dimension is explicitly stated as 2, will experience the benefit of the SIB multiplication (when Coord contains two REAL*4's)
Many thanks to Steve and Jim. Great support!
In my real code the allocation and the usage of the array are in different functions - so I doubt the compiler can guess the size of the inner array.
I'll stick with the two-component type ...
Benedikt R. wrote:
.. Does anybody have an opinion, whether my expectation "'2. Option Type' generates faster code since the compiler knows about the size of the type" is true? ..
Can you please provide some background on the computations you will perform with either Option Array or Option Type? For some computational needs, the generated code may show little difference in speed between the two options. In that case, why not use the option that makes the code more readable and maintainable for the coder(s) working on it, both now and in the future?
>>Can you please provide some background/description of the computations you will perform with either Option Array/Type?
It is always necessary to look at the larger picture as opposed to only the hottest innermost loop. Depending on the larger picture, it may (or may not) be beneficial to separate X and Y into different arrays: X(i), Y(i) as opposed to XY(i)%X, XY(i)%Y.
While the user-defined type XY reduces the number of arguments on a CALL/function, that saving is trivial compared to the time spent using the data from the array/arrays.
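As a rough illustration of the two layouts (the program name, array size, and the sum over X are all invented for this sketch): with the derived type, the X values sit 8 bytes apart in memory, while with separate arrays they are contiguous. Which one wins depends on how the rest of the program uses the data.

```fortran
program layout_demo
  implicit none

  type :: Coord
    real :: X, Y
  end type Coord

  type(Coord), allocatable :: XY(:)    ! layout: x1 y1 x2 y2 ...
  real,        allocatable :: X(:), Y(:)  ! layout: x1 x2 ... and y1 y2 ...
  integer :: i, n
  real    :: sum_type, sum_split

  n = 1000                             ! illustrative size only
  allocate(XY(n), X(n), Y(n))

  do i = 1, n
    XY(i)%X = real(i)
    XY(i)%Y = 2.0 * real(i)
    X(i)    = real(i)
    Y(i)    = 2.0 * real(i)
  end do

  sum_type  = sum(XY%X)   ! strided access: skips over every Y
  sum_split = sum(X)      ! contiguous access
  print *, sum_type, sum_split
end program layout_demo
```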
I would be remiss if I didn't add my standard observation that guessing at micro-optimizations is a waste of effort. Write the code in a way that makes the most sense and is the most maintainable, and let the compiler worry about optimization. Run the program through a performance analyzer such as Intel VTune and see if there are problem spots. In most cases, your energy is best spent elsewhere in the program.
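If a full profiler is not at hand, a crude first check is to time the candidate loop with SYSTEM_CLOCK. This is not a substitute for VTune, just a sketch; the array size and loop body are made up, and the checksum is printed so the compiler cannot discard the work.

```fortran
program timing_sketch
  use, intrinsic :: iso_fortran_env, only: int64
  implicit none

  integer, parameter :: n = 1000000    ! illustrative size only
  real, allocatable  :: XY(:,:)
  integer(int64)     :: t0, t1, rate
  integer            :: i
  real               :: total

  allocate(XY(2,n))

  call system_clock(t0, rate)          ! start tick and ticks per second
  do i = 1, n
    XY(1,i) = real(i)
    XY(2,i) = 0.5 * real(i)
  end do
  total = sum(XY)
  call system_clock(t1)                ! end tick

  print *, 'checksum:', total
  print *, 'seconds :', real(t1 - t0) / real(rate)
end program timing_sketch
```

Run it several times and with the optimization flags you actually ship with; single runs of a loop this small are noisy.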