Performance penalty - allocatable arrays within derived data types

nitya · 09-17-2012

Hi,

I searched this forum but could not find a relevant answer, so I am posting a query.

I have a derived data type that has a number of allocatable arrays within it. My question is: under what circumstances is it better to use an array of derived types rather than allocatable arrays within a derived type? For example:

TYPE Var1
  REAL*8, ALLOCATABLE, DIMENSION(:) :: A
  REAL*8, ALLOCATABLE, DIMENSION(:) :: B
END TYPE Var1

TYPE(Var1) :: V1   ! an instance of Var1; A and B are allocated elsewhere

CALL Calculateresult()

SUBROUTINE Calculateresult()
  j = getjValue()              ! the value of j is not sequential
  result = V1%A(j) + V1%B(j)
END SUBROUTINE Calculateresult

-------------------

OR

-------------------

TYPE Var2
  REAL*8 :: A
  REAL*8 :: B
END TYPE Var2

TYPE(Var2), ALLOCATABLE, DIMENSION(:) :: VarArray

j = getjValue()                ! the value of j is not sequential
CALL Calculateresult(VarArray(j))

SUBROUTINE Calculateresult(Var)
  TYPE(Var2), INTENT(IN) :: Var
  result = Var%A + Var%B
END SUBROUTINE Calculateresult

When would it be more efficient to use VarArray and Var2 instead of Var1? Would the answer be any different if Var1 and Var2 were themselves components of another derived type?

If someone could point me to some literature that explains how nested derived data types are stored in memory, and what is an efficient way of using them, it would be great.

Thanks

Nitya

Steven_L_Intel1 · 09-19-2012

This is dependent on how your application uses the data, so there is no universal answer. I can tell you that a derived type component that is allocatable consists of a descriptor for the object, the size of which depends on the number of dimensions and whether or not it is polymorphic. Nested derived types simply contain the storage for the derived types in the parent. In the example you show, the second example would be more efficient as there is just one fetch through a descriptor rather than two. If you know the two values, A and B, for a given array index, are going to be used together, then an array of derived type is better than a derived type of arrays.
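To make the comparison concrete, here is a minimal, self-contained sketch of the two layouts side by side. The names soa_t, pair_t, v1 and v2 are hypothetical (they stand in for Var1, Var2 and VarArray from the question), and the fixed index j = 5 stands in for a non-sequential lookup. The comments about descriptors restate the explanation above; they are not something the program measures.

```fortran
program layout_sketch
  use, intrinsic :: iso_fortran_env, only: real64
  implicit none

  ! Layout 1 (like Var1): one derived type holding two allocatable arrays.
  type :: soa_t
    real(real64), allocatable :: a(:)
    real(real64), allocatable :: b(:)
  end type soa_t

  ! Layout 2 (like Var2): each element holds one A and one B side by side.
  type :: pair_t
    real(real64) :: a
    real(real64) :: b
  end type pair_t

  integer, parameter :: n = 8
  type(soa_t) :: v1
  type(pair_t), allocatable :: v2(:)
  integer :: i, j
  real(real64) :: r1, r2

  allocate(v1%a(n), v1%b(n), v2(n))
  do i = 1, n
    v1%a(i) = real(i, real64)
    v1%b(i) = 10.0_real64 * real(i, real64)
    v2(i) = pair_t(v1%a(i), v1%b(i))
  end do

  j = 5                      ! stand-in for a non-sequential index

  r1 = v1%a(j) + v1%b(j)     ! goes through two descriptors (a and b)
  r2 = v2(j)%a + v2(j)%b     ! goes through one descriptor (v2)

  if (r1 /= r2) error stop "layouts disagree"
  print *, r1, r2
end program layout_sketch
```

Both expressions compute the same value; the difference is only in how the data is reached in memory.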

nitya · 09-27-2012

Thanks for your help.

jimdempseyatthecove · 09-27-2012

Also, in the second example, A and B are likely to reside in the same cache line, meaning one memory read loads both variables into L1 cache. Your actual use may change this. Knowing this, however, if you have a large type (larger than one cache line), it may be beneficial to order the variables in the type so that those used together sit close together, improving the probability of cache line hits. This may require non-alphabetically ordered names.

When you have a program that runs sequentially through the indices (DO I=1,N), organizing the data as in the first example may yield better opportunities for vectorization (and faster code). Choose the technique that meets your requirements.

Jim Dempsey
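The sequential case can be sketched as follows. This is only an illustration under stated assumptions: arrays_t, pair_t, v1, v2, res1 and res2 are hypothetical names, and whether a compiler actually vectorizes either loop depends on the compiler and its options. The program also prints the size of one pair_t element, which can be compared against the cache line size of the target machine (64 bytes is common, but that is hardware-dependent).

```fortran
program traversal_sketch
  use, intrinsic :: iso_fortran_env, only: real64
  implicit none

  type :: arrays_t            ! first layout: a type of arrays
    real(real64), allocatable :: a(:), b(:)
  end type arrays_t

  type :: pair_t              ! second layout: an array of types
    real(real64) :: a, b
  end type pair_t

  integer, parameter :: n = 1000
  type(arrays_t) :: v1
  type(pair_t), allocatable :: v2(:)
  real(real64), allocatable :: res1(:), res2(:)
  integer :: i

  allocate(v1%a(n), v1%b(n), v2(n), res1(n), res2(n))
  do i = 1, n
    v1%a(i) = real(i, real64)
    v1%b(i) = 2.0_real64 * real(i, real64)
    v2(i) = pair_t(v1%a(i), v1%b(i))
  end do

  ! First layout: a(1:n) and b(1:n) are each contiguous, so this loop
  ! reads memory with unit stride.
  do i = 1, n
    res1(i) = v1%a(i) + v1%b(i)
  end do

  ! Second layout: consecutive a values are separated by the b values,
  ! so the reads of a (and of b) are strided.
  do i = 1, n
    res2(i) = v2(i)%a + v2(i)%b
  end do

  ! storage_size returns the element size in bits; divide by 8 for bytes.
  print *, 'bytes per pair_t element:', storage_size(v2) / 8
  print *, 'results agree:', all(res1 == res2)
end program traversal_sketch
```

Timing the two loops on real data sizes is the only reliable way to decide between the layouts for a given application.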