Intel® Fortran Compiler

Which solution is the most efficient?

OP1
I have a large code that uses derived types very intensively. Allocatable variables of such derived types are declared in modules, as below:
[fortran]MODULE MYMOD
IMPLICIT NONE
INTEGER,PARAMETER :: DP = 8
TYPE T_TEST
    REAL(KIND=DP),ALLOCATABLE :: U(:)
END TYPE T_TEST
TYPE(T_TEST),ALLOCATABLE :: TEST(:)
END MODULE MYMOD[/fortran]
When coding subroutines that access the content of TEST, I have two options. Here is the first one:
[fortran]SUBROUTINE OPTION_1_SUB(U)
USE MYMOD,ONLY: DP
IMPLICIT NONE
REAL(KIND=DP),ALLOCATABLE,INTENT(IN) :: U(:)
! Do some stuff with U here...
END SUBROUTINE[/fortran]
[fortran]PROGRAM OPTION_1_PROG
USE MYMOD
IMPLICIT NONE
INTEGER :: I
! Allocate TEST and do some stuff...
DO I=1,SIZE(TEST)
    CALL OPTION_1_SUB(TEST(I)%U)
END DO
END PROGRAM OPTION_1_PROG[/fortran]
Or, as a second option:
[fortran]SUBROUTINE OPTION_2_SUB(I)
USE MYMOD
IMPLICIT NONE
INTEGER,INTENT(IN) :: I
! Do some stuff with TEST(I)%U here.
END SUBROUTINE OPTION_2_SUB[/fortran]
[fortran]PROGRAM OPTION_2_PROG
USE MYMOD
IMPLICIT NONE
INTEGER :: I
! Allocate TEST and do some stuff...
DO I=1,SIZE(TEST)
    CALL OPTION_2_SUB(I)
END DO
END PROGRAM OPTION_2_PROG[/fortran]
In the actual code, TEST would have many array components, and the operations on those components would rely heavily on accessing individual elements by index (no operations on whole array sections).

It seems to me that passing all component arrays to the subroutine OPTION_1_SUB would be more efficient (less overhead) than manipulating them directly as in OPTION_2_SUB, especially since in the actual code TEST has a more complicated structure and OPTION_2_SUB would require accessing data locations such as A(i)%B(j)%C(k). With option 1, the base address of each component array (or actually, the base address of the array descriptor) would be passed to the subroutine.
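For reference, a minimal sketch of option 1 as a complete program. It assumes the subroutine never allocates or deallocates U, so the dummy is declared assumed-shape instead of ALLOCATABLE; that is a swapped-in variation, not the code from the actual project. Either kind of dummy needs an explicit interface, which is why the subroutine sits in a module here. OPTION_1_MOD, the array sizes, and the loop body are made up for illustration.

```fortran
MODULE MYMOD
IMPLICIT NONE
INTEGER,PARAMETER :: DP = 8
TYPE T_TEST
    REAL(KIND=DP),ALLOCATABLE :: U(:)
END TYPE T_TEST
TYPE(T_TEST),ALLOCATABLE :: TEST(:)
END MODULE MYMOD

! Hypothetical module: it exists only to give OPTION_1_SUB an explicit interface.
MODULE OPTION_1_MOD
USE MYMOD,ONLY: DP
IMPLICIT NONE
CONTAINS
SUBROUTINE OPTION_1_SUB(U)
! Assumed-shape dummy (assumption: U is never reallocated in here).
REAL(KIND=DP),INTENT(INOUT) :: U(:)
INTEGER :: K
! Element-by-element access, as described in the question.
DO K=1,SIZE(U)
    U(K) = 2.0_DP*U(K)
END DO
END SUBROUTINE OPTION_1_SUB
END MODULE OPTION_1_MOD

PROGRAM OPTION_1_SKETCH
USE MYMOD
USE OPTION_1_MOD
IMPLICIT NONE
INTEGER :: I
! Illustrative sizes only.
ALLOCATE(TEST(3))
DO I=1,SIZE(TEST)
    ALLOCATE(TEST(I)%U(4))
    TEST(I)%U = REAL(I,DP)
END DO
! The whole allocatable component is passed; inside the subroutine the
! derived-type nesting is no longer visible, only a rank-1 array.
DO I=1,SIZE(TEST)
    CALL OPTION_1_SUB(TEST(I)%U)
END DO
PRINT *, TEST(2)%U
END PROGRAM OPTION_1_SKETCH
```

An allocatable array is contiguous, so a compiler would not normally need a temporary copy to pass the whole component this way, but checking the optimization report or timing both options is the only way to be sure for a given compiler and set of options.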

Is my assumption correct?

Olivier
2 Replies
John_Campbell

Olivier,

There is a 3rd option, which is to use an old F77-style approach: one large real(kind=dp) vector plus an integer vector index_test_i(num_test) that gives the position in the real vector where the information for test “i” starts.
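A rough sketch of what this could look like, under stated assumptions: POOL, LEN_TEST, WORK and the sizes are hypothetical names and values chosen for illustration, and only index_test_i and num_test come from the description above. Each test's data is handed to the worker by passing the first element of its block together with the block length (sequence association with an explicit-shape dummy).

```fortran
PROGRAM FLAT_STORAGE_SKETCH
IMPLICIT NONE
INTEGER,PARAMETER :: DP = 8
INTEGER,PARAMETER :: NUM_TEST = 3
INTEGER :: LEN_TEST(NUM_TEST)        ! hypothetical: number of values per test
INTEGER :: INDEX_TEST_I(NUM_TEST)    ! start of each test's block in POOL
REAL(KIND=DP),ALLOCATABLE :: POOL(:) ! hypothetical: the one large real vector
INTEGER :: I

LEN_TEST = [4, 2, 5]

! Build the start positions: each block begins where the previous one ends.
INDEX_TEST_I(1) = 1
DO I=2,NUM_TEST
    INDEX_TEST_I(I) = INDEX_TEST_I(I-1) + LEN_TEST(I-1)
END DO

ALLOCATE(POOL(SUM(LEN_TEST)))
POOL = 1.0_DP

! Pass the first element of block I plus its length, F77 style.
DO I=1,NUM_TEST
    CALL WORK(POOL(INDEX_TEST_I(I)), LEN_TEST(I))
END DO
PRINT *, POOL

CONTAINS

SUBROUTINE WORK(U, N)
INTEGER,INTENT(IN) :: N
REAL(KIND=DP),INTENT(INOUT) :: U(N)  ! explicit-shape dummy, no descriptor needed
INTEGER :: K
DO K=1,N
    U(K) = U(K) + REAL(K,DP)
END DO
END SUBROUTINE WORK

END PROGRAM FLAT_STORAGE_SKETCH
```

The bookkeeping is manageable for one list per test; it grows quickly once each test needs several arrays of different types and lengths.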

Being an old programmer, I am suspicious of the overheads in using derived type structures for numerically intensive calculations, as I prefer derived types more for data tables. I’d certainly be interested to find out if this is not an issue.

I note that in your “Option_1_Sub” you are addressing the data as a real vector, although there is the management of its allocatable status, while Option_2_Sub carries all the overhead of managing multiple allocatable arrays and derived type structures. Again I am showing my suspicion that this overhead will cost efficiency.

Certainly the derived type makes it easier to support more complex data structures, while the F77 style approach would try to limit the problem definition to a few sequential lists for each test. Your reference to "A(i)%B(j)%C(k)" certainly implies you are not thinking of a simple list structure for your input into each test! I'd recommend you spend some time thinking about the best way to describe the problem definition.

If you are in the project development stage and don’t yet know the final data structure the tests will require, Option_2_Sub would be easier to manage as the approach is developed.

John

OP1
John,

Yes, this third option exists - but I don't want to think about the nightmare this would entail as far as indexing/bookkeeping is concerned, given the amount of data (of different types, lengths, nested derived types etc.) :)
A fourth option would be to have type-bound procedures that directly manipulate the derived-type data, but here I am concerned about the performance impact of the overhead. It seems to me that option 1 is a good compromise between the need for structured data and the need for efficiency, but I could be wrong of course (the alleged benefits of option 1 would be nullified if temporaries for the array components are created before being passed to the subroutine, for instance).
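To make the fourth option concrete, here is a minimal sketch with a type-bound procedure. MYMOD_TBP, SCALE_U, FACTOR and the sizes are hypothetical and only illustrate the pattern; this is not the structure of the actual code.

```fortran
MODULE MYMOD_TBP
IMPLICIT NONE
INTEGER,PARAMETER :: DP = 8
TYPE T_TEST
    REAL(KIND=DP),ALLOCATABLE :: U(:)
CONTAINS
    PROCEDURE :: SCALE_U          ! hypothetical type-bound procedure
END TYPE T_TEST
TYPE(T_TEST),ALLOCATABLE :: TEST(:)
CONTAINS
SUBROUTINE SCALE_U(THIS, FACTOR)
! The passed-object dummy must be declared CLASS for an extensible type.
CLASS(T_TEST),INTENT(INOUT) :: THIS
REAL(KIND=DP),INTENT(IN) :: FACTOR
INTEGER :: K
DO K=1,SIZE(THIS%U)
    THIS%U(K) = FACTOR*THIS%U(K)
END DO
END SUBROUTINE SCALE_U
END MODULE MYMOD_TBP

PROGRAM OPTION_4_SKETCH
USE MYMOD_TBP
IMPLICIT NONE
INTEGER :: I
! Illustrative sizes only.
ALLOCATE(TEST(2))
DO I=1,SIZE(TEST)
    ALLOCATE(TEST(I)%U(3))
    TEST(I)%U = 1.0_DP
    ! TEST is declared TYPE(T_TEST), not CLASS, so this call is not polymorphic.
    CALL TEST(I)%SCALE_U(2.0_DP)
END DO
PRINT *, TEST(1)%U
END PROGRAM OPTION_4_SKETCH
```

Because TEST is declared with TYPE rather than CLASS, the binding can be resolved at compile time. Whether any measurable overhead remains compared with option 1 is something I would have to time rather than assume.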
More experienced Fortran gurus can probably weigh in on this.

Olivier