Intel® Fortran Compiler

Performance of array operations for derived type structures

OP1

I have an architectural choice to make for my code and would appreciate any help from the experts on this forum.
I would like to know which of the following two scenarios is more efficient, or whether the two are roughly equivalent from a performance point of view.

1. First scenario:

a. Declare an allocatable array A in module M.
b. Manipulate the array (MKL FFT functions and other operations) in subroutine S. S accesses A through a USE statement for M (A is NOT passed as a dummy argument).

2. Second scenario:

a. Declare a derived type T, which has a component A (same allocatable array as above).
b. Manipulate the array T%A (same operations as above) in subroutine S. S accesses T (and therefore A) through a USE statement for M as well (T is NOT passed as a dummy argument).
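
For readers following along, the two scenarios might look roughly like the sketch below. All names here (m_plain, m_typed, t_type, s_plain, s_typed) are made up for illustration and are not from the original code, and a simple scaling stands in for the MKL FFT calls.

```fortran
! Scenario 1: the allocatable array lives directly in the module.
module m_plain
  implicit none
  real, allocatable :: a(:)
end module m_plain

! Scenario 2: the same array is a component of a derived-type
! variable that lives in the module.
module m_typed
  implicit none
  type :: t_type
    real, allocatable :: a(:)   ! allocatable component (Fortran 2003 feature)
  end type t_type
  type(t_type) :: t
end module m_typed

subroutine s_plain()
  use m_plain, only: a          ! access by USE association, no dummy argument
  implicit none
  a = 2.0 * a                   ! placeholder for the FFT and other operations
end subroutine s_plain

subroutine s_typed()
  use m_typed, only: t          ! access by USE association, no dummy argument
  implicit none
  t%a = 2.0 * t%a               ! same placeholder operation on the component
end subroutine s_typed

program demo
  use m_plain, only: a
  use m_typed, only: t
  implicit none
  allocate(a(4), t%a(4))
  a   = 1.0
  t%a = 1.0
  call s_plain()
  call s_typed()
  print *, a                    ! both arrays now hold the same values
  print *, t%a
end program demo
```

Timing the real operations in both forms on the actual compiler version is the only reliable way to settle the performance question; the sketch only pins down what is being compared.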

My gut feeling says that both should be equivalent - but maybe from the compiler (11.1.070) point of view there is a difference.

The second scenario is a lot more attractive, as it affords the option of giving each derived type (or object) its own private workspace.
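
One possible reading of "private workspace" is sketched below: each object carries its own scratch array next to its data. The names (workspace_mod, signal_t, work, init_signal, smooth) are hypothetical, and unlike the scenarios above the object is passed as a dummy argument here so that one routine can serve several objects.

```fortran
module workspace_mod
  implicit none
  private
  public :: signal_t, init_signal, smooth

  type :: signal_t
    real, allocatable :: a(:)      ! the data array
    real, allocatable :: work(:)   ! scratch space owned by this object only
  end type signal_t

contains

  subroutine init_signal(sig, n)
    type(signal_t), intent(out) :: sig
    integer, intent(in) :: n
    integer :: i
    allocate(sig%a(n), sig%work(n))
    sig%a    = (/ (real(i), i = 1, n) /)
    sig%work = 0.0
  end subroutine init_signal

  ! Three-point moving average that uses the object's own workspace,
  ! so no temporary is shared between different objects.
  subroutine smooth(sig)
    type(signal_t), intent(inout) :: sig
    integer :: i, n
    n = size(sig%a)
    sig%work = sig%a                       ! copy the end points unchanged
    do i = 2, n - 1
      sig%work(i) = (sig%a(i-1) + sig%a(i) + sig%a(i+1)) / 3.0
    end do
    sig%a = sig%work
  end subroutine smooth

end module workspace_mod

program demo_workspace
  use workspace_mod, only: signal_t, init_signal, smooth
  implicit none
  type(signal_t) :: x, y                   ! two objects, two independent workspaces
  call init_signal(x, 5)
  call init_signal(y, 8)
  call smooth(x)
  call smooth(y)
  print *, size(x%work), size(y%work)
end program demo_workspace
```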

Any thoughts on this?

Thanks,
Olivier

Michael_R_Intel4
Employee
Without specific code, it is difficult to measure. However, from a general standpoint, I would say that the second scenario is preferable.

First, you are defining a definite interface to the manipulation of the array, through calling the subroutine. Placing A in the derived type and within the subroutine (unless I'm misunderstanding you) makes it effectively a local static, so the optimizer should be better able to determine what can modify A, and where. This should result in better code, since the optimizer can "see" the changes to A. If A is at module level, it can potentially be changed from a large number of places, forcing the optimizer to be more conservative.

You don't say whether your application is threaded, and whether there might be simultaneous accesses to the array from different threads. But I would favor the second approach, in case you ever do thread the application, and just from a maintainability standpoint.
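
To illustrate the threading point, here is a minimal sketch, assuming OpenMP is the threading model (the thread does not say so). Each call to the hypothetical routine process creates its own local object of a hypothetical type work_t, so concurrent calls never touch a shared array. Without OpenMP enabled, the directives are treated as comments and the program runs serially.

```fortran
module per_thread_mod
  implicit none

  type :: work_t
    real, allocatable :: a(:)
  end type work_t

contains

  subroutine process(n, seed, total)
    integer, intent(in)  :: n
    real,    intent(in)  :: seed
    real,    intent(out) :: total
    type(work_t) :: w            ! local object: every call gets its own copy
    allocate(w%a(n))
    w%a   = seed
    total = sum(w%a)
    ! w%a is deallocated automatically when the routine returns
  end subroutine process

end module per_thread_mod

program demo_threads
  use per_thread_mod, only: process
  implicit none
  integer :: i
  real    :: totals(4)

  ! Each iteration writes a distinct element of totals, and the array
  ! being manipulated is local to each call, so there is no shared
  ! mutable state between threads.
  !$omp parallel do
  do i = 1, 4
    call process(8, real(i), totals(i))
  end do
  !$omp end parallel do

  print *, totals
end program demo_threads
```

A single module-level array, by contrast, would be shared by all threads and would need explicit synchronization or a threadprivate declaration.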