Turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

- Intel Community
- Software
- Software Development Tools (Compilers, Debuggers, Profilers & Analyzers)
- Intel® Fortran Compiler
- Speed up calculations of derived typs with declared module procedures

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page

MariusR

Novice

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

07-09-2020
08:53 AM

655 Views

Speed up calculations of derived typs with declared module procedures

Hello everybody!

my name is Marius and I am an absolutely newcomer to programming with Fortran and all the stuff accompanying this doing. Due to this, most of the terms are not familiar to me and maybe I have to ask twice to get it. Most of the things I learned by trial and error and by reading some text books written explicit for beginners.

At the moment I am dealing with the implementation of dual numbers / hyper numbers to do automatic differentiation. For this purpose, I have made a module containing:

1. the derived type for the dual numbers, for instance:

TYPE,PUBLIC:: DUAL_NUMREAL(8)::x_rlREAL(8)::x_h1ENDTYPEDUAL_NUM

2. the necessary procedures for extending the basic arithmetic’s, for instance addition of two dual numbers:

PUBLICOPERATOR(+)INTERFACEOPERATOR(+)MODULEPROCEDUREADD_DNENDINTERFACE

CONTAINSELEMENTALFUNCTIONADD_DN(u,v)RESULT(res)TYPE(DUAL_NUM),INTENT(IN)::u,vTYPE(DUAL_NUM)::res res%x_rl = u%x_rl+v%x_rl res%x_h1 = u%x_di+v%x_h1ENDFUNCTIONADD_DN

I am able to use this module in a main file and the automatic differentiation does great. To get the second derivative, I used numeric differentiation by means of a central difference quotient.

To avoid this, I want to use hyper-dual numbers to get the second partial derivative in the very same manner with analytic precision. Instead of declaring 2 parts (one real and one "dual" part), these hyper-dual numbers requires four parts. The derived type and the procedure for addition reads:

TYPE,PUBLIC:: HYPER_NUMREAL(8)::x_rlREAL(8)::x_h1REAL(8)::x_h2REAL(8)::x_h3

ENDTYPEHYPER_NUMPUBLICOPERATOR(+)INTERFACEOPERATOR(+)

MODULEPROCEDUREADD_DNENDINTERFACE

CONTAINSELEMENTALFUNCTIONADD_DN(u,v)RESULT(res)TYPE(DUAL_NUM),INTENT(IN)::u,vTYPE(DUAL_NUM)::res

res%x_rl = u%x_rl+v%x_rl res%x_h1 = u%x_h1+v%x_h1

res%x_h2 = u%x_h2+v%x_h2

res%x_h3 = u%x_h3+v%x_h3ENDFUNCTIONADD_DN

Although this also does great, I am encountered with some serious loss of performance by comparing to the use of dual numbers and numeric differentiation (computation time nearly doubles). I am aware that the use of hyper-dual numbers requires a multiple number of floating-point operations (minimum twice the amount compared to dual numbers in the case of an addition), but to do numeric differentiation by means of a central difference quotient, I need to evaluate the function twice. Due to this, the total amount of operations is not so different.

I think the reason for this behavior is that the "standard" operations using generic Fortran functions (addition and multiplication operations) are much faster compared to the module procedures I explained.

One attempt do get around this problem was do place the operations within the lexical extent of an openMP directive-pair (!$OMP parallel sections, !$OMP section, .... , !$OMP end parallel sections), but this does the opposite and actually slows down the whole algorithm. I have to say that I know even less about openMP than I do about Fortran :), but I think all the operations with dual numbers/ hyper-dual numbers can be parallelized.

I would be very thankful if someone could give me some hints or advices to optimize the implementation of dual numbers/ hyper-dual numbers.

greetings

Marius

1 Solution

jimdempseyatthecove

Black Belt

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

07-11-2020
07:07 AM

560 Views

In your sample code, you have explicitly declared the operator functions as ELEMENTAL. IOW, your intention is to use these functions in an elemental manner (on arrays of your type).

I suggest instead you declare a TYPE that is a collection of these objects, where each property of each object resides in an array. IOW the collection contains an array of x_rl, x_di, ...

Then your operator functions operate on the collections.

If you attribute the collection (TYPE) with TARGET, then, if you desire, you may be able to create a getter/putter that presents a scalar representation of your current type (at a given index into the collection).

Something along the line of this (untested) code:

```
TYPE DUAL_NUM_COLLECTION
REAL(8), ALLOCATABLE :: x_rl(:), x_di(:)
CONTAINS
FUNCTION ALLOC(N) RESULT(R)
INTEGER :: N, R
ALLOCATE( x_rl(:), x_di(:), STAT=R)
END FUNCTION ALLOC
INTERFACE OPERATOR(*)
FUNCTION DUAL_NUM_COLLECTION_MUL(A, B) RESULT(R)
TYPE(DUAL_NUM_COLLECTION), INTENT(IN) :: A, B
TYPE(DUAL_NUM_COLLECTION) :: R
END FUNCTION DUAL_NUM_COLLECTION_MUL
END INTERFACE
...
FUNCTION DUAL_NUM_COLLECTION_MUL(A, B) RESULT(R)
TYPE(DUAL_NUM_COLLECTION), INTENT(IN) :: A, B
TYPE(DUAL_NUM_COLLECTION) :: R
R%x_rl = A%x_rl * B%x_rl
R%x_di = A%x_di * B%x_rl + A%x_rl * B%x_di
END FUNCTION DUAL_NUM_COLLECTION_MUL
```

You would have to add DUAL_NUM_COLLECTION member functions GET and PUT that fetch and store an individual type of DUAL_NUM (think of this as gather and scatter).

Jim Dempsey

Link Copied

11 Replies

FortranFan

Honored Contributor I

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

07-09-2020
11:48 AM

636 Views

A few suggestions:

- Instead of words such as "I am encountered with some serious loss of performance by comparing to the use of dual numbers and numeric differentiation (computation time nearly doubles)," you will likely receive more valuable and actionable feedback if you can provide some code illustrations of what exactly you're doing with your numerical differentiation computations and comparisons to determine the loss of performance. Even better will be if you can provide a working example, a small reproducer that hints at the issue even if it's nowhere as exact as your actual case which may have constraints in terms of what you can share in the public domain.
- If you're working with arrays of your data, if possible consider a design which is "Structure of Arrays" (SoA). Based on your original post, you may be working with Arrays of Structures (AoS). See
**this**. - Your interest appears more generic and pertaining to Fortran and as such, you may want to inquire also at the new Fortran discourse site (https://fortran-lang.discourse.group/). Also to consider will be comp.lang.fortran usenet newsgroup.

MariusR

Novice

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

07-09-2020
12:14 PM

631 Views

Thank you for your reply.

I will work through your points. Roughly speaking, it's about to implement a thermodynamic model (PC-SAFT and its modifications) as an excel add -in. To get the desired results, this model must be derived with respect to different variables, which is an elaborate procedure, in particular with regard to obtain double derivations. I hope that I can give an illustrative and representative example in the next days

greetings

Marius

FortranFan

Honored Contributor I

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

07-09-2020
04:19 PM

610 Views

@MariusR ,

For your discussions in the public domain, you can instead consider working with a simpler formulation, say the original van der Waals equation-of-state that can be worked out trivially and analytically for the typical thermodynamic properties of interest, and you can put together your Fortran code with such a model.

You can start with the Helmholtz' free energy relationship in terms of the natural variables of the equation-of-state i.e., temperature, molar volume/density, and mole numbers. All the properties of interest can then be determined by differentiation, either analytically or numerically. Such a prototype can give you insight into code design with other equation-of-state models.

jimdempseyatthecove

Black Belt

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

07-10-2020
07:14 AM

590 Views

Too little information is provided to offer any of us any hope of making suggestions.

The example you show, only show operator (+).

Do you have (*), (/), (.SomeOtherOperator.)?

For simple (+)

```
ENUM, BIND(C)
ENUMERATOR :: x_r1=1,x_h1,x_h2,x_h3,x_next
END ENUM
TYPE, PUBLIC :: HYPER_NUM
REAL(8) :: x(x_next-1)
END TYPE HYPER_NUM
PUBLIC OPERATOR (+)
INTERFACE OPERATOR (+)
MODULE PROCEDURE ADD_DN
END INTERFACE
CONTAINS
ELEMENTAL FUNCTION ADD_DN(u,v) RESULT(res)
TYPE (HYPER_NUM), INTENT(IN)::u,v
TYPE (HYPER_NUM)::res
res%x = u%x+v%x
END FUNCTION ADD_DN
```

*** However, It may be beneficial (performance wise) to shy away from a single type (OOP) as opposed to an aggregation of types.

```
ENUM, BIND(C)
ENUMERATOR :: x_r1=1,x_h1,x_h2,x_h3,x_next
END ENUM
INTEGER, PARAMETER :: HYPER_NUM_SIZE = x_next-1
...
REAL(8), ALLOCATABLE :: COLLECTION_A(:,:) ! either (nObjects, HYPER_NUM_SIZE) or (HYPER_NUM_SIZE, nObjects)
REAL(8), ALLOCATABLE :: COLLECTION_B(:,:) ! either (nObjects, HYPER_NUM_SIZE) or (HYPER_NUM_SIZE, nObjects)
...
! equivilent to operator(+) with both collections
COLLECTION_A = COLLECTION_A + COLLECTION_B
```

*** the order of the subscripts will be dependent upon the algorithms used in your program. One is likely more efficient at vectorization than the other.

Using the dual subscript approach, it is almost trivial to experiment with which order of subscripting is best.

Jim Dempsey

MariusR

Novice

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

07-11-2020
02:22 AM

573 Views

Hello Jim

thank you for your reply. I will try out this kind of data typ (enum), which is new to me. To implement the dual or hyper numbers I used the first best method that I understood => Derived Type. Of course it doesn't have to be the fastest and I'm open for suggestions

And yes, the (*), (/) as well as a lot of other operations have been extended to take the derived type ((**), (log), (exp) , ....). The formula for multiplication and division reads:

ELEMENTALFUNCTIONMULT_DN(u,v)RESULT(res)TYPE(DUAL_NUM),INTENT(IN)::u,vTYPE(DUAL_NUM)::res res%x_rl = u%x_rl*v%x_rl res%x_di= u%x_di*v%x_rl + u%x_rl*v%x_diENDFUNCTIONMULT_DNELEMENTALFUNCTIONDIV_DN(u,v)RESULT(res)TYPE(DUAL_NUM),INTENT(IN)::u,vTYPE(DUAL_NUM)::res res%x_rl = u%x_rl/v%x_rl res%x_di =(u%x_di- res%x_rl*v%x_di)/v%x_rlENDFUNCTIONDIV_DN

The general formula for scalar functions like exp or log is:

f(dual_number)%x_rl=f(dual_number%x_rl) => Function evaluated at the real-part

f(dual_number)%x_di=dual_number%di*f'(dual_number%x_rl) => Derivative of the function evaluated on the real part times the dual_part

At the moment I'm very busy with exams, so please don't think that I'm ignoring your suggestions just because I need some time to answer.

Greetings

Marius

Steve_Lionel

Black Belt Retired Employee

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

07-11-2020
05:19 AM

566 Views

jimdempseyatthecove

Black Belt

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

07-11-2020
07:07 AM

561 Views

In your sample code, you have explicitly declared the operator functions as ELEMENTAL. IOW, your intention is to use these functions in an elemental manner (on arrays of your type).

I suggest instead you declare a TYPE that is a collection of these objects, where each property of each object resides in an array. IOW the collection contains an array of x_rl, x_di, ...

Then your operator functions operate on the collections.

If you attribute the collection (TYPE) with TARGET, then, if you desire, you may be able to create a getter/putter that presents a scalar representation of your current type (at a given index into the collection).

Something along the line of this (untested) code:

```
TYPE DUAL_NUM_COLLECTION
REAL(8), ALLOCATABLE :: x_rl(:), x_di(:)
CONTAINS
FUNCTION ALLOC(N) RESULT(R)
INTEGER :: N, R
ALLOCATE( x_rl(:), x_di(:), STAT=R)
END FUNCTION ALLOC
INTERFACE OPERATOR(*)
FUNCTION DUAL_NUM_COLLECTION_MUL(A, B) RESULT(R)
TYPE(DUAL_NUM_COLLECTION), INTENT(IN) :: A, B
TYPE(DUAL_NUM_COLLECTION) :: R
END FUNCTION DUAL_NUM_COLLECTION_MUL
END INTERFACE
...
FUNCTION DUAL_NUM_COLLECTION_MUL(A, B) RESULT(R)
TYPE(DUAL_NUM_COLLECTION), INTENT(IN) :: A, B
TYPE(DUAL_NUM_COLLECTION) :: R
R%x_rl = A%x_rl * B%x_rl
R%x_di = A%x_di * B%x_rl + A%x_rl * B%x_di
END FUNCTION DUAL_NUM_COLLECTION_MUL
```

You would have to add DUAL_NUM_COLLECTION member functions GET and PUT that fetch and store an individual type of DUAL_NUM (think of this as gather and scatter).

Jim Dempsey

MariusR

Novice

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

07-14-2020
02:25 AM

525 Views

thank you very much for this suggestion! It introduces me to the type-bound-procedures, which seems to be a very nice feature!

To collect the real- and dual parts into arrays instead to do the other way and collect dual_numbers into arrays is a good idea, thank you!

But when I tried to implement this, I ran into some problems. I don't know the rank of my arrays in advance. Is there a way to define a derived type in an assumed-rank-manner? The aim is to do something like this:

```
program test
type(dual_num) :: a
```

! a contains a type-bound-procedure to allocate its sub-parts

call a%alloc(0) ! => a%x_rl and a%x_di will be scalars

call a%alloc(N1,N2) ! => a%x_rl and a%x_di will be arrays with N1xN2 - dimension

end program test

I searched quite a while and didn’t find an answer, except a statement from Steve Lionel: "It's not possible to have an assumed rank derived type component."

I do not want to define several different derived types in order to accomplish this.

Edit.: Maybe you adressed this problem with some of your statements, sry if i didn't get it

jimdempseyatthecove

Black Belt

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

07-14-2020
07:55 AM

509 Views

Perhaps you can consider using a generic interface:

or

Use CLASS to hold a polymorphic TYPE (to use these requires SELECT TYPE)

*** The title of this thread is:

But what you are asking to do now is hide the details by introducing an abstraction layer. This generally means making the runtime code figure out what to do (select what to do at runtime). And this is counter-productive to your stated goal.

>>* I don't know the rank of my arrays in advance.*

Can you work with a limited number of ranks? Say 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 and then define each a type. Then use the generic interface to deal with the abstraction disambiguation.

You can also consider using an allocatable array of size of the desired rank containing a pointer to a type containing an allocatable rank 1 array.

Note, all these indirections can (will?) introduce additional runtime overhead.

Jim Dempsey

Andrew_Smith

New Contributor II

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

07-16-2020
08:41 AM

471 Views

This technique is well known and often called AD.

There are some commercial libraries available (NAG offer one for Fortran).

I have been using this for many years and have similar code to yours. One modification is to have the differential part as a vector so that you can differentiate with respect to many variables at once. The calculation of the vector term lends itself to vectorization optimsations by the compiler.

I have an outer code that defines the length of the vector, and an include file that defines the "dual math" object and implements all the functions. This allows some flexibility without having to use an allocatable vector which is very slow. I have a few pre-compiled lengths available 1, 2, 3, 4, 6. Then if I need a longer vector I go back to repeated calls.

I have tried using the new parameterized length but dropped it because of too many compiler issues in it early implementation of paramertized derived types. I am not sure what the performance would be like compared to fixed size and allocatable arrays.

MariusR

Novice

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

07-23-2020
08:20 AM

427 Views

Thank you all for your help. The idea to organize the individual types into Arrays, as suggested by Jim, gave a lot of performance back. It took quite a while to implement this, because it changed nearly everything, but it was worth it.

Another mistake i have made and not think about was that most of the time the computations doesn't need to be calculated whit the hyper_numbers. Due to this, a lot of computations were zero computations (0*0 and 0+0). I have now implemented a case differentiation, which helped a lot.

Topic Options

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page

For more complete information about compiler optimizations, see our Optimization Notice.