Solved: Does Fortran OOP decrease efficiency?

An_N_1 · ‎09-24-2015

I'm maintaining some simulation code with Intel Fortran 2015. Now I want to add some OOP features such as inheritance, polymorphism, allocatable scalars, etc. to make the project more manageable. My project is numerically intensive, so I wonder whether OOP will reduce efficiency much compared with the old code style (no derived types, all data in separate arrays in COMMON blocks; as you all know, old F77 code).

I think OOP will affect efficiency, though probably not by much, but I am not sure. If there is a penalty, how can I avoid it?

Apart from code management, do modules and derived types have other advantages? For example, are they easier to parallelize than COMMON blocks?

Thanks!

jimdempseyatthecove · ‎09-25-2015

Be careful when going to OOP that you do not decompose the data objects into the smallest of entities, for example a Point or a Particle (a point with properties). Doing so will, most often, ruin opportunities for the compiler to vectorize your code. I strongly suggest you devise your OOP structure at a larger scope. Consider constructing a container of Particles, where the particles can be stored internally in a format suitable for vectorization (Structure of Arrays), yet appear externally as an Array of Structures. Use member functions on the container for container-wide operations, not virtual particle objects.

Jim Dempsey
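
A minimal sketch of the container idea described above, not code from the thread. The names `particle_t`, `particles_t`, `init`, `advance` and `get`, and the position/velocity components, are all hypothetical. It assumes a simple update x = x + dt*v to stand in for the real per-particle calculation.

```fortran
module particle_container
   implicit none

   ! Hypothetical external view of one particle (Array of Structures style).
   type :: particle_t
      real :: x, v
   end type particle_t

   ! Hypothetical container: internal storage is Structure of Arrays.
   type :: particles_t
      real, allocatable :: x(:), v(:)
   contains
      procedure :: init
      procedure :: advance
      procedure :: get
   end type particles_t

contains

   subroutine init(this, n)
      class(particles_t), intent(inout) :: this
      integer, intent(in) :: n
      allocate(this%x(n), this%v(n))
      this%x = 0.0
      this%v = 1.0
   end subroutine init

   ! Container-wide operation: one call, one unit-stride loop over plain arrays.
   subroutine advance(this, dt)
      class(particles_t), intent(inout) :: this
      real, intent(in) :: dt
      integer :: i
      do i = 1, size(this%x)
         this%x(i) = this%x(i) + dt * this%v(i)
      end do
   end subroutine advance

   ! Hand out a single particle as a structure when a caller needs one.
   function get(this, i) result(p)
      class(particles_t), intent(in) :: this
      integer, intent(in) :: i
      type(particle_t) :: p
      p%x = this%x(i)
      p%v = this%v(i)
   end function get

end module particle_container

program container_demo
   use particle_container
   implicit none
   type(particles_t) :: ps
   type(particle_t) :: p
   call ps%init(1000)
   call ps%advance(0.5)      ! a single dispatch for the whole set of particles
   p = ps%get(10)
   print *, p%x, p%v
end program container_demo
```

The type-bound call happens once per container rather than once per particle, so the inner loop sees only contiguous real arrays. Whether a given compiler actually vectorizes it should be checked with its optimization report.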

Arjen_Markus · ‎09-25-2015

Without knowing more about the details of your program it is difficult to give advice. I will try anyway, but only based on very general principles.

First of all: modules and derived types can definitely be used to make the program easier to maintain, easier to read and understand. I think that is a big plus in all circumstances. One advantage of modules: they allow the compiler to check the argument lists of subroutine calls and such. A big disadvantage of COMMON blocks is that they are merely blocks of memory; you have to make sure yourself that the division into actual variables is consistent everywhere the block is used.

Second: even if you decide to use OOP features, like inheritance and type-bound procedures, you can always use the direct calls as well. Instead of:

call var%sub( ... )

you can use:

call sub( var, ... )

The advantage would be that in the second case you use the actual name of the subroutine, avoiding the potential run-time overhead of determining which version of the subroutine to call (dynamic dispatch). (This has been reported as a possible bottleneck.)
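
To make the two call forms concrete, here is a small sketch. The type `circle_t`, the binding `grow` and the specific procedure `grow_circle` are hypothetical names chosen for illustration.

```fortran
module shape_mod
   implicit none
   type :: circle_t                      ! hypothetical type
      real :: r = 1.0
   contains
      procedure :: grow => grow_circle   ! binding name => specific procedure
   end type circle_t
contains
   subroutine grow_circle(this, factor)
      class(circle_t), intent(inout) :: this
      real, intent(in) :: factor
      this%r = this%r * factor
   end subroutine grow_circle
end module shape_mod

program call_forms
   use shape_mod
   implicit none
   type(circle_t) :: c
   call c%grow(2.0)            ! type-bound form: call var%sub( ... )
   call grow_circle(c, 2.0)    ! direct form:     call sub( var, ... )
   print *, c%r
end program call_forms
```

Both calls reach the same subroutine here. The dispatch question only arises when the variable is declared with `class(...)`, so that its dynamic type is not known until run time.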

However, "premature optimisation is the root of all evil" (IIRC, a famous maxim by Knuth). In the case of OOP, you are probably better off choosing your objects right than trying to eliminate such potential overheads.

Third: parallelisation is orthogonal to OOP, modules and types or COMMON blocks. The Intel Fortran compiler has some excellent reporting facilities to help with parallelisation. See a recent thread I started regarding the use of DO CONCURRENT.
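
As an illustration of that orthogonality, a DO CONCURRENT loop can run over array components of a derived type just as it would over plain arrays. This is only a sketch; the type `field_t` and its components are hypothetical, and the arithmetic is a placeholder.

```fortran
program do_concurrent_demo
   implicit none
   type :: field_t                     ! hypothetical type
      real, allocatable :: a(:), b(:)
   end type field_t
   type(field_t) :: f
   integer :: i, n

   n = 1000
   allocate(f%a(n), f%b(n))
   f%a = 1.0
   f%b = 2.0

   ! DO CONCURRENT asserts that the iterations do not depend on each other.
   do concurrent (i = 1:n)
      f%a(i) = f%a(i) + 2.0 * f%b(i)
   end do

   print *, f%a(1), f%a(n)
end program do_concurrent_demo
```

Whether the loop is actually vectorized or parallelized depends on the compiler and its options; the compiler's reports are the place to check.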

An_N_1 · ‎09-25-2015

Thanks a lot, arjenmarkus, for the very valuable advice!

As you mentioned:

Second: even if you decide to use OOP features, like inheritance and type-bound procedures, you can always use the direct calls as well. Instead of:

call var%sub( ... )

you can use:

call sub( var, ... )

I have a lot of objects with the same procedures, which suits OOP quite well. The type-bound procedure looks very neat in the form typename%procedure. With a direct call, it seems that inheritance would not make sense?

Thanks again for the great sentence "premature optimisation is the root of all evil".

And I will read your thread about parallelisation.

Arjen_Markus · ‎09-25-2015

You need the form "call var%sub(...)" to take advantage of inheritance, but you could define a generic interface, if your variables are not polymorphic.
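
A sketch of the generic-interface alternative for non-polymorphic variables. The types `model_a_t` and `model_b_t` and the generic name `update` are hypothetical; the compiler picks the specific procedure at compile time from the declared type of the argument.

```fortran
module models_mod
   implicit none
   type :: model_a_t                 ! hypothetical model types
      real :: value = 0.0
   end type model_a_t
   type :: model_b_t
      real :: value = 0.0
   end type model_b_t

   ! One generic name, resolved at compile time by argument type.
   interface update
      module procedure update_a, update_b
   end interface update
contains
   subroutine update_a(m)
      type(model_a_t), intent(inout) :: m
      m%value = m%value + 1.0
   end subroutine update_a

   subroutine update_b(m)
      type(model_b_t), intent(inout) :: m
      m%value = m%value + 2.0
   end subroutine update_b
end module models_mod

program generic_demo
   use models_mod
   implicit none
   type(model_a_t) :: a
   type(model_b_t) :: b
   call update(a)      ! resolves to update_a
   call update(b)      ! resolves to update_b
   print *, a%value, b%value
end program generic_demo
```

This gives a uniform call syntax without run-time dispatch, at the cost of not supporting polymorphic (`class`) variables or overriding through inheritance.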

By the sound of it, a first step would be to introduce a bunch of more modern features.

John_Campbell · ‎09-25-2015

As you are contemplating changing your F77 simulation code, this gives you a good opportunity to review and redesign your data structures.

I have found that changing from lots of variables and arrays in COMMON to derived types in modules allows you to apply some hierarchical structure to your data. As noted above, having a clearly defined data structure can help with maintenance. Another benefit of the restructure is being able to expand the simulation capabilities and improve those areas that were not well addressed in the previous approach. My experience may be in a different field from yours, but I have a materials handling simulation package which has been developed since 1978 and underwent a major restructure to modules and derived types in 2005, for significant gains. It was well worth the change.

You also need to map out a staged approach, so that you can validate the changes that are being made and not just restructure a model with many years of development in one big step.

As to how far you go to changing the data structures and adopting OOP approaches depends on the benefits you see from this change. You will need to identify operations that relate to the data structures your simulation requires. You may get a variety of views on this but it is worth experimenting to identify what is possible. Try to identify the benefits as you proceed.

Regarding efficiency, there are typically only a few areas of the simulation that are critical to program performance, but for most other areas, clarity and maintainability are more important.

dboggs · ‎09-25-2015

As an old-timer F77 programmer who is enthusiastic--yet struggling--about adopting at least some of the modern conventions, I too have concerns about efficiency, so I was excited to see An's original question posted here about speed and efficiency.

Alas, nearly all of the responses address the benefits in maintenance, clarity etc. and not speed/efficiency. I am very much aware of some of the benefits of modules, derived data structures, and (especially) the replacement of COMMON, and I find this easy to do with immediate payoffs. But my concerns about speed remain.

While I agree in general with John that "for most other areas, clarity and maintainability are more important", this is certainly not always true. I am keenly aware of a couple of new codes that were developed and then posted for public or semi-public use; they failed to gain acceptance because "they ran too slow." I believe the culprit in both cases was that they were coded in Matlab instead of Fortran. Yes, speed can still be a huge issue, yet people, especially the young generation of programmers, make it play second fiddle or even completely ignore it in favor of either "the language they were taught in school" or "a more 'modern' language better suited to doing this or that." Yes, in many cases they are right, but I for one am getting a little fed up with the difficulty these days of getting new efficient code past corporate and growing inertia of modern programmers.

I do recognize the use for modern programming in many, many cases but fear that speed is being forgotten about. In my own programs I have adopted only a few of the modern principles. But so far I have stopped short of going the complete OOP route, largely out of fear that it will slow things down. So I would really like to see that issue addressed, sans the lecturing about readability, maintainability, and so on.

Does anybody have any hard-core data on the speed issue, e.g. a large computationally intensive program that was converted from an old-school style to a modern OOP style, for which comparative timings were made?

jimdempseyatthecove · ‎09-25-2015

Consider reading: https://www.cs.colostate.edu/~cs560/Fall2015/Lectures/ColfaxSlides.pdf

(or Google n-body soa aos and look down a few lines for the Colfax [PDF])

And on the second page a PDF by Shiller

http://indico.cern.ch/event/337567/session/1/contribution/46/attachments/660801/908311/talk.pdf

Though the above is in C++, you can use the same philosophy in Fortran.

FWIW, the current trend in CPU design is not to increase the clock speed but rather to increase the SIMD register width (small vector instructions). Therefore, "future proofing" your code today means coding for vectorization (unless you are not concerned about performance).

Jim Dempsey

An_N_1 · ‎09-26-2015

Thanks for all the remarkable discussion!

The simulation program has been developed since the 1970s, moving through NDP Fortran and Lahey Fortran to Intel Fortran now. It still works in a certain area. We even use MPI to accelerate it on a PC cluster, and are thinking about trying OpenMP or CAF, etc. So, as dboggs said, performance is also very important.

The reason for adding a little OOP structure is to make the code a little more modular, so that it is much easier for more people to work on it, especially when adding new dynamic models to the simulation. I do know the old-style code can do that kind of thing well too, but OOP seems more "beautiful and graceful"; if it is worth a try, why not? And truly, "why not" is my question.

I will not convert all the old code to OOP at once, which would be too much work and unreliable. I will do it for the new models to be added first and then try the rest, just as John said.

Jim's advice about vectorization is the one thing that puzzles me. For the old code based on arrays, vectorization is straightforward. If the code is packed into classes, what will happen to vectorization? For example:

Old code usually looks like this:

! For a model (element type assumed to be real for illustration):
real :: array1(n), array2(n)   ! ... and further arrays
do i = 1, n
   ! Calculate with array1(i), array2(i); in fact the calculation
   ! is far more complex than this.
end do

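The OOP version might look like this:

module xxx
   implicit none
   type :: typModel
      real :: var1      ! component types assumed for illustration
      real :: var2
   contains
      procedure :: cal
   end type typModel
contains
   subroutine cal(this)
      class(typModel), intent(inout) :: this
      ! model calculation using this%var1 and this%var2
   end subroutine cal
end module xxx

type(typModel) :: model(n)
do i = 1, n
   call model(i)%cal()
end do

Is this what Jim calls the "smallest of entities", and does it lose the chance to be vectorized?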

TimP · ‎09-26-2015

do i = 1, n
   model(i)%cal = ???
end do

is far less likely to vectorize successfully than

do i = 1, n
   model%cal(i) = ???
end do

or

model%cal = ???
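
The difference between the two forms lies in how the type is declared. The sketch below spells that out; the type names `cell_t` and `cells_t`, the extra component `other`, and the assigned values are hypothetical placeholders, since the right-hand sides in the post are left open.

```fortran
program layout_demo
   implicit none
   integer, parameter :: n = 1000

   ! Array of Structures: each element carries all its components, so
   ! successive values of cal are separated in memory by the other fields.
   type :: cell_t
      real :: cal = 0.0
      real :: other = 0.0
   end type cell_t

   ! Structure of Arrays: each component is itself a contiguous array.
   type :: cells_t
      real :: cal(n) = 0.0
      real :: other(n) = 0.0
   end type cells_t

   type(cell_t)  :: aos(n)
   type(cells_t) :: soa
   integer :: i

   do i = 1, n
      aos(i)%cal = real(i)      ! the model(i)%cal form: strided access
   end do

   do i = 1, n
      soa%cal(i) = real(i)      ! the model%cal(i) form: unit stride
   end do

   soa%cal = 2.0 * soa%cal      ! the whole-array form

   print *, aos(n)%cal, soa%cal(n)
end program layout_demo
```

With more components per element, the stride in the first loop grows, which is one reason it tends to vectorize less well. The compiler's vectorization report is the way to confirm what happens for a particular loop.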