Solved: Re: Possible IFX bug: automatic allocation of polymorphic allocatables

Stijn_Schildermans · ‎10-18-2024

It seems that IFX does not properly handle automatic allocation for polymorphic allocatables. See the following code sample:

program test
    implicit none
           
    type, abstract:: a
        integer:: a_data
    end type
    type, extends(a):: t
        integer, dimension(10000):: t_data
    end type

    class(a), dimension(:), allocatable:: t_inst
    t_inst = init()
contains
    function init() result(l)
        class(a), dimension(:), allocatable:: l
        integer:: i

        allocate(t:: l(1000))
    end function
end program

On my system (IFX 2024.2.1), executing this code results in a segmentation fault. Even changing the declared type of l in the init procedure from class(a) to type(t) does not help.

It seems the space allocated for the variable is based on the declared type of the variable instead of the dynamic type of the expression. To my knowledge the standard instead requires the dynamic type of the variable (and therefore the amount of memory allocated to it) to match the dynamic type of the expression to which it is assigned.

I would like to know if this is indeed a bug or my interpretation of the Fortran standard regarding polymorphic allocatables is wrong.
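One way to test this reading of the standard without hitting the crash is to shrink the array and inspect the result after the assignment. The sketch below is a variant of the sample above. The program name, the mold variable and the element count of 10 are my own choices for illustration. The count assumes a typical default stack limit of a few MB.

```fortran
program check_dynamic_type
    implicit none

    type, abstract:: a
        integer:: a_data
    end type
    type, extends(a):: t
        integer, dimension(10000):: t_data
    end type

    class(a), dimension(:), allocatable:: t_inst
    type(t):: mold   ! only used as a type reference for same_type_as

    t_inst = init()

    ! After intrinsic assignment the variable should be allocated with the
    ! shape and dynamic type of the expression.
    print *, 'allocated:         ', allocated(t_inst)
    print *, 'size:              ', size(t_inst)
    print *, 'dynamic type is t: ', same_type_as(t_inst, mold)
    print *, 'bits per element:  ', storage_size(t_inst)
contains
    function init() result(l)
        class(a), dimension(:), allocatable:: l

        ! 10 elements of roughly 40 kB each, small enough for a default stack
        allocate(t:: l(10))
    end function
end program
```

If the dynamic type is reported as t and the element size matches that of t, the allocation follows the dynamic type of the expression, and the crash with 1000 elements has a different cause.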

IanH · ‎10-18-2024

You don't provide the command line arguments that you use or operating system, but that function call is returning an object of the order of 40 MB in size. By default, function results and other temporaries are returned on the stack, and 40 MB may be way bigger than the default stack size.

Make sure you are compiling with /heap-arrays:0 (-heap-arrays 0 on Linux) and/or have set the stack size appropriately.
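The 40 MB estimate can be checked with the storage_size intrinsic. This is a minimal sketch that reuses the type definitions from the original sample. The program and variable names are illustrative, and the figure assumes 4-byte default integers with no padding.

```fortran
program result_size
    use, intrinsic:: iso_fortran_env, only: int64
    implicit none

    type, abstract:: a
        integer:: a_data
    end type
    type, extends(a):: t
        integer, dimension(10000):: t_data
    end type

    type(t):: elem
    integer(int64):: bytes

    ! storage_size returns the element size in bits. The function result
    ! in the original sample holds 1000 such elements.
    bytes = 1000_int64 * storage_size(elem, kind=int64) / 8
    print *, 'bytes in function result: ', bytes
end program
```

With 10001 four-byte integers per element this comes to about 40 MB, well above the 8 MB stack limit that is a common default on Linux.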

Stijn_Schildermans · ‎10-21-2024

I am on Arch Linux and am not using any options.

Adding -heap-arrays does indeed fix the problem. However, if I change the declared type of t_inst and l to type(t), the problem does not occur to begin with. This makes very little sense to me. It seems that IFX realizes that the array is too large to be passed through the stack when you declare the concrete type of the variable the function result is assigned to, but not when you use a polymorphic type for said variable. I see no reason why this distinction should exist.

Furthermore, if I reduce the size of l so that I do not exceed the stack limit, Valgrind tells me that both t_inst and l are allocated on the heap. This would then mean that l is allocated on the heap, then copied over to the stack, after which t_inst is allocated on the heap and the return value of init is copied over again from the stack to the heap. This seems wasteful to me. Is there a good reason for this that I am missing?

IanH · ‎10-21-2024

How good the reasons are can be debated, but in the general case the semantics of the language require:

With assignment to a polymorphic allocatable t_inst, the compiler needs to have completely evaluated the expression on the right-hand side of the assignment statement before it can decide whether t_inst is to be reallocated to match the dynamic type of the expression, or just copied over to an existing allocation. For your example there is no existing allocation so there's no reallocation, but that's not the general case.
After copying the value of the expression on the right-hand side of the assignment statement to t_inst, the compiler may need to call finalizers on the function result. If the function result is not polymorphic the compiler knows everything about the type of the result (including whether it has finalizers or not), but if the function result is polymorphic then the compiler may not know until runtime whether the result has a finalizer. For your example there are no finalizers anywhere to be seen, but that's not the general case.
The assignment statement may be a defined assignment and not just a simple "copy the bytes". Again, if there are polymorphic variables involved then the compiler may not know whether defined assignment is involved until runtime. For your example there is no defined assignment, but that's not the general case.
The expression on the right hand side of the assignment may involve more than just a simple single function evaluation - e.g. there might be defined operations on the function result prior to the assignment, that perhaps require temporaries in the general case. For your example the compiler could determine the absence of defined operations by simple inspection of the expression, but maybe the compiler was scared by relatively (for Fortran) new language features such as polymorphic allocatable function results and had its eyes closed.
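As a small illustration of the first point, here is a sketch in which the dynamic type of a polymorphic function result depends on a runtime condition, so the compiler cannot know at compile time how t_inst-style assignment has to allocate. The types small and big, the function make, and the use of command-line arguments as the runtime condition are all hypothetical and not taken from the thread.

```fortran
program runtime_type
    implicit none

    type, abstract:: a
        integer:: a_data
    end type
    type, extends(a):: small
    end type
    type, extends(a):: big
        integer, dimension(100):: payload
    end type

    class(a), dimension(:), allocatable:: x

    ! The dynamic type of the result depends on whether the program was
    ! started with any command-line arguments.
    x = make(command_argument_count() > 0)

    select type (x)
    type is (small)
        print *, 'x has dynamic type small'
    type is (big)
        print *, 'x has dynamic type big'
    end select
contains
    function make(want_big) result(r)
        logical, intent(in):: want_big
        class(a), dimension(:), allocatable:: r

        if (want_big) then
            allocate(big:: r(4))
        else
            allocate(small:: r(4))
        end if
    end function
end program
```

Running the program with and without an argument selects a different branch, even though the assignment statement itself is unchanged.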

I suspect the presence of the seemingly pointless temporary and the associated stack use is due to the compiler making allowances for these sorts of things, perhaps accommodating some sort of pathological combination. A smarter compiler might do better.

Your changes from polymorphic to non-polymorphic can make it more apparent to the compiler, at compile time, what the compiler may need to accommodate.
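If the temporary is a concern, one alternative that was not suggested in this thread is to fill the variable through a subroutine with an allocatable intent(out) dummy argument instead of a function result. The sketch below rewrites the original sample that way. It assumes the caller can be changed to use a call statement. With no function result and no assignment involved, there is nothing to copy.

```fortran
program init_by_subroutine
    implicit none

    type, abstract:: a
        integer:: a_data
    end type
    type, extends(a):: t
        integer, dimension(10000):: t_data
    end type

    class(a), dimension(:), allocatable:: t_inst

    ! The allocation happens directly on t_inst through the dummy argument.
    call init(t_inst)
    print *, allocated(t_inst), size(t_inst)
contains
    subroutine init(l)
        ! intent(out) deallocates the actual argument on entry if needed
        class(a), dimension(:), allocatable, intent(out):: l

        allocate(t:: l(1000))
    end subroutine
end program
```

The trade-off is that the call can no longer appear inside a larger expression, which is exactly the generality that the function form has to pay for.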

Stijn_Schildermans · ‎10-22-2024

That clears things up quite a bit. Thanks a lot!