Solved: allocate a pointer with source of a constructed derived type leads to memory chaos

Roadelse · ‎06-06-2023

I am doing some OOP in Fortran with unlimited polymorphic entities and ran into a strange memory-related runtime error. Here is the extracted code:

module m1
implicit none

type :: node
  class(*), pointer :: item => null()
  contains
    final :: destructor
    procedure :: node_print
end type

contains
function node_init(v) result(rst_node)
  class(*), intent(in) :: v
  type(node) :: rst_node

  allocate(rst_node%item, source = v)
end function

subroutine destructor(this)
    implicit none
    type(node), intent(inout) :: this

    if (associated(this%item)) then
      call this%node_print
      call cprint_p(this%item)
      deallocate(this%item)
    end if
end subroutine

subroutine node_print(this)
    implicit none
    ! ............................ Argument
    class(node), intent(in) :: this

    ! ............................ main body
    select type (v => this%item)
        type is (integer)
            print *, v
        type is (real(kind=4))
            print *, v
        type is (real(kind=8))
            print *, v
        type is (logical)
            print *, v
        type is (character(*))
            print *, v
        class default
            stop 1324
    end select

end subroutine

end module

program main
use m1
implicit none

type(node),pointer :: a, b

allocate(a, source=node_init('hello'))
call cprint_p(a%item)

allocate(b, source=node_init('acs'))
call cprint_p(b%item)

call a%node_print

end program

#include<stdio.h>

void cprint_p_(char *p) {
    printf("address for char*p is %p, for itself is %p\n", p, &p);
    return;
}

Here I use a C function to print the address of a pointer's target and of the pointer argument itself. I know this is not a strictly standard way to pass an unlimited polymorphic variable (the compiler warns), but it seems to work with ifort (gfortran, in contrast, reports a compilation error).

After building and running, the output is:

 hello
address for char*p is 0x15ee380, for itself is 0x7ffffa6079b0
address for char*p is 0x15ee380, for itself is 0x7ffffa607c70
 acs
address for char*p is 0x15ee380, for itself is 0x7ffffa6079b0
address for char*p is 0x15ee380, for itself is 0x7ffffa607c70
 acslo

The "a%node_print" statement prints "acslo", which looks as if "acs" has overwritten the first 3 characters of "hello".

Moreover, it seems that the item pointer in the "source=node_init(v)" result and "this%item" in both a and b all point to the same address (assuming the method of calling the C function is valid).

My guess was that the "allocate" statement with "source" just creates another pointer that points to the same address as the one in the source variable. Thus, when the source variable is finalized and its item is deallocated, the allocated node%item is left pointing at freed memory. Then b%item overwrites the memory content, leading to the results above.

However, if I compile the code with gfortran (after removing the calls to cprint_p), the results are all fine, and the node finalizer (final :: destructor) is no longer triggered.

Also, if I change "pointer" to "allocatable" within type node, it works.
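
The allocatable case can be checked in isolation. The following is a minimal sketch, not part of the original program: the type name `anode` and the program name are hypothetical, and there is no finalizer. It assumes the standard rule that sourced allocation copies an allocatable component by value, so each object owns its own item.

```fortran
program allocatable_copy_demo
  implicit none

  ! Hypothetical stand-in for node, with an allocatable item
  type :: anode
    class(*), allocatable :: item
  end type

  type(anode) :: src
  type(anode), pointer :: dst

  allocate(src%item, source='hello')
  allocate(dst, source=src)      ! dst%item gets its own copy of the value
  deallocate(src%item)           ! releases only the copy owned by src

  ! dst%item is still allocated and still holds the character value
  select type (v => dst%item)
  type is (character(*))
    print *, v
  end select

  deallocate(dst)                ! also releases dst%item
end program allocatable_copy_demo
```

Because the copy is independent, releasing the source (by hand here, or through a finalizer in the real code) should not affect the newly allocated object.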

Now I wonder whether I have understood this issue correctly, and whether there is any official documentation for this difference in behavior compared to gfortran?

IanH · ‎06-07-2023

Step through what your program is doing, and you will see the problem...

The first allocate statement for `a` is executed. As part of that, the node_init function is called. It allocates the item component of the function result. Then the allocate statement sets the value of a to be the same as the function result - this means that a%item will be associated with the same thing allocated in the node_init function. After the allocate statement has been executed, the finalizer (called `destructor` here) is called on the function result - it prints some stuff, and then deallocates the thing that both the function result's item component and the a%item copy are pointing to. a%item is therefore an undefined pointer - the thing it was associated with has GONE.

We then dereference a%item using some weird C function. Uh oh... Clouds form. Demons stir. Dragons awake. This program is in trouble.

We then do all that again, this time with `b`! The demons ride the dragons through the program's address space, causing chaos and confusion.

And then to top it off, we lastly call a%node_print, and dereference undefined a%item again. Nonsense results....
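
The aliasing described above can be seen in a stripped-down form. This is a minimal sketch rather than the original program: the type name `box` and the program name are hypothetical, there is no finalizer, and the source is an ordinary variable instead of a function result, so the deallocation that the finalizer would perform is done by hand.

```fortran
program shallow_copy_demo
  implicit none

  ! Hypothetical stand-in for node, with a pointer item
  type :: box
    integer, pointer :: item => null()
  end type

  type(box) :: src
  type(box), pointer :: dst

  allocate(src%item, source=42)
  allocate(dst, source=src)      ! copies the pointer association, not the target

  ! Both components now refer to the same integer
  print *, associated(dst%item, src%item)

  deallocate(src%item)           ! what the finalizer does to the temporary
  ! dst%item is now undefined: it must not be referenced or tested
  nullify(dst%item)              ! give it a defined (disassociated) status again
  deallocate(dst)
end program shallow_copy_demo
```

The `associated` inquiry should report true before the deallocation. After it, the only safe things to do with `dst%item` are to nullify it or point it at something else, which is exactly what the original program does not do before handing a%item to the C function and to node_print.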

Roadelse · ‎06-07-2023

Thank you for your reply!

So my previous guess was close.

Besides, it seems that gfortran does not call the finalizer after the allocate statement, which is why the program works there (without the call to the C function).

jdelia · ‎06-07-2023

By the way, I inadvertently compiled a slightly different version of your test with coarrays (removing the calls to cprint_p) and ran it with 2 images, but 4 echoes appear in the printout instead of 2 (please see below). Why would that be?

$ cat /proc/version
Linux version 6.3.5-100.fc37.x86_64 (mockbuild@bkernel02.iad2.fedoraproject.org) (gcc (GCC) 12.3.1 20230508 (Red Hat 12.3.1-1), GNU ld version 2.38-27.fc37) #1 SMP PREEMPT_DYNAMIC Tue May 30 15:43:51 UTC 2023

$ ifort --version
ifort (IFORT) 2021.9.0 20230302
Copyright (C) 1985-2023 Intel Corporation.  All rights reserved.

$ ifort -warn all -coarray -O2 -o test199-ifort.exe -L${MKLROOT}/lib/intel64 -lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core -liomp5 -lpthread -ldl test199.f90

$ export FOR_COARRAY_NUM_IMAGES=2

$ test199-ifort.exe
 hello
 hello
 acs
 acs
 hello
 hello
 acs
 acs
         123
         123
         123
         123
   456.0000    
   456.0000    
   456.0000    
   456.0000    
   789.000000000000     
   789.000000000000     
   789.000000000000     
   789.000000000000     

$ ifx --version
ifx (IFX) 2023.1.0 20230320
Copyright (C) 1985-2023 Intel Corporation. All rights reserved.

$ ifx -warn all -coarray -O2 -o test199-ifx.exe -L${MKLROOT}/lib/intel64 -lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core -liomp5 -lpthread -ldl test199.f90

$ export FOR_COARRAY_NUM_IMAGES=2
$ test199-ifx.exe
 hello
 hello
 acs
 hello
 hello
 acs
 acs
         123
         123
 acs
         123
         123
   456.0000    
   456.0000    
   456.0000    
   456.0000    
   789.000000000000     
   789.000000000000     
   789.000000000000     
   789.000000000000

module m1
  implicit none
  type :: node
    class (*), pointer :: item => null()
    contains
    final :: destructor
    procedure :: node_print
  end type node
  !
contains
  !
  function node_init (v) result (rst_node)
    class (*), intent(in) :: v
    type(node) :: rst_node
    allocate(rst_node%item, source = v)
  end function node_init
  !
  subroutine destructor (this)
    implicit none
    type(node), intent (inout) :: this
    if (associated(this%item)) then
      call this % node_print
      !call cprint_p(this%item)
      deallocate (this%item)
    end if
  end subroutine destructor
  !
  subroutine node_print (this)
    implicit none
    class (node), intent(in) :: this
    select type (v => this%item)
      type is (integer)
        print *, v
      type is (real(kind=4))
        print *, v
      type is (real(kind=8))
        print *, v
      type is (logical)
        print *, v
      type is (character(*))
        print *, v
      class default
        error stop "1324"
    end select
    !
  end subroutine node_print
end module m1
!
program test
  use m1
  implicit none
  type (node), pointer :: aa, bb, cc, dd, ee
  !
  allocate (aa, source = node_init ('hello'))
  !call cprint_p (aa % item)
  call aa % node_print
  !
  allocate (bb, source = node_init ('acs'))
  !call cprint_p (bb % item)
  call bb % node_print
  !
  allocate (cc, source = node_init (123))
  !call cprint_p(cc % item)
  call cc % node_print
  !
  allocate (dd, source = node_init (456.0_4))
  !call cprint_p (dd % item)
  call dd % node_print
  !
  allocate (ee, source = node_init (789.0_8))
  !call cprint_p (ee % item)
  call ee % node_print
  !
end program test

IanH · ‎06-07-2023

There's a call to the node_print binding in the finalizer and immediately after each allocate statement in the main program. Two "echoes" per image by two images gives four echoes.

(This variant of the program continues to dereference an undefined pointer after the finalizer has deallocated the item component.)

jdelia · ‎06-07-2023

Oops... I missed the print inside the destructor.
Yes, you're right: this variant continues to dereference an undefined pointer after the finalizer has deallocated the item component. Thanks!

jimdempseyatthecove · ‎06-07-2023

The fact that an improperly written program works under in-house testing does not mean it will continue to work in a production environment. Your original code was accessing data that had been returned to the heap. Whether accessing memory locations returned to the heap yields the expected data is pure happenstance.

Jim Dempsey

FortranFan · ‎06-08-2023

@Roadelse ,

Note that Fortran has limitations when it comes to generic programming. Trying to use unlimited polymorphism to overcome these limitations, which may be your aim, is fraught. Nonetheless, you can consider the following design alternative if you still want to dig deeper into OOP and unlimited polymorphism:

module m
   interface
      subroutine cprint_p( p ) bind(C, name="cprint_p_")
         type(*), intent(in) :: p(..)
      end subroutine
   end interface
   type :: node
      private
      class(*), allocatable :: item
   contains
      procedure :: node_print
      procedure :: node_init
   end type
contains
   subroutine node_init(this, v)
      class(node), intent(inout) :: this
      class(*), intent(in) :: v
      this%item = v
   end subroutine
   subroutine node_print(this)
      class(node), intent(in) :: this
      select type (v => this%item)
         type is (integer)
            print *, v
            call cprint_p( v )
         type is (real(kind=4))
            print *, v
         type is (real(kind=8))
            print *, v
            call cprint_p( v )
         type is (logical)
            print *, v
            call cprint_p( v )
         type is (character(*))
            print *, v
            call cprint_p( v )
         class default
            stop 1324
      end select
   end subroutine
end module
   use m, only : node
   type(node) :: a, b
   call a%node_init('hello')
   call a%node_print()
   call b%node_init('acs')
   call b%node_print()
end

C:\temp>p.exe
 hello
address for char*p is 000000793B10F900, for itself is 000000793B10F8D0
 acs
address for char*p is 000000793B10F900, for itself is 000000793B10F8D0
