Hello,
The following program demonstrates a bug I found with arrays of derived types that have allocatable components.
module orb_idx_mod
    implicit none
    private
    public :: SpinOrbIdx_t, buggy_split, split, size

    !> We assume order [beta_1, alpha_1, beta_2, alpha_2, ...]
    type :: SpinOrbIdx_t
        integer, allocatable :: idx(:)
    end type

    interface size
        module procedure size_OrbIdx_t
    end interface

contains

    !> Split the spin orbitals into size(lengths) chunks
    !> It is assumed, that sum(lengths) == size(orbs)
    pure function buggy_split(orbs, lengths) result(res)
        type(SpinOrbIdx_t), intent(in) :: orbs
        integer, intent(in) :: lengths(:)
        ! Bug occurs if result is automatic array
        type(SpinOrbIdx_t) :: res(size(lengths))
        integer :: i, prev

        if (sum(lengths) /= size(orbs)) error stop

        prev = lbound(orbs%idx, 1)
        do i = lbound(lengths, 1), ubound(lengths, 1)
            res(i)%idx = orbs%idx(prev : prev + lengths(i) - 1)
            prev = prev + lengths(i)
        end do
    end function

    !> Split the spin orbitals into size(lengths) chunks
    !> It is assumed, that sum(lengths) == size(orbs)
    pure function split(orbs, lengths) result(res)
        type(SpinOrbIdx_t), intent(in) :: orbs
        integer, intent(in) :: lengths(:)
        ! Bug occurs if result is automatic array
        type(SpinOrbIdx_t), allocatable :: res(:)
        integer :: i, prev

        allocate(res(size(lengths)))
        if (sum(lengths) /= size(orbs)) error stop

        prev = lbound(orbs%idx, 1)
        do i = lbound(lengths, 1), ubound(lengths, 1)
            res(i)%idx = orbs%idx(prev : prev + lengths(i) - 1)
            prev = prev + lengths(i)
        end do
    end function

    integer pure function size_OrbIdx_t(orbs)
        type(SpinOrbIdx_t), intent(in) :: orbs
        size_OrbIdx_t = size(orbs%idx)
    end function

end module

program test_ifort_bug
    use orb_idx_mod, only: SpinOrbIdx_t, split, buggy_split
    implicit none
    type(SpinOrbIdx_t), allocatable :: splitted_orbs(:)
    integer :: i

    splitted_orbs = split(SpinOrbIdx_t([1, 2, 3, 4, 5, 6]), [1, 0, 3, 2])
    do i = lbound(splitted_orbs, 1), ubound(splitted_orbs, 1)
        write(*, *) splitted_orbs(i)%idx
    end do

    write(*, *) '========================================'
    write(*, *) 'In the next statement a segfault happens'
    write(*, *) '========================================'

    splitted_orbs = buggy_split(SpinOrbIdx_t([1, 2, 3, 4, 5, 6]), [1, 0, 3, 2])
    do i = lbound(splitted_orbs, 1), ubound(splitted_orbs, 1)
        write(*, *) splitted_orbs(i)%idx
    end do
end program
If I compile this with `ifort (IFORT) 18.0.5 20180823` the code crashes upon returning from `buggy_split` in the runtime library.
Program received signal SIGSEGV, Segmentation fault.
0x000000000040ba36 in do_deallocate_all ()
(gdb) s
Single stepping until exit from function do_deallocate_all,
which has no line number information.
0x0000000000413de0 in for.signal_handler ()
The crash appears even with debug options (`-O0 -check all -g -debug all`) but disappears with `-heap-arrays`!
If I compile the code with gfortran and run it in valgrind, I don't see anything suspicious.
I just compiled and ran this code successfully with ifort 19.1.0.166. Can you update your compiler to the current release?
$ ifort -O0 -check all -g -debug all orb.f90
$ a.out
1

2 3 4
5 6
========================================
In the next statement a segfault happens
========================================
1

2 3 4
5 6
Hello Barbara,
Thank you for your answer.
I spent a lot of time finding this problem in our code and narrowing it down to this minimal example, so I would like to know for sure whether this is:
- Valid code, ifort 18 had a bug, and it is patched in ifort 19.
- Valid code, ifort 18 had a bug, it is not patched in ifort 19 and we were lucky that it runs there.*
- Undefined behaviour and even if it's running under ifort 19, there will be dragons flying out of my nose in ifort 20. ;-)
Best,
Oskar
* It runs with `-heap-arrays` under ifort 18, which suggests a stack overflow (although the arrays were rather small). If it crashes with two more elements under ifort 19, that does not help me.
Just to confuse all of us, the Fortran compiler available in Parallel Studio XE 2020 is 19.1.0. That is the current release.
I can appreciate the effort it takes to come up with a small reproducer that exhibits the same problem. That's part of my job!
I see the SIGSEGV with 18.0.5. With 19.0.5 it runs to a normal completion, as it does with 19.1.0.
Seems to me like a problem was fixed.
Okidoki, thank you very much for your help.
This is probably a comment directed at Steve L.
I do not think your code is conformant Fortran.
The function buggy_split does not declare res as allocatable, and thus would assume that the output has the declared storage available (and potentially has the same shape and size(s) as specified in the function).
The return argument res cannot be a local automatic array (it would be on the stack of the function buggy_split and not available to the caller upon return).
Jim Dempsey
My thought is that buggy_split, as written, returns a blob of SpinOrbIdx_t values whose size is not known until runtime. IMHO what it needs to return is an array descriptor (external to the function). While the "automatic" (stack) result may work at times, it relies on the returned stack data being available post-call, and thus is unsafe.
It will be interesting to see if Steve L comments on this (he is a member of the Fortran standards committee).
Jim Dempsey
res is declared with a specification expression that is evaluated at the call site. The compiler then generates a temporary (on the stack, usually) of the appropriate size and passes the address of that temp as a hidden argument to the function. This requires an explicit interface, which is provided.
All perfectly conforming.
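To illustrate the point above, here is a minimal sketch of a function whose explicit-shape result is sized by a specification expression. The names `chunk_mod` and `chunk_starts` are hypothetical and not part of the original reproducer. The sketch only shows the language-level pattern; how the caller provides storage for the result (stack temporary, hidden argument) is a compiler implementation detail that is not visible in the source.

```fortran
module chunk_mod
    implicit none
    private
    public :: chunk_starts   ! hypothetical name, for illustration only

contains

    !> Return the start index of each chunk described by lengths.
    !> The result bound size(lengths) is a specification expression,
    !> so the caller knows the result shape from the explicit interface.
    pure function chunk_starts(lengths) result(starts)
        integer, intent(in) :: lengths(:)
        integer :: starts(size(lengths))
        integer :: i

        if (size(lengths) > 0) starts(1) = 1
        do i = 2, size(lengths)
            starts(i) = starts(i - 1) + lengths(i - 1)
        end do
    end function

end module

program demo_chunk_starts
    use chunk_mod, only: chunk_starts
    implicit none
    integer, allocatable :: starts(:)

    ! Intrinsic assignment to an allocatable array (re)allocates it
    ! to the shape of the right-hand side (Fortran 2003 semantics).
    starts = chunk_starts([1, 0, 3, 2])

    ! starts now holds the four values 1, 2, 2, 5
    write(*, *) size(starts)
    write(*, *) starts
end program
```

Because the function lives in a module, the `use` statement gives the caller the explicit interface it needs, which is the same situation as `buggy_split` in the reproducer.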