IFX: Segfault in allocation library with qopenmp

martinmath · 12-04-2023

Compiling the following code with "ifx -traceback -qopenmp -O2 segfault_alloc.f90 -o segfault_alloc" and executing segfault_alloc ends with a segmentation fault at the line "deallocate(u%val)".

The ifx version used is "ifx (IFX) 2024.0.0 20231017".

module mod

implicit none
public

type :: string
   character(len=:), allocatable :: c
end type string

type :: t
   class(*), pointer :: val => null()
end type t

end module mod

!------------------

program segfault_alloc

use mod

implicit none

type(t), allocatable :: u

call test(u)
deallocate(u%val)
deallocate(u)

contains

subroutine test(x)
   type(t), allocatable, intent(out) :: x
   type(string) :: str

   str%c = 'str123'
   allocate(x)
   call alloc_p(str, x%val)
end subroutine test


subroutine alloc_p(from, to)
   class(*),          intent(in)  :: from
   class(*), pointer, intent(out) :: to

   select type(from)
   class is (string)
      allocate(to, source=from)
   end select
end subroutine alloc_p

end program segfault_alloc

The code has been reduced from a large project and does not make much sense in this reduced form. The segfault is also rather fragile: one small change (such as omitting the command-line option "-traceback") and it is gone. However, as long as no essential changes are made, valgrind consistently reports "conditional jump depends on uninitialised value" and "invalid read of size..." errors in "process_allocation_records_deallocate", probably from the runtime libraries, even when the program does not abort with a segfault.

Furthermore, I cannot really pin the error on any specific ingredient. My impression is that changing the code changes the stack layout (valgrind: "Uninitialised value was created by a stack allocation"), and that this in turn determines whether segfaults or valgrind errors are triggered. So hopefully this serves as a suitable reproducer!

PS: Note that this code (and all of the variations I have tried) runs flawlessly with ifort as well as gfortran. valgrind never complained.

Ron_Green · 12-04-2023

I think you may be correct about a problem when stack allocation is used (-qopenmp and -traceback being suspicious clues).

I am writing up a bug report on this.

Ron_Green · 12-04-2023

The bug ID is CMPLRLLVM-54166.

Ron_Green · 01-25-2024

Martin,

We have isolated and root caused this bug. We are working on a fix for the 2024.2 release which will come out mid-year.

UNTIL then I can offer you a workaround.

What I am about to share is an undocumented compiler option. I have tried this with 2024.0 and a prerelease version of 2024.1. It works for both compiler versions.

Undocumented options are to be used only for special cases. Once the fix for this bug is released, you should stop using the following option.

Also, we do not recommend using undocumented options for production code releases, although in the end it's your choice.

That said:

Linux:

-switch disable_templ_xdesc_in_rtn

Windows:

/switch:disable_templ_xdesc_in_rtn

I have tested this with the following results, first before the option, then after using the option:

$ rm a.out ; ifx -what -V -qopenmp -g -traceback -O0 repro.f90 ; ./a.out
Intel(R) Fortran Compiler for applications running on Intel(R) 64, Version 2024.0.0 Build 20231017
Intel(R) Fortran 24.0-1238.2
GNU ld version 2.39-9.fc38
allocate STAT is 0
forrtl: severe (174): SIGSEGV, segmentation fault occurred
Image      PC                Routine            Line     Source
libc.so.6  00007F94A585FB70  Unknown            Unknown  Unknown
a.out      000000000040AF02  Unknown            Unknown  Unknown
a.out      000000000040A131  Unknown            Unknown  Unknown
a.out      0000000000405346  segfault_alloc     28       repro.f90
a.out      000000000040521D  Unknown            Unknown  Unknown
libc.so.6  00007F94A5849B4A  Unknown            Unknown  Unknown
libc.so.6  00007F94A5849C0B  __libc_start_main  Unknown  Unknown
a.out      0000000000405135  Unknown            Unknown  Unknown
$

$ rm a.out ; ifx -what -V -qopenmp -g -traceback -O0 -switch disable_templ_xdesc_in_rtn repro.f90 ; ./a.out
Intel(R) Fortran Compiler for applications running on Intel(R) 64, Version 2024.0.0 Build 20231017
Intel(R) Fortran 24.0-1238.2
GNU ld version 2.39-9.fc38
allocate STAT is 0
SUCCESS, no segfault

program segfault_alloc

use mod

implicit none

type(t), allocatable :: u

call test(u)
!..... this deallocate will cause segfault.....
deallocate(u%val)
!..... this deallocate will not cause segfault
deallocate(u)
print*, "SUCCESS, no segfault"

contains

subroutine test(x)
   type(t), allocatable, intent(out) :: x
   type(string) :: str

   str%c = 'str123'
   allocate(x)
   call alloc_p(str, x%val)
end subroutine test


subroutine alloc_p(from, to)
   class(*),          intent(in)  :: from
   class(*), pointer, intent(out) :: to
   integer :: err

   select type(from)
   class is (string)
      allocate(to, source=from, stat=err)
      print*, "allocate STAT is ", err
   end select
end subroutine alloc_p

end program segfault_alloc

martinmath · 02-09-2024

Thanks for the work-around, I somehow missed that reply. I will give it a try.

martinmath · 04-24-2024

I tested with this work-around option (using the recently released version) and it works. It is the first time that I have been able to run our software with ifx. Performance looks mostly good as well. Two geometrical tasks are 15% and 40% slower (Delaunay mesh and kdtree-based neighbourhood graph computation), but these are minor tasks. The runtime of the core numerical code is on par with ifort.

But I was mostly surprised by the huge (8x!) performance improvement in IO. Has there been any particular optimisation that causes this big improvement? Is some kind of buffering plus an asynchronously performed close involved?

We mostly avoid calling write and instead use buffering and write only big chunks, because the write statement used to be really slow in both ifort and gfortran. Moreover, we already use the ifort buffering option with a big buffer in the open statement.

In some performance tests I did recently, IO time was mostly lost in the close statement for both gfortran and ifort. Evidently, the runtime was waiting for the system to actually flush the buffered data.
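
For readers unfamiliar with the "buffer and write only big chunks" approach mentioned above, here is a minimal sketch of the idea. It is not our actual code: the module name chunked_writer, the type writer_t and its procedures are made up for illustration, and it assumes plain real(real64) data written to an unformatted stream file. It uses only standard Fortran and leaves out any compiler-specific buffering options in the open statement.

```fortran
module chunked_writer
   ! Hypothetical helper for illustration only: values are collected in
   ! memory and one WRITE is issued per full buffer, not one per value.
   use, intrinsic :: iso_fortran_env, only: real64
   implicit none
   private
   public :: writer_t

   type :: writer_t
      integer :: unit = -1
      integer :: n = 0                       ! number of values currently buffered
      real(real64), allocatable :: buf(:)
   contains
      procedure :: open  => writer_open
      procedure :: put   => writer_put
      procedure :: close => writer_close
   end type writer_t

contains

   subroutine writer_open(self, filename, capacity)
      class(writer_t),  intent(inout) :: self
      character(len=*), intent(in)    :: filename
      integer,          intent(in)    :: capacity   ! buffer size in values

      if (allocated(self%buf)) deallocate(self%buf)
      allocate(self%buf(max(1, capacity)))
      self%n = 0
      open(newunit=self%unit, file=filename, access='stream', &
           form='unformatted', status='replace', action='write')
   end subroutine writer_open

   subroutine writer_put(self, x)
      class(writer_t), intent(inout) :: self
      real(real64),    intent(in)    :: x

      ! Flush first if the buffer is full, then store the new value.
      if (self%n == size(self%buf)) call flush_buf(self)
      self%n = self%n + 1
      self%buf(self%n) = x
   end subroutine writer_put

   subroutine writer_close(self)
      class(writer_t), intent(inout) :: self

      call flush_buf(self)      ! write the remaining partial chunk
      close(self%unit)
      deallocate(self%buf)
   end subroutine writer_close

   subroutine flush_buf(self)
      class(writer_t), intent(inout) :: self

      ! A single WRITE statement for the whole chunk.
      if (self%n > 0) write(self%unit) self%buf(1:self%n)
      self%n = 0
   end subroutine flush_buf

end module chunked_writer
```

A caller would declare a type(writer_t) variable w, call w%open(filename, capacity) once, call w%put(x) for each value, and finish with w%close(). With a large capacity the number of write statements drops from one per value to one per chunk. Whether this helps or hurts compared with the runtime's own buffering depends on the compiler and would have to be measured.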

Anyway, thanks for the work!

Ron_Green · 04-24-2024

@martinmath

We are at code freeze today for the Update 2 compiler, due out in early July. The fix for this issue is in an early build; I tested your example, and it passes without the undocumented option. Now if only we could get this to you sooner than late June or early July.