Compiling the following code with "ifx -traceback -qopenmp -O2 segfault_alloc.f90 -o segfault_alloc" and running segfault_alloc ends with a segmentation fault at the line "deallocate(u%val)".
The ifx version used is "ifx (IFX) 2024.0.0 20231017".
module mod
  implicit none
  public
  type :: string
    character(len=:), allocatable :: c
  end type string
  type :: t
    class(*), pointer :: val => null()
  end type t
end module mod
!------------------
program segfault_alloc
  use mod
  implicit none
  type(t), allocatable :: u
  call test(u)
  deallocate(u%val)
  deallocate(u)
contains
  subroutine test(x)
    type(t), allocatable, intent(out) :: x
    type(string) :: str
    str%c = 'str123'
    allocate(x)
    call alloc_p(str, x%val)
  end subroutine test
  subroutine alloc_p(from, to)
    class(*), intent(in) :: from
    class(*), pointer, intent(out) :: to
    select type(from)
    class is (string)
      allocate(to, source=from)
    end select
  end subroutine alloc_p
end program segfault_alloc
The code has been reduced from a large project and does not make much sense in this reduced form. The segfault is also rather fragile: one small change (like omitting the command-line option "-traceback") and it is gone. However, as long as no essential changes are made, valgrind consistently shows "conditional jump depends on uninitialised value" and "invalid read of size..." errors in "process_allocation_records_deallocate", probably from the runtime libraries, even when the program does not abort with a segfault.
Furthermore, I cannot really pin the error to any specific ingredient. My impression is that changing the code changes the stack layout (valgrind: "Uninitialised value was created by a stack allocation"), and that this determines whether segfaults or valgrind errors are triggered. So hopefully this serves as a suitable reproducer!
PS: Note that this code (and all of the variations I have tried) runs flawlessly with ifort as well as gfortran. With those compilers, valgrind never complained.
I think you may be correct about a problem when stack allocation is used ( -qopenmp and -traceback being suspicious clues ).
I am writing up a bug report on this.
The bug ID is CMPLRLLVM-54166.
Martin,
We have isolated and root-caused this bug. We are working on a fix for the 2024.2 release, which will come out mid-year.
Until then, I can offer you a workaround.
What I am about to share is an undocumented compiler option. I have tried it with 2024.0 and a prerelease version of 2024.1, and it works with both compiler versions.
Undocumented options are to be used only for special cases. Once the fix for this bug is released, you should stop using the following option.
Also, we do not recommend using undocumented options for production code releases. In the end, it's your choice.
That said:
Linux:
-switch disable_templ_xdesc_in_rtn
Windows:
/switch:disable_templ_xdesc_in_rtn
I have tested this with the following results, first without the option and then with it:
$ rm a.out ; ifx -what -V -qopenmp -g -traceback -O0 repro.f90 ; ./a.out
Intel(R) Fortran Compiler for applications running on Intel(R) 64, Version 2024.0.0 Build 20231017
Copyright (C) 1985-2023 Intel Corporation. All rights reserved.
Intel(R) Fortran 24.0-1238.2
GNU ld version 2.39-9.fc38
allocate STAT is 0
forrtl: severe (174): SIGSEGV, segmentation fault occurred
Image PC Routine Line Source
libc.so.6 00007F94A585FB70 Unknown Unknown Unknown
a.out 000000000040AF02 Unknown Unknown Unknown
a.out 000000000040A131 Unknown Unknown Unknown
a.out 0000000000405346 segfault_alloc 28 repro.f90
a.out 000000000040521D Unknown Unknown Unknown
libc.so.6 00007F94A5849B4A Unknown Unknown Unknown
libc.so.6 00007F94A5849C0B __libc_start_main Unknown Unknown
a.out 0000000000405135 Unknown Unknown Unknown
$
$ rm a.out ; ifx -what -V -qopenmp -g -traceback -O0 -switch disable_templ_xdesc_in_rtn repro.f90 ; ./a.out
Intel(R) Fortran Compiler for applications running on Intel(R) 64, Version 2024.0.0 Build 20231017
Copyright (C) 1985-2023 Intel Corporation. All rights reserved.
Intel(R) Fortran 24.0-1238.2
GNU ld version 2.39-9.fc38
allocate STAT is 0
SUCCESS, no segfault
program segfault_alloc
  use mod
  implicit none
  type(t), allocatable :: u
  call test(u)
  !..... this deallocate will cause segfault.....
  deallocate(u%val)
  !..... this deallocate will not cause segfault
  deallocate(u)
  print*, "SUCCESS, no segfault"
contains
  subroutine test(x)
    type(t), allocatable, intent(out) :: x
    type(string) :: str
    str%c = 'str123'
    allocate(x)
    call alloc_p(str, x%val)
  end subroutine test
  subroutine alloc_p(from, to)
    class(*), intent(in) :: from
    class(*), pointer, intent(out) :: to
    integer :: err
    select type(from)
    class is (string)
      allocate(to, source=from, stat=err)
      print*, "allocate STAT is ", err
    end select
  end subroutine alloc_p
end program segfault_alloc
Thanks for the work-around. I somehow missed that reply. I will give it a try.
I tested with this work-around option (using the recently released version) and it works. It is the first time that I was able to run our software with ifx. Performance looks mostly good as well. There are two geometrical tasks which are 15% and 40% slower (Delaunay mesh and kd-tree based neighbourhood graph computation), but these are minor tasks. Runtime of the core numerical code is on par with ifort.
But I was mostly surprised by the huge (8x!) performance improvement in IO. Has there been any particular optimisation that causes this big improvement? Is there any kind of buffering plus an asynchronously performed close involved?
We mostly avoid calling write directly; instead we buffer the data and write only big chunks, because the write statement used to be really slow in both ifort and gfortran. Moreover, we already use the ifort buffering option with a big buffer in the open statement.
In some performance tests I did recently, IO time was mostly lost in the close statement, for both gfortran and ifort. Apparently, the runtime was waiting for the system to actually flush the buffered data.
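For anyone who wants to check this kind of behaviour themselves, here is a minimal sketch of the pattern in standard Fortran: collect data in memory, pass it to the runtime in large chunks, and time the close statement separately. This is not our actual code. The file name, chunk size and chunk count are made up for illustration, and the compiler-specific buffering specifiers of the open statement are left out so that the sketch stays portable.

```fortran
program chunked_write_timing
  use, intrinsic :: iso_fortran_env, only: int64, real64
  implicit none
  ! All names and sizes here are hypothetical, chosen only for illustration.
  integer, parameter :: chunk_len = 100000   ! values collected before each write
  integer, parameter :: n_chunks  = 10       ! number of chunks written
  real(real64), allocatable :: buffer(:)
  integer(int64) :: t0, t1, t2, rate
  integer :: unit, i, k

  allocate(buffer(chunk_len))
  call system_clock(count_rate=rate)

  ! Unformatted stream access: no record markers, data is written as raw bytes.
  open(newunit=unit, file='chunked_demo.bin', access='stream', &
       form='unformatted', status='replace', action='write')

  call system_clock(t0)
  do k = 1, n_chunks
    ! Fill the in-memory buffer first ...
    do i = 1, chunk_len
      buffer(i) = real(i, real64) + real(k, real64)
    end do
    ! ... then hand the whole chunk to the runtime in one write statement.
    write(unit) buffer
  end do
  call system_clock(t1)

  ! Time close on its own, since that is where the waiting was observed.
  close(unit)
  call system_clock(t2)

  print '(a, f8.3, a)', 'fill + write: ', &
        real(t1 - t0, real64) / real(rate, real64), ' s'
  print '(a, f8.3, a)', 'close:        ', &
        real(t2 - t1, real64) / real(rate, real64), ' s'
end program chunked_write_timing
```

The sketch leaves the file chunked_demo.bin (about 8 MB with these sizes) in the working directory. The timings depend strongly on the file system and on how much data the operating system caches, so much larger sizes are probably needed before the close time becomes visible.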
Anyway, thanks for the work!
We are at code freeze today for the Update 2 compiler, due out in early July. The fix for this issue is in an early build; I tested your example, and it passes without the undocumented option. Now if only we could get this to you faster than late June or early July.