Intel® Fortran Compiler

IFX: Segfault in allocation library with qopenmp

martinmath
New Contributor I

Compiling the following code with "ifx -traceback -qopenmp -O2 segfault_alloc.f90 -o segfault_alloc" and executing segfault_alloc ends with a segmentation fault at the line "deallocate(u%val)".

The ifx version used is "ifx (IFX) 2024.0.0 20231017".

 

module mod

implicit none
public

type :: string
   character(len=:), allocatable :: c
end type string

type :: t
   class(*), pointer :: val => null()
end type t

end module mod

!------------------

program segfault_alloc

use mod

implicit none

type(t), allocatable :: u

call test(u)
deallocate(u%val)
deallocate(u)

contains

subroutine test(x)
   type(t), allocatable, intent(out) :: x
   type(string) :: str

   str%c = 'str123'
   allocate(x)
   call alloc_p(str, x%val)
end subroutine test


subroutine alloc_p(from, to)
   class(*),          intent(in)  :: from
   class(*), pointer, intent(out) :: to

   select type(from)
   class is (string)
      allocate(to, source=from)
   end select
end subroutine alloc_p

end program segfault_alloc

 

The code has been reduced from a large project and does not make much sense in this reduced form. The segfault is also rather fragile: one small change (such as omitting the command-line option "-traceback") and it is gone. However, as long as no essential changes are made, valgrind consistently reports "conditional jump depends on uninitialised value" and "invalid read of size..." errors in "process_allocation_records_deallocate", probably from the runtime libraries, even when the program does not abort with a segfault.

Furthermore, I cannot really pin the error on any specific ingredient. My impression is that changing the code changes the stack layout (valgrind: "Uninitialised value was created by a stack allocation"), and that this in turn determines whether segfaults or valgrind errors are triggered. So hopefully this serves as a suitable reproducer!

PS: Note that this code (and all of the variations I have tried) runs flawlessly with ifort as well as gfortran, and valgrind never complained about those builds.

6 Replies
Ron_Green
Moderator

I think you may be correct about a problem when stack allocation is used (-qopenmp and -traceback being suspicious clues).

I am writing up a bug report on this.

Ron_Green
Moderator

The bug ID is CMPLRLLVM-54166.


Ron_Green
Moderator

Martin,

 

We have isolated and root-caused this bug. We are working on a fix for the 2024.2 release, which will come out mid-year.

 

Until then, I can offer you a workaround.

What I am about to share is an undocumented compiler option. I have tried this with 2024.0 and a prerelease version of 2024.1. It works for both compiler versions.

Undocumented options are to be used only for special cases. Once the fix for this bug is released, you should stop using the following option.

Also, we do not recommend using undocumented options for production code releases, although in the end it's your choice.

That said:

Linux:

-switch disable_templ_xdesc_in_rtn

Windows:

/switch:disable_templ_xdesc_in_rtn

 

I have tested this with the following results, first without the option and then with it:

 

$ rm a.out ; ifx -what -V -qopenmp -g -traceback -O0 repro.f90 ; ./a.out
Intel(R) Fortran Compiler for applications running on Intel(R) 64, Version 2024.0.0 Build 20231017
Copyright (C) 1985-2023 Intel Corporation. All rights reserved.

 Intel(R) Fortran 24.0-1238.2
GNU ld version 2.39-9.fc38
 allocate STAT is      0
forrtl: severe (174): SIGSEGV, segmentation fault occurred
Image              PC                Routine            Line        Source
libc.so.6          00007F94A585FB70  Unknown               Unknown  Unknown
a.out              000000000040AF02  Unknown               Unknown  Unknown
a.out              000000000040A131  Unknown               Unknown  Unknown
a.out              0000000000405346  segfault_alloc             28  repro.f90
a.out              000000000040521D  Unknown               Unknown  Unknown
libc.so.6          00007F94A5849B4A  Unknown               Unknown  Unknown
libc.so.6          00007F94A5849C0B  __libc_start_main     Unknown  Unknown
a.out              0000000000405135  Unknown               Unknown  Unknown

$ rm a.out ; ifx -what -V -qopenmp -g -traceback -O0 -switch disable_templ_xdesc_in_rtn repro.f90 ; ./a.out
Intel(R) Fortran Compiler for applications running on Intel(R) 64, Version 2024.0.0 Build 20231017
Copyright (C) 1985-2023 Intel Corporation. All rights reserved.

 Intel(R) Fortran 24.0-1238.2
GNU ld version 2.39-9.fc38
 allocate STAT is      0
 SUCCESS, no segfault

 

 

 

program segfault_alloc

use mod

implicit none

type(t), allocatable :: u

call test(u)
!..... this deallocate will cause segfault.....
deallocate(u%val)
!..... this deallocate will not cause segfault
deallocate(u)
print*, "SUCCESS, no segfault"

contains

subroutine test(x)
   type(t), allocatable, intent(out) :: x
   type(string) :: str

   str%c = 'str123'
   allocate(x)
   call alloc_p(str, x%val)
end subroutine test

subroutine alloc_p(from, to)
   class(*),          intent(in)  :: from
   class(*), pointer, intent(out) :: to
   integer :: err

   select type(from)
   class is (string)
      allocate(to, source=from, stat=err)
      print*, "allocate STAT is ", err
   end select
end subroutine alloc_p

end program segfault_alloc

 

martinmath
New Contributor I

Thanks for the work-around, I somehow missed that reply. I will give it a try.

martinmath
New Contributor I

I tested with this work-around option (using the recently released version) and it works. It is the first time that I was able to run our software with ifx. Performance looks mostly good as well. There are two geometrical tasks that are 15% and 40% slower (Delaunay mesh and kd-tree-based neighbourhood graph computation), but these are minor tasks. The runtime of the core numerical code is on par with ifort.

But I was mostly surprised by the huge (8x!) performance improvement in IO. Has there been any particular optimisation that causes this big improvement? Is there any kind of buffering plus an asynchronously performed close involved?

We mostly avoid calling write and instead use buffering and write only big chunks, because the write statement used to be really slow in both ifort and gfortran. Moreover, we already use the ifort buffering option with a big buffer in the open statement.

In some performance tests I did recently, IO time was mostly lost in the close statement, for both gfortran and ifort. Obviously, the runtime was waiting for the system to actually flush the buffered data.
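
For anyone who wants to reproduce this kind of measurement, here is a minimal sketch of the pattern described above: data is written in a few large chunks and the write loop and the close statement are timed separately. It is an illustration only, not the code from our project. The file name, chunk size and chunk count are made-up values, and it uses only standard Fortran (stream access), so the compiler-specific buffering settings in the open statement are left out.

```fortran
program chunked_write_timing
   use, intrinsic :: iso_fortran_env, only: int64, real64
   implicit none

   ! Hypothetical sizes: 64 chunks of 2**17 doubles (1 MiB each).
   integer, parameter :: chunk   = 2**17
   integer, parameter :: nchunks = 64

   real(real64), allocatable :: buf(:)
   integer        :: unit, i, ios
   integer(int64) :: t0, t1, t2, rate

   allocate(buf(chunk))
   call random_number(buf)

   ! Hypothetical file name; the file is left on disk after the run.
   ! Compiler-specific buffering specifiers are deliberately omitted
   ! so that the sketch stays standard-conforming.
   open(newunit=unit, file='chunked_write.bin', access='stream', &
        form='unformatted', status='replace', action='write', iostat=ios)
   if (ios /= 0) error stop 'open failed'

   call system_clock(t0, rate)

   ! One write statement per large chunk instead of many small writes.
   do i = 1, nchunks
      write(unit) buf
   end do
   call system_clock(t1)

   ! Timed separately: this is where the wait for flushing would show up.
   close(unit)
   call system_clock(t2)

   print '(a,f10.4,a)', 'write loop: ', real(t1 - t0, real64) / real(rate, real64), ' s'
   print '(a,f10.4,a)', 'close:      ', real(t2 - t1, real64) / real(rate, real64), ' s'
end program chunked_write_timing
```

The split between the two timings depends heavily on the file system and the OS page cache, so the numbers are only meaningful when compared on the same machine and with the same file size.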

Anyway, thanks for the work!

 

Ron_Green
Moderator

@martinmath 

We are at code freeze today for the Update 2 compiler, due out in early July. The fix for this issue is in an early build; I tested your example, and it passes without the undocumented option. Now if only we could get this to you faster than late June or early July.
