We came across a segmentation fault in our software, which is written in Fortran 90, and were unable to determine its cause. The error does not show with gfortran, but we can reproduce it with ifort 14.0, 15.0.2 and 16.0.2 on two different machines, one running Ubuntu, the other running Scientific Linux. The segmentation fault occurs only if runtime checks are enabled, the code is compiled with the -openmp (or -qopenmp) flag, and OpenMP directives are present in the code (although the fault occurs at a line that lies outside all parallel regions and was probably auto-parallelised). We also ran the program in Valgrind, which gives a warning in the offending line, shown below.
The error occurs only if we declare a static array of a derived datatype (called typ_t) that has a large number of allocatable members inside it.
type(typ_t), dimension(1) :: a ! causes a segfault
Our first theory was that we were running out of stack space, so we used the -heap-arrays flag (which did not change anything) and increased the stack size (ulimit -s unlimited) and the OpenMP stack size (export OMP_STACKSIZE=100M), but still got the segmentation fault.
Strangely, if the number of allocatable members inside the derived datatype is large enough, the segmentation fault occurs if we use an array of size 1, but disappears if we just have a scalar variable of that type:
type(typ_t) :: a ! this works
This made us think that the heap space might be exhausted; however, making the array allocatable also made the problem disappear:
type(typ_t), dimension(:), allocatable :: a
allocate(a(1)) ! this works
The full code to reproduce this issue and the error and warning messages we receive are shown below.
program foo
  use para_m
  implicit none
  type(typ_t), dimension(1) :: a
  integer :: omp_threads
  !$OMP PARALLEL PRIVATE(omp_threads)
  omp_threads=24
  !$OMP END PARALLEL
  allocate(a(1)%b1(6))
  print*,size(a),size(a(1)%b1)
  a(1)%b1=0.d0
end program foo

module para_m
  type :: typ_t
    real(8), dimension(:), allocatable :: b1
    real(8), dimension(:), allocatable :: b2
    . . .
    real(8), dimension(:), allocatable :: b455
    real(8), dimension(:), allocatable :: b456
  end type typ_t
end module para_m
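Since the listing elides members b3 to b454, it can help to regenerate the module for any member count when probing the threshold. The following is only a sketch: gen_para_m is a hypothetical helper that is not part of our build, and it assumes every elided member is declared exactly like b1 and b2.

```shell
# gen_para_m N: print module para_m whose type typ_t has N allocatable
# members b1..bN (hypothetical helper, assumes all members look like b1).
# Usage: gen_para_m 456 > para_m.f90
gen_para_m() {
  n="$1"
  printf 'module para_m\n'
  printf '  type :: typ_t\n'
  i=1
  while [ "$i" -le "$n" ]; do
    printf '    real(8), dimension(:), allocatable :: b%d\n' "$i"
    i=$((i + 1))
  done
  printf '  end type typ_t\n'
  printf 'end module para_m\n'
}
```

Redirecting the output to para_m.f90 and recompiling with a different N is then enough to test a neighbouring member count.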
a) compile in debug mode:
- compilation output:
ifort -c ../src/L00_base/para_m.f90 -O0 -debug all -check all -traceback -g -r8 -qopenmp -no-wrap-margin -heap-arrays
ifort -c ../src/L10_program/foo.f90 -O0 -debug all -check all -traceback -g -r8 -qopenmp -no-wrap-margin -heap-arrays
ifort para_m.o foo.o -o foo -O0 -debug all -check all -traceback -g -r8 -qopenmp -no-wrap-margin -heap-arrays
- running the code results in the following:
forrtl: severe (174): SIGSEGV, segmentation fault occurred
Image PC Routine Line Source
foo 000000000047C2F5 Unknown Unknown Unknown
foo 0000000000479F17 Unknown Unknown Unknown
foo 000000000042C9B4 Unknown Unknown Unknown
foo 000000000042C7C6 Unknown Unknown Unknown
foo 0000000000405669 Unknown Unknown Unknown
foo 0000000000408DC0 Unknown Unknown Unknown
libpthread.so.0 00007F75AA2C17E0 Unknown Unknown Unknown
foo 0000000000403C68 MAIN__ 13 foo.f90
foo 00000000004035CE Unknown Unknown Unknown
libc.so.6 00007F75A9D38D5D Unknown Unknown Unknown
foo 0000000000403469 Unknown Unknown Unknown
b) compile with optimisation etc:
- compilation output:
ifort -c ../src/L00_base/para_m.f90 -O3 -r8 -qopenmp -g -no-wrap-margin -par-threshold10 -heap-arrays
ifort -c ../src/L10_program/foo.f90 -O3 -r8 -qopenmp -g -no-wrap-margin -par-threshold10 -heap-arrays
ifort para_m.o foo.o -o foo -O3 -r8 -qopenmp -g -no-wrap-margin -par-threshold10 -heap-arrays
- running the code results in no segmentation fault
The segmentation fault is gone when any one of the following changes is made:
1. when OMP pragmas are removed
2. when the number of allocatables inside typ_t is <= 456 (it segfaults for 457)
3. when 'a' is not an array, i.e. type(typ_t) :: a
4. when the declaration of 'a' is allocatable
Thanks for your report. I was able to reproduce the behavior for 456 or more allocatable arrays in the derived type.
The problem seems to be triggered by the stack frame checking. If you add -check nostack to the end of your command line, it should work. (For simplicity, I had the module and the main program in a single source file, and compiled and linked from a single command line.) Note that -check stack or -check all overrides the optimization level and sets it to -O0.
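As a rough sketch of that workaround, the debug flags from the report can be kept as they are with -check nostack appended last. The file name foo_single.f90 is a placeholder for a single source file containing module para_m followed by program foo; it is an assumption, not a name from the report.

```shell
# Debug flags copied from the report.
FFLAGS="-O0 -debug all -check all -traceback -g -r8 -qopenmp -no-wrap-margin -heap-arrays"
# Appended last so that it switches off only the stack-frame check.
FFLAGS="$FFLAGS -check nostack"

# Placeholder name for a single file holding the module and the program.
SRC="${SRC:-foo_single.f90}"

if command -v ifort >/dev/null 2>&1 && [ -f "$SRC" ]; then
  # FFLAGS is left unquoted on purpose so it splits into separate options.
  ifort $FFLAGS "$SRC" -o foo && ./foo
else
  echo "would run: ifort $FFLAGS $SRC -o foo"
fi
```

All other runtime checks requested by -check all stay enabled; only the stack check is turned off.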
It looks like there's some interaction between setting up the stacks for OpenMP, the derived type instance going on the stack and the stack checking with -check stack. We'll escalate this to the compiler developers and see what they conclude.