Intel® Fortran Compiler
Build applications that can scale for the future with optimized code designed for Intel® Xeon® and compatible processors.
29106 Discussions

DO CONCURRENT segmentation fault with ifx (IFX) 2023.2.0 20230721

jvo203
New Contributor I
966 Views

This problem is related to a recently fixed issue "ifx OpenMP compilation segmentation fault".

The above fix corrected the compilation segmentation fault. So now that the code compiles, there is a new segmentation fault at runtime that only occurs with the OpenMP compiler flag switched on. With the OpenMP flag on, the following executable either hangs indefinitely or causes a segmentation fault.

chris@capricorn:~/projects/FITSWEBQLSE/tests> gdb-oneapi ./intel_fixed_array
GNU gdb (Intel(R) Distribution for GDB* 2023.2.0) 13.1
(...)
Reading symbols from ./intel_fixed_array...
(gdb) run
Starting program: /mnt/data/chris/projects/FITSWEBQLSE/tests/intel_fixed_array
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
 INTEL FORTRAN COMPILER TEST
  3.9208680E-07  2.5480442E-02  0.3525161      0.6669145      0.9630555
  0.6550830      0.8503483      0.9033895      0.8463544      0.2066242
  0.9755553      0.5635241      0.9653679      0.6683893      0.5513104
  0.2743225      0.5902489      0.1112262      0.2085747      0.8165010
  6.4814657E-02  0.4445370      0.6465091      0.5573670      0.6079766
  0.5310045      0.2221855      0.6462902      0.6925085      0.8624426
  0.5792798      0.3475854      0.7642347      0.9242435      0.5887938
  0.8071396      0.3204502      0.2661751      0.3978070      2.5348928E-02
  0.9314515      0.4619053      3.0069806E-02  0.5978217      0.4327033
  0.8484681      0.4614979      0.2715716      0.3194994      0.6445942
  0.3667653      0.8651252      0.7601541      6.7160085E-02  0.9656122
  0.8827326      0.3787658      7.8151032E-02  0.2350498      0.7156371
  0.5803360      0.6305379      0.9882953      0.2708526      0.3726243
  0.2573928      0.9639326      0.2548033      2.2417052E-02  0.2447754
  0.9612966      0.9319553      7.5711273E-03  0.7335588      0.6810908
  0.5873324      0.7246961      2.0496412E-03  0.5703991      5.0798532E-02
  0.2780795      0.7370093      0.5495721      0.1198158      0.9510633
  0.5026427      0.8320574      0.6777883      1.3498955E-02  0.8210381
  0.4603336      0.2004648      7.5419076E-02  0.2348041      0.8084300
  0.8045866      4.2414516E-02  9.4963923E-02  0.8356062      0.3440044
  0.6607602      0.3184225      2.6833560E-02  0.4169126      0.9349306
  0.7938036      0.9250711      0.7045951      0.3796302      0.9276345
  4.0213659E-02  0.1359038      0.5310917      2.7534172E-02  4.9371131E-02
  0.1852648      0.9689460      0.4393423      0.8226858      0.2452050
  0.3377560      6.1524317E-02  0.8468842      2.9195992E-02  0.9605215
OMP: Info #155: KMP_AFFINITY: Initial OS proc set respected: 0-3
OMP: Info #217: KMP_AFFINITY: decoding x2APIC ids.
OMP: Info #157: KMP_AFFINITY: 4 available OS procs
OMP: Info #158: KMP_AFFINITY: Uniform topology
OMP: Info #288: KMP_AFFINITY: topology layer "LL cache" is equivalent to "socket".
OMP: Info #288: KMP_AFFINITY: topology layer "L3 cache" is equivalent to "socket".
OMP: Info #288: KMP_AFFINITY: topology layer "L2 cache" is equivalent to "core".
OMP: Info #288: KMP_AFFINITY: topology layer "L1 cache" is equivalent to "core".
OMP: Info #192: KMP_AFFINITY: 1 socket x 4 cores/socket x 1 thread/core (4 total cores)
OMP: Info #219: KMP_AFFINITY: OS proc to physical thread map:
OMP: Info #172: KMP_AFFINITY: OS proc 0 maps to socket 0 core 0 thread 0
OMP: Info #172: KMP_AFFINITY: OS proc 1 maps to socket 0 core 1 thread 0
OMP: Info #172: KMP_AFFINITY: OS proc 2 maps to socket 0 core 2 thread 0
OMP: Info #172: KMP_AFFINITY: OS proc 3 maps to socket 0 core 3 thread 0
OMP: Info #255: KMP_AFFINITY: pid 9178 tid 9178 thread 0 bound to OS proc set 0
[New Thread 0x7fff9ebfa740 (LWP 9181)]
OMP: Info #255: KMP_AFFINITY: pid 9178 tid 9181 thread 1 bound to OS proc set 1
[New Thread 0x7fff9e3f87c0 (LWP 9182)]
OMP: Info #255: KMP_AFFINITY: pid 9178 tid 9182 thread 2 bound to OS proc set 2
[New Thread 0x7fff9dbf6840 (LWP 9183)]
OMP: Info #255: KMP_AFFINITY: pid 9178 tid 9183 thread 3 bound to OS proc set 3
 
Thread 4 "intel_fixed_arr" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fff9dbf6840 (LWP 9183)]
main::to_fixed_block (x=..., compressed=..., datamin=<optimized out>, datamax=<optimized out>) at intel_fixed_array.f90:181
181      compressed%mantissa = quantize(x, e, max_exp, significant_bits)
(gdb) bt
#0  main::to_fixed_block (x=..., compressed=..., datamin=<optimized out>, datamax=<optimized out>) at intel_fixed_array.f90:181
#1  0x0000000000407d24 in main_IP_to_fixed_.DIR.OMP.DISTRIBUTE.PARLOOP.8.split376.split () at intel_fixed_array.f90:110
#2  0x00007ffff7563493 in __kmp_invoke_microtask () from /opt/intel/oneapi/compiler/2023.2.1/linux/compiler/lib/intel64_lin/libiomp5.so
#3  0x00007ffff74d44a7 in _INTERNAL49d8b4ea::__kmp_fork_in_teams (loc=<optimized out>, gtid=<optimized out>, parent_team=<optimized out>, argc=<optimized out>, master_th=<optimized out>,
    root=<optimized out>, call_context=<optimized out>, microtask=<optimized out>, invoker=<optimized out>, master_set_numthreads=<optimized out>, level=<optimized out>, ompt_parallel_data=...,
    return_address=<optimized out>, ap=<optimized out>) at ../../src/kmp_runtime.cpp:1679
#4  __kmp_fork_call (loc=0x80000083, gtid=-140509436, call_context=(fork_context_intel | fork_context_last | unknown: 0xfffc), argc=-140509404, microtask=0x1, invoker=0x0, ap=0x7fff9dbf4be0)
    at ../../src/kmp_runtime.cpp:2216
#5  0x00007ffff7489d23 in __kmpc_fork_call (loc=0x80000083, argc=-140509436, microtask=0xffff) at ../../src/kmp_csupport.cpp:350
#6  0x00000000004080ca in main_IP_to_fixed_.DIR.OMP.TEAMS.2.split390 () at intel_fixed_array.f90:94
#7  0x00007ffff7563493 in __kmp_invoke_microtask () from /opt/intel/oneapi/compiler/2023.2.1/linux/compiler/lib/intel64_lin/libiomp5.so
#8  0x00007ffff74d2e9c in _INTERNAL49d8b4ea::__kmp_serial_fork_call (loc=<optimized out>, gtid=<optimized out>, call_context=<optimized out>, argc=<optimized out>, microtask=<optimized out>,
    invoker=<optimized out>, master_th=<optimized out>, parent_team=<optimized out>, ompt_parallel_data=<optimized out>, return_address=<optimized out>, parent_task_data=<optimized out>,
    ap=<optimized out>) at ../../src/kmp_runtime.cpp:1903
#9  __kmp_fork_call (loc=0x80000083, gtid=-140509436, call_context=(fork_context_intel | fork_context_last | unknown: 0xfffc), argc=-140509404, microtask=0x1, invoker=0x0, ap=0x0)
    at ../../src/kmp_runtime.cpp:2329
#10 0x00007ffff74d1827 in __kmp_teams_master (gtid=-2147483517) at ../../src/kmp_runtime.cpp:8337
#11 0x00007ffff74d1723 in __kmp_invoke_teams_master (gtid=-2147483517) at ../../src/kmp_runtime.cpp:8378
#12 0x00007ffff74d2d59 in _INTERNAL49d8b4ea::__kmp_serial_fork_call (loc=<optimized out>, gtid=<optimized out>, call_context=<optimized out>, argc=<optimized out>, microtask=<optimized out>,
    invoker=<optimized out>, master_th=<optimized out>, parent_team=<optimized out>, ompt_parallel_data=<optimized out>, return_address=<optimized out>, parent_task_data=<optimized out>,
    ap=<optimized out>) at ../../src/kmp_runtime.cpp:1945
#13 __kmp_fork_call (loc=0x80000083, gtid=-140509436, call_context=(fork_context_intel | fork_context_last | unknown: 0xfffc), argc=-140509404, microtask=0x1, invoker=0x0, ap=0x7fff9dbf5510)
    at ../../src/kmp_runtime.cpp:2329
#14 0x00007ffff748a033 in __kmpc_fork_teams (loc=0x80000083, argc=-140509436, microtask=0xffff) at ../../src/kmp_csupport.cpp:499
#15 0x00000000004088c5 in main::to_fixed (compressed=<error reading variable: value requires 211410 bytes, which is more than max-value-size>,
    x=<error reading variable: value requires 705600 bytes, which is more than max-value-size>, datamin=0, datamax=1) at intel_fixed_array.f90:94
#16 0x0000000000407b3c in MAIN__.DIR.OMP.PARALLEL.LOOP.2591.split596 () at intel_fixed_array.f90:54
#17 0x00007ffff7563493 in __kmp_invoke_microtask () from /opt/intel/oneapi/compiler/2023.2.1/linux/compiler/lib/intel64_lin/libiomp5.so
#18 0x00007ffff74d1533 in __kmp_invoke_task_func (gtid=-2147483517) at ../../src/kmp_runtime.cpp:8273
#19 0x00007ffff74d0470 in __kmp_launch_thread (this_thr=0x80000083) at ../../src/kmp_runtime.cpp:6648
#20 0x00007ffff75641ff in _INTERNAL1ebb3278::__kmp_launch_worker (thr=0x80000083) at ../../src/z_Linux_util.cpp:559
#21 0x00007ffff7290c64 in start_thread () from /lib64/libc.so.6
#22 0x00007ffff7318550 in clone3 () from /lib64/libc.so.6
(gdb) q
A debugging session is active.
 
Inferior 1 [process 9178] will be killed.
 
Quit anyway? (y or n) y

 

chris@capricorn:~/projects/FITSWEBQLSE/tests> ifx --version
ifx (IFX) 2023.2.0 20230721
Copyright (C) 1985-2023 Intel Corporation. All rights reserved.

The problem does not occur with the classic Intel Fortran compiler ifort. ifort-compiled code executes flawlessly. There are no problems with gfortran either.

Attached please find a short Fortran code that reproduces the issue (file intel_fixed_array.f90). As the Makefile cannot be uploaded to your system, here is its content:

=======================

.DEFAULT_GOAL := intel

# GCC FORTRAN
FORT = gfortran
FLAGS = -march=native -g -Ofast -fPIC -fno-finite-math-only -funroll-loops -ftree-vectorize -fopenmp

# Intel FORTRAN
IFORT = ifx # ifort or ifx
IFLAGS = -g -Ofast -xHost -axAVX -qopt-report=2 -qopenmp

gcc:
$(FORT) $(FLAGS) -o intel_fixed_array intel_fixed_array.f90

intel:
$(IFORT) $(IFLAGS) -o intel_fixed_array intel_fixed_array.f90

clean:
rm -f intel_fixed_array
=======================
 
FYI: the Fortran code in intel_fixed_array.f90 appears to be clean and free of obvious memory bugs (at least according to valgrind). The 64-bit openSUSE Tumbleweed Linux is on the very latest oneAPI 2023.2.1.
Labels (3)
0 Kudos
3 Replies
jvo203
New Contributor I
954 Views

P.S. Setting the OMP_STACKSIZE even to 10GB does not help at all, the segmentation fault persists, i.e.:

 

Thread 3 "intel_fixed_arr" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fff5ebf87c0 (LWP 20051)]
0x00007ffff758074d in rml::internal::BackRefMain::findFreeBlock() () from /opt/intel/oneapi/compiler/2023.2.1/linux/compiler/lib/intel64_lin/libiomp5.so
(gdb) bt
#0 0x00007ffff758074d in rml::internal::BackRefMain::findFreeBlock() () from /opt/intel/oneapi/compiler/2023.2.1/linux/compiler/lib/intel64_lin/libiomp5.so
#1 0x00007ffff758059c in rml::internal::BackRefIdx::newBackRef(bool) () from /opt/intel/oneapi/compiler/2023.2.1/linux/compiler/lib/intel64_lin/libiomp5.so
#2 0x00007ffff7574290 in rml::internal::ExtMemoryPool::mallocLargeObject(rml::internal::MemoryPool*, unsigned long) ()
from /opt/intel/oneapi/compiler/2023.2.1/linux/compiler/lib/intel64_lin/libiomp5.so
#3 0x00007ffff756edb3 in rml::internal::MemoryPool::getFromLLOCache(rml::internal::TLSData*, unsigned long, unsigned long) ()
from /opt/intel/oneapi/compiler/2023.2.1/linux/compiler/lib/intel64_lin/libiomp5.so
#4 0x00007ffff756f727 in scalable_aligned_malloc () from /opt/intel/oneapi/compiler/2023.2.1/linux/compiler/lib/intel64_lin/libiomp5.so
#5 0x000000000041a050 in for_allocate_handle ()
#6 0x00000000004087fb in main::to_fixed (compressed=<error reading variable: value requires 211410 bytes, which is more than max-value-size>,
x=<error reading variable: value requires 705600 bytes, which is more than max-value-size>, datamin=0, datamax=1) at intel_fixed_array.f90:91
#7 0x0000000000407b3c in MAIN__.DIR.OMP.PARALLEL.LOOP.2591.split596 () at intel_fixed_array.f90:54
#8 0x00007ffff7563493 in __kmp_invoke_microtask () from /opt/intel/oneapi/compiler/2023.2.1/linux/compiler/lib/intel64_lin/libiomp5.so
#9 0x00007ffff74d1533 in __kmp_invoke_task_func (gtid=-140836864) at ../../src/kmp_runtime.cpp:8273
#10 0x00007ffff74d0470 in __kmp_launch_thread (this_thr=0x7ffff79b0000) at ../../src/kmp_runtime.cpp:6648
#11 0x00007ffff75641ff in _INTERNAL1ebb3278::__kmp_launch_worker (thr=0x7ffff79b0000) at ../../src/z_Linux_util.cpp:559
#12 0x00007ffff7290c64 in start_thread () from /lib64/libc.so.6
#13 0x00007ffff7318550 in clone3 () from /lib64/libc.so.6

0 Kudos
jimdempseyatthecove
Honored Contributor III
929 Views
jvo203
New Contributor I
913 Views

Originally I tried (even twice) to create a new separate post but the automated Intel forum system kept marking new posts as spam. So in desperation I appended this problem to my previous post, also related to "DO CONCURRENT" (a compiler seg. fault during compilation).

0 Kudos
Reply