Intel® Fortran Compiler

Segmentation Fault in OpenMP DAG Scheduling using ifort/ifx

grisuthedragon

I am developing numerical linear algebra algorithms on top of OpenMP with DAG scheduling. The algorithms make extensive use of task dependencies.
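To illustrate the kind of pattern involved, here is a minimal sketch of DAG scheduling with OpenMP task dependencies, written in C for brevity. It is not the actual `dla_tgsylv_dag` routine: the names `wavefront_sweep`, `repeat_sweep`, and `NB` are hypothetical, and the "work" per block is a single addition. Each entry depends on its upper and left neighbour, and the sweep is called repeatedly, as a benchmark would do.

```c
#include <assert.h>

#define NB 4  /* hypothetical number of block rows/columns */

/* One wavefront sweep over an NB x NB grid.
 * Row 0 and column 0 are padding that is initialized up front and
 * never written by a task, so the border tasks have no predecessor
 * on those entries. Returns the bottom-right entry. */
long wavefront_sweep(void)
{
    long g[NB + 1][NB + 1];

    for (int k = 0; k <= NB; k++) {
        g[k][0] = 1;
        g[0][k] = 1;
    }

    #pragma omp parallel
    #pragma omp single
    {
        /* One thread creates all tasks; the runtime builds the DAG
         * from the depend clauses and schedules the tasks. */
        for (int i = 1; i <= NB; i++) {
            for (int j = 1; j <= NB; j++) {
                #pragma omp task firstprivate(i, j) \
                    depend(in: g[i-1][j], g[i][j-1]) depend(out: g[i][j])
                g[i][j] = g[i-1][j] + g[i][j-1];
            }
        }
    } /* all tasks have completed after the implicit barrier */

    return g[NB][NB];
}

/* Call the sweep `runs` times and count how often the result
 * matches `expected`. */
int repeat_sweep(int runs, long expected)
{
    assert(runs > 0);
    int ok = 0;
    for (int r = 0; r < runs; r++)
        ok += (wavefront_sweep() == expected);
    return ok;
}
```

With `NB` set to 4, the bottom-right entry is the binomial coefficient C(8,4) = 70, so `repeat_sweep(25, 70)` should return 25 when every run is correct. The sketch can be built with OpenMP enabled (for example `-qopenmp` with the Intel compilers or `-fopenmp` with GCC); without that flag the pragmas are ignored and the code runs sequentially with the same result.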

When the code is executed many times inside a benchmark, it runs into a segmentation fault. The fault occurs reproducibly at the same position in the code and in the same routine inside the OpenMP runtime library. In the 20th call to the routine, I get the following every time:

Thread 1 "benchmark_tgsyl" received signal SIGSEGV, Segmentation fault.
0x00007fffeeed43cb in std::__atomic_base<int>::fetch_add (this=<optimized out>, __i=<optimized out>, __m=<optimized out>) at /usr/include/c++/4.8.2/bits/atomic_base.h:614
614     /usr/include/c++/4.8.2/bits/atomic_base.h: No such file or directory.
(gdb) bt
#0  0x00007fffeeed43cb in std::__atomic_base<int>::fetch_add (this=<optimized out>, __i=<optimized out>, __m=<optimized out>) at /usr/include/c++/4.8.2/bits/atomic_base.h:614
#1  _INTERNAL99002fb4::__kmp_node_ref (node=<optimized out>) at ../../src/kmp_taskdeps.cpp:80
#2  _INTERNAL99002fb4::__kmp_dephash_find (thread=<optimized out>, hash=<optimized out>, addr=<optimized out>) at ../../src/kmp_taskdeps.cpp:210
#3  _INTERNAL99002fb4::__kmp_process_deps<true> (gtid=1, node=0x7fffe865f200, hash=0x197, dep_barrier=128, ndeps=774093789, dep_list=0x84a380, task=0x827b40) at ../../src/kmp_taskdeps.cpp:401
warning: Could not recognize version of Intel Compiler in: "Intel(R) Fortran 23.0-1198"
#4  0x00007fffeeed5d1f in __kmpc_omp_task_with_deps (loc_ref=0x1, gtid=-395972096, new_task=0x197, ndeps=8692608, dep_list=0x402f5a332e23bbdd, ndeps_noalias=8692608, noalias_dep_list=0x0)
    at ../../src/kmp_taskdeps.cpp:566
#5  0x00007fffed332a92 in dla_tgsylv_dag_.DIR.OMP.PARALLEL.42.split () at /home/grisu/work/software/mepack/src/double/dag/dla_tgsylv_dag.f90:436
#6  0x00007fffeef38053 in __kmp_invoke_microtask () from /scratch/grisu/intel/oneapi/compiler/2023.0.0/linux/compiler/lib/intel64_lin/libiomp5.so
#7  0x00007fffeeea62f3 in __kmp_invoke_task_func (gtid=1) at ../../src/kmp_runtime.cpp:7845
#8  0x00007fffeeea7578 in __kmp_fork_call (loc=0x1, gtid=-395972096, call_context=(fork_context_intel | fork_context_last | unknown: 404), argc=8692608, microtask=0x402f5a332e23bbdd, invoker=0x84a380, 
    ap=0x7fffffff7030) at ../../src/kmp_runtime.cpp:2508
#9  0x00007fffeee60223 in __kmpc_fork_call (loc=0x1, argc=-395972096, microtask=0x197) at ../../src/kmp_csupport.cpp:350
#10 0x00007fffed2fa2c5 in dla_tgsylv_dag (transa=..., transb=..., sgn=1, m=255, n=96, a=..., lda=527, b=..., ldb=607, c=..., ldc=527, d=..., ldd=607, 
    x=<error reading variable: value requires 1253031424 bytes, which is more than max-value-size>, ldx=527, scale=1, work=..., info=0, _transa=1, _transb=1)
    at /home/grisu/work/software/mepack/src/double/dag/dla_tgsylv_dag.f90:371
#11 0x00007fffed2c4fe8 in dla_tgsylv_dag_.t1161p.t1162p.t1163p.t1164p.t1165p.t1166p.t1167p.t1168p.t1169p.t1170p.t1171p.t1172p.t1173p.t1174p.t1175p.t1176p.t1177p.t1178p.t3v.t3v ()
   from /home/grisu/work/software/mepack/build-intel-debug/src/libmepack.so.1
#12 0x00007fffed2b849b in dla_tgsylv_l3_2s (transa=..., transb=..., sgn=1, m=527, n=607, a=..., lda=527, b=..., ldb=607, c=..., ldc=527, d=..., ldd=607, x=..., ldx=527, scale=1, work=..., info=0, 
    _transa=140733193388033, _transb=1) at /home/grisu/work/software/mepack/src/double/level3/dla_tgsylv_l3_2stage.f90:410
#13 0x00007fffee8b4436 in mepack_double_tgsylv_level3_2stage (TRANSA=0x7fffffffb2f0 "N", TRANSB=0x7fffffffb2d0 "N", SGN=1, M=527, N=607, A=0x7fffe95f4010, LDA=527, B=0x7fffe9105010, LDB=607, 
    C=0x7fffe93d5010, LDC=527, D=0x7fffe8e35010, LDD=607, X=0x7fffe8bc4010, LDX=527, SCALE=0x7fffffffb2c8, WORK=0x7fffe7d5a010, INFO=0x7fffffffb2c4)
    at /home/grisu/work/software/mepack/src/c/level3/tgsylv.c:794
#14 0x0000000000409cef in main (argc=12, argv=0x7fffffffb508) at /home/grisu/work/software/mepack/examples/triangular/benchmark_tgsylv.c:552

Using valgrind, I found that this seems to be caused by a missing initialization:

==250404== Conditional jump or move depends on uninitialised value(s)                         
==250404==    at 0xD6EF3B3: __kmp_dephash_find (kmp_taskdeps.cpp:207)                                                                                                                                      
==250404==    by 0xD6EF3B3: int _INTERNAL99002fb4::__kmp_process_deps<true>(int, kmp_depnode*, kmp_dephash**, bool, int, kmp_depend_info*, kmp_task*) (kmp_taskdeps.cpp:401)
==250404==    by 0xD6F0D1E: __kmp_check_deps (kmp_taskdeps.cpp:566)         
==250404==    by 0xD6F0D1E: __kmpc_omp_task_with_deps (kmp_taskdeps.cpp:709)
==250404==    by 0xEF734B6: dla_tgsylv_dag_.DIR.OMP.PARALLEL.42.split (dla_tgsylv_dag.f90:422)
==250404==    by 0xD753052: __kmp_invoke_microtask (in /scratch/koehlerm/intel/oneapi/compiler/2023.0.0/linux/compiler/lib/intel64_lin/libiomp5.so)
==250404==    by 0xD6C12F2: __kmp_invoke_task_func (kmp_runtime.cpp:7845)
==250404==    by 0xD6C2577: __kmp_fork_call (kmp_runtime.cpp:2508)
==250404==    by 0xD67B222: __kmpc_fork_call (kmp_csupport.cpp:350)
==250404==    by 0xEF461FC: dla_tgsylv_dag_ (dla_tgsylv_dag.f90:371)
==250404==    by 0xEF10FE7: dla_tgsylv_dag_.t1161p.t1162p.t1163p.t1164p.t1165p.t1166p.t1167p.t1168p.t1169p.t1170p.t1171p.t1172p.t1173p.t1174p.t1175p.t1176p.t1177p.t1178p.t3v.t3v (dla_tgsylv_l3_2stage.f90:0)
==250404==    by 0xEF0449A: dla_tgsylv_l3_2s_ (dla_tgsylv_l3_2stage.f90:410)
==250404==    by 0x104F5DC5: mepack_double_tgsylv_level3_2stage (src/c/level3/tgsylv.c:794)
==250404==    by 0x409D0C: main (examples/triangular/benchmark_tgsylv.c:554)
==250404==  Uninitialised value was created by a heap allocation
==250404==    at 0x483B7F3: malloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==250404==    by 0xD64B1DD: _INTERNAL246cd64f::bget(kmp_info*, long) (kmp_alloc.cpp:654)
==250404==    by 0xD64AED6: ___kmp_fast_allocate (kmp_alloc.cpp:2256)
==250404==    by 0xD7002E0: __kmp_task_alloc_impl (kmp_tasking.cpp:1510)
==250404==    by 0xD7002E0: __kmpc_omp_task_alloc (kmp_tasking.cpp:1650)
==250404==    by 0xEF749BE: dla_tgsylv_dag_.DIR.OMP.PARALLEL.42.split (dla_tgsylv_dag.f90:443)
==250404==    by 0xD753052: __kmp_invoke_microtask (in /scratch/koehlerm/intel/oneapi/compiler/2023.0.0/linux/compiler/lib/intel64_lin/libiomp5.so)
==250404==    by 0xD6C12F2: __kmp_invoke_task_func (kmp_runtime.cpp:7845)
==250404==    by 0xD6C2577: __kmp_fork_call (kmp_runtime.cpp:2508)
==250404==    by 0xD67B222: __kmpc_fork_call (kmp_csupport.cpp:350)
==250404==    by 0xEF461FC: dla_tgsylv_dag_ (dla_tgsylv_dag.f90:371)
==250404==    by 0xEF10FE7: dla_tgsylv_dag_.t1161p.t1162p.t1163p.t1164p.t1165p.t1166p.t1167p.t1168p.t1169p.t1170p.t1171p.t1172p.t1173p.t1174p.t1175p.t1176p.t1177p.t1178p.t3v.t3v (dla_tgsylv_l3_2stage.f90:0)
==250404==    by 0xEF0449A: dla_tgsylv_l3_2s_ (dla_tgsylv_l3_2stage.f90:410)

The error occurs with both ifort (2021.8.0) and ifx (2023.0.0). With gfortran and libgomp, everything works fine. The OpenMP info gives:

Intel(R) OMP Copyright (C) 1997-2022, Intel Corporation. All Rights Reserved.
Intel(R) OMP version: 5.0.20221004
Intel(R) OMP library type: performance
Intel(R) OMP link type: dynamic
Intel(R) OMP build time: 2022-10-05 18:33:30 UTC
Intel(R) OMP build compiler: Intel(R) C++ Compiler 19.1
Intel(R) OMP alternative compiler support: yes
Intel(R) OMP API version: 5.0 (201611)
Intel(R) OMP dynamic error checking: no
Intel(R) OMP thread affinity support: not used

The code runs under Ubuntu 20.04. According to Inspector, our code does not cause any memory leak.
