unrue
Beginner
83 Views

OpenMP segfault on start

Dear Intel users and developers,

I'm using Intel 15.0.1 on mixed Fortran + C code; the main program is on the Fortran side. The serial code works well, but when I enable OpenMP, even with just one thread, the code crashes immediately with a segmentation fault and without debug information:

warning: the debug information found in "/developers/devenv/prod/opt/compilers/intel/cs-xe-2015/none/impi/5.0.2.044/intel64/lib/libmpi.dbg" does not match "/developers/devenv/prod/opt/compilers/intel/cs-xe-2015/none/impi/5.0.2.044/intel64/lib/libmpi_dbg.so.4" (CRC mismatch).

Program received signal SIGSEGV, Segmentation fault.
0x000000000049ddf4 in init_resource ()
Missing separate debuginfos, use: debuginfo-install glib2-2.36.3-5.el7.x86_64 glibc-2.17-55.el7_0.5.x86_64 libgcc-4.8.2-16.el7.x86_64 libstdc++-4.8.2-16.el7.x86_64
(gdb) where
#0  0x000000000049ddf4 in init_resource ()
#1  0x000000000049dd5a in reentrancy_init ()
#2  0x000000000049dc78 in for__reentrancy_init ()
#3  0x00002aaaab8e1055 in for_rtl_init_ () from /gpfs/scratch1/ddp_crs/ddp_install/lib/libddp.so
#4  0x0000000000405b39 in main ()

This is the only information I have, under gdb. There is a strange MPI warning (the code doesn't use MPI); I don't know whether it is related to the cause.

I compile both the Fortran and the C code with: -O0 -g -traceback -fcheck=all

Could you help me find the problem? I also tried Valgrind and TotalView, but no useful information was retrieved.

Thanks,

 

12 Replies
TimP
Black Belt

Did you follow any of the advice at

https://software.intel.com/en-us/articles/determining-root-cause-of-sigsegv-or-sigbus-errors

?

Are you certain that your executable (including any libraries you link) doesn't link mpi debug libraries or use coarrays (e.g. what does ldd show)?

If you do call MPI inside OpenMP parallel regions, there are specific requirements in MPI (including, for Intel MPI, explicitly linking libmpi_mt).
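If it helps, the same check that ldd performs can also be made from inside the running process. This is only a minimal sketch, assuming Linux with glibc (where `dl_iterate_phdr` from `<link.h>` is available); the helper names `loaded_objects` and `is_loaded` are hypothetical and not part of any Intel library:

```cpp
#include <cstddef>
#include <link.h>   // dl_iterate_phdr (Linux/glibc)
#include <string>
#include <vector>

// Callback invoked once per object mapped into the process;
// appends that object's path to the vector passed through `data`.
static int collect_name(struct dl_phdr_info* info, std::size_t, void* data) {
    auto* names = static_cast<std::vector<std::string>*>(data);
    names->push_back(info->dlpi_name ? info->dlpi_name : "");
    return 0;  // zero means "keep iterating"
}

// Paths of all objects currently loaded (the main program usually
// appears with an empty name).
std::vector<std::string> loaded_objects() {
    std::vector<std::string> names;
    dl_iterate_phdr(collect_name, &names);
    return names;
}

// True if any loaded object's path contains the given fragment,
// for example "libmpi" or "libcilkrts".
bool is_loaded(const std::string& fragment) {
    for (const auto& name : loaded_objects())
        if (name.find(fragment) != std::string::npos) return true;
    return false;
}
```

Calling `is_loaded("libmpi")` early in the program would confirm whether an MPI library really ends up in the process, independent of what the link line was supposed to contain.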

unrue
Beginner

Hi Tim,

I read that article and tried all the suggestions, without any result. This is the ldd output:

linux-vdso.so.1 =>  (0x00007fffc83fe000)
	libparamsf.so => /gpfs/scratch1/ddp_crs/libing_install/lib/libparamsf.so (0x00002ae0a6842000)
	libf90utils.so => /gpfs/scratch1/ddp_crs/libing_install/lib/libf90utils.so (0x00002ae0a6a4c000)
	liblogging.so => /gpfs/scratch1/ddp_crs/libing_install/lib/liblogging.so (0x00002ae0a6ce3000)
	libglib-2.0.so.0 => /lib64/libglib-2.0.so.0 (0x00002ae0a6eeb000)
	libquasirandom_fmp.so => /gpfs/scratch1/ddp_crs/libing_install/lib/libquasirandom_fmp.so (0x00002ae0a7214000)
	libddp.so => /gpfs/scratch1/ddp_crs/ddp_install/lib/libddp.so (0x00002ae0a741c000)
	libdvaparams.so => /gpfs/scratch1/ddp_crs/libing_install/lib/libdvaparams.so (0x00002ae0a76eb000)
	libsegyfile.so => /gpfs/scratch1/ddp_crs/libing_install/lib/libsegyfile.so (0x00002ae0a78f9000)
	libparams.so => /gpfs/scratch1/ddp_crs/libing_install/lib/libparams.so (0x00002ae0a7b05000)
	libpython3.3m.so.1.0 => /developers/devenv/prod/opt/tools/python3/3.3.0/gnu--4.7.2/lib/libpython3.3m.so.1.0 (0x00002ae0a7d0e000)
	libsemblanceOMP.so => /gpfs/scratch1/ddp_crs/crskernel_install/lib/libsemblanceOMP.so (0x00002ae0a819e000)
	libm.so.6 => /lib64/libm.so.6 (0x00002ae0a83a8000)
	libiomp5.so => /developers/devenv/prod/opt/compilers/intel/cs-xe-2015/none/lib/intel64/libiomp5.so (0x00002ae0a86aa000)
	libpthread.so.0 => /lib64/libpthread.so.0 (0x00002ae0a89df000)
	libdl.so.2 => /lib64/libdl.so.2 (0x00002ae0a8bfb000)
	libc.so.6 => /lib64/libc.so.6 (0x00002ae0a8dff000)
	libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00002ae0a91c0000)
	libirng.so => /developers/devenv/prod/opt/compilers/intel/cs-xe-2015/none/lib/intel64/libirng.so (0x00002ae0a93d6000)
	libcilkrts.so.5 => /developers/devenv/prod/opt/compilers/intel/cs-xe-2015/none/lib/intel64/libcilkrts.so.5 (0x00002ae0a95dd000)
	libstdc++.so.6 => /lib64/libstdc++.so.6 (0x00002ae0a981c000)
	libmpigf.so.4 => /developers/devenv/prod/opt/compilers/intel/cs-xe-2015/none/impi/5.0.2.044/intel64/lib/libmpigf.so.4 (0x00002ae0a9b23000)
	libmpi_dbg.so.4 => /developers/devenv/prod/opt/compilers/intel/cs-xe-2015/none/impi/5.0.2.044/intel64/lib/libmpi_dbg.so.4 (0x00002ae0a9dac000)
	librt.so.1 => /lib64/librt.so.1 (0x00002ae0aa839000)
	libifport.so.5 => /developers/devenv/prod/opt/compilers/intel/cs-xe-2015/none/lib/intel64/libifport.so.5 (0x00002ae0aaa41000)
	libifcore.so.5 => /developers/devenv/prod/opt/compilers/intel/cs-xe-2015/none/lib/intel64/libifcore.so.5 (0x00002ae0aac6e000)
	libimf.so => /developers/devenv/prod/opt/compilers/intel/cs-xe-2015/none/lib/intel64/libimf.so (0x00002ae0aafa3000)
	libsvml.so => /developers/devenv/prod/opt/compilers/intel/cs-xe-2015/none/lib/intel64/libsvml.so (0x00002ae0ab45e000)
	libintlc.so.5 => /developers/devenv/prod/opt/compilers/intel/cs-xe-2015/none/lib/intel64/libintlc.so.5 (0x00002ae0ac33c000)
	libifcoremt.so.5 => /developers/devenv/prod/opt/compilers/intel/cs-xe-2015/none/lib/intel64/libifcoremt.so.5 (0x00002ae0ac597000)
	libutil.so.1 => /lib64/libutil.so.1 (0x00002ae0ac8fa000)
	/lib64/ld-linux-x86-64.so.2 (0x00002ae0a661f000)

It seems to link an MPI library; I don't understand why. The code is not using MPI, just OpenMP. But it fails only with OpenMP, not in serial.

jimdempseyatthecove
Black Belt

What is libsemblanceOMP.so?

Jim Dempsey

jimdempseyatthecove
Black Belt

Also, your library list includes libstdc++.
Therefore, apparently your app is not "Fortran + C", rather it seems to be "Fortran + C++". However, from the information to date, it appears to actually be C++ with Fortran.

"So what!"? (C++ is compatible with OpenMP)

Yes, but...

C++ also contains ctor code for static objects with constructors that will run before main. If one such ctor is calling OpenMP (possibly non-Intel library via libsemblanceOMP) or possibly calling a Cilk++ thread pool creation function (libcilkrts.so), then you may have a situation where you are attempting to construct a two-way or three-way threaded application. It is not advisable to do this due to potentially adverse consequences (though this can be done with care).
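To illustrate the point about code running before main, here is a minimal sketch. `PoolOwner` is a hypothetical stand-in for a library type that would set up a thread pool; nothing here is taken from the application in question:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Records what ran during start-up. A function-local static avoids
// depending on the initialization order of other globals.
std::vector<std::string>& startup_log() {
    static std::vector<std::string> log;
    return log;
}

// Hypothetical stand-in for a library type that creates a thread pool.
struct PoolOwner {
    PoolOwner() { startup_log().push_back("PoolOwner ctor"); }
};

// Object with static storage duration: on typical implementations its
// constructor runs during program start-up, before the first statement
// of main (or at load time if it lives in a shared library).
static PoolOwner g_pool_owner;

// Call this as the first statement of main to see how many start-up
// entries were already recorded.
std::size_t entries_before_main() {
    return startup_log().size();
}
```

If `entries_before_main()` returns 1 when called at the top of main, the constructor has already run. A threading runtime initialized from such a constructor would likewise be active before main starts.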

What are you not telling us about your application?

Jim Dempsey

 

TimP
Black Belt

Not only does your ldd show that you have C++, you have Cilk(tm) Plus (as Jim noted), which might be a problem inside an OpenMP parallel region, and you (or your linked libraries) have used gfortran (not Intel compiler, as you said) to call MPI (apparently also inside the region you want to parallelise).

Along with all that, if you are running this whole thing under python, you have a cascade of untested combinations, and you can't simply add blindly a new layer of OpenMP parallelism in the middle.
 

unrue
Beginner

jimdempseyatthecove wrote:

What is libsemblanceOMP.so?

Jim Dempsey

libsemblanceOMP.so is an OpenMP library made by myself and used by the application.

 

unrue
Beginner

jimdempseyatthecove wrote:

Also, your library list includes libstdc++.
Therefore, apparently your app is not "Fortran + C", rather it seems to be "Fortran + C++". However, from the information to date, it appears to actually be C++ with Fortran.

"So what!"? (C++ is compatible with OpenMP)

Yes, but...

C++ also contains ctor code for static objects with constructors that will run before main. If one such ctor is calling OpenMP (possibly non-Intel library via libsemblanceOMP) or possibly calling a Cilk++ thread pool creation function (libcilkrts.so), then you may have a situation where you are attempting to construct a two-way or three-way threaded application. It is not advisable to do this due to potentially adverse consequences (though this can be done with care).

What are you not telling us about your application?

Jim Dempsey

 

This application is written by several programmers. We have a new compilation chain in testing, and maybe there are some errors. The application is Fortran and C; it is not MPI and doesn't use Cilk. Just serial, OpenMP and GPU. I found an erroneous call to the mpif90 compiler in a linked library and fixed it. Maybe there is an erroneous reference to Cilk somewhere. I tested the code with icc 13.1.3 and ifort 13.1.3 and it works well. This is the new ldd output:

 

linux-vdso.so.1 =>  (0x00007fff62aff000)
	libparamsf.so => /gpfs/scratch1/ddp_crs/libing_install/lib/libparamsf.so (0x00007f0d0b232000)
	libf90utils.so => /gpfs/scratch1/ddp_crs/libing_install/lib/libf90utils.so (0x00007f0d0afba000)
	liblogging.so => /gpfs/scratch1/ddp_crs/libing_install/lib/liblogging.so (0x00007f0d0adb4000)
	libglib-2.0.so.0 => /lib64/libglib-2.0.so.0 (0x00007f0d0aa7e000)
	libquasirandom_fmp.so => /gpfs/scratch1/ddp_crs/libing_install/lib/libquasirandom_fmp.so (0x00007f0d0a87a000)
	libddp.so => /gpfs/scratch1/ddp_crs/ddp_install/lib/libddp.so (0x00007f0d0a5ac000)
	libdvaparams.so => /gpfs/scratch1/ddp_crs/libing_install/lib/libdvaparams.so (0x00007f0d0a3a3000)
	libsegyfile.so => /gpfs/scratch1/ddp_crs/libing_install/lib/libsegyfile.so (0x00007f0d0a19a000)
	libparams.so => /gpfs/scratch1/ddp_crs/libing_install/lib/libparams.so (0x00007f0d09f94000)
	libpython3.3m.so.1.0 => /developers/devenv/prod/opt/tools/python3/3.3.0/gnu--4.7.2/lib/libpython3.3m.so.1.0 (0x00007f0d09b03000)
	libsemblanceOMP.so => /gpfs/scratch1/ddp_crs/crskernel_install/lib/libsemblanceOMP.so (0x00007f0d098f9000)
	libm.so.6 => /lib64/libm.so.6 (0x00007f0d095f7000)
	libiomp5.so => /developers/devenv/prod/opt/compilers/intel/cs-xe-2015/none/lib/intel64/libiomp5.so (0x00007f0d092c1000)
	libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f0d090a5000)
	libdl.so.2 => /lib64/libdl.so.2 (0x00007f0d08ea1000)
	libc.so.6 => /lib64/libc.so.6 (0x00007f0d08adf000)
	libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007f0d088c9000)
	libirng.so => /developers/devenv/prod/opt/compilers/intel/cs-xe-2015/none/lib/intel64/libirng.so (0x00007f0d086c2000)
	libcilkrts.so.5 => /developers/devenv/prod/opt/compilers/intel/cs-xe-2015/none/lib/intel64/libcilkrts.so.5 (0x00007f0d08482000)
	libstdc++.so.6 => /lib64/libstdc++.so.6 (0x00007f0d0817b000)
	libifport.so.5 => /developers/devenv/prod/opt/compilers/intel/cs-xe-2015/none/lib/intel64/libifport.so.5 (0x00007f0d07f4e000)
	libifcore.so.5 => /developers/devenv/prod/opt/compilers/intel/cs-xe-2015/none/lib/intel64/libifcore.so.5 (0x00007f0d07c18000)
	libimf.so => /developers/devenv/prod/opt/compilers/intel/cs-xe-2015/none/lib/intel64/libimf.so (0x00007f0d0775d000)
	libsvml.so => /developers/devenv/prod/opt/compilers/intel/cs-xe-2015/none/lib/intel64/libsvml.so (0x00007f0d0687f000)
	libintlc.so.5 => /developers/devenv/prod/opt/compilers/intel/cs-xe-2015/none/lib/intel64/libintlc.so.5 (0x00007f0d06623000)
	libifcoremt.so.5 => /developers/devenv/prod/opt/compilers/intel/cs-xe-2015/none/lib/intel64/libifcoremt.so.5 (0x00007f0d062c0000)
	libutil.so.1 => /lib64/libutil.so.1 (0x00007f0d060bc000)
	/lib64/ld-linux-x86-64.so.2 (0x00007f0d0b43b000)

I wrote the OpenMP and GPU versions and I'm sure there are no unsafe regions. In fact, it works well with the older Intel version.

jimdempseyatthecove
Black Belt

?? libsemblanceOMP.so is an OpenMP library made by myself and used by the application.

Do you actually mean you intend to use this as a substitute for Intel's supplied library (libiomp5.so)?

Or do you mean (but misstated) that you made a library using OpenMP?

Jim Dempsey

unrue
Beginner

jimdempseyatthecove wrote:

?? libsemblanceOMP.so is an OpenMP library made by myself and used by the application.

Do you actually mean you intend to use this as a substitute for Intel's supplied library (libiomp5.so)?

Or do you mean (but misstated) that you made a library using OpenMP?

Jim Dempsey

Yes, libsemblanceOMP.so is my library that uses OpenMP; it is not a substitute for libiomp5.so.

jimdempseyatthecove
Black Belt

Can you start your debug session at the pre-CRTL initialization point (in other words, prior to main), and set breakpoints at

#0  0x000000000049ddf4 in init_resource ()
#1  0x000000000049dd5a in reentrancy_init ()
#2  0x000000000049dc78 in for__reentrancy_init ()
#3  0x00002aaaab8e1055 in for_rtl_init_ () from /gpfs/scratch1/ddp_crs/ddp_install/lib/libddp.so

Then see if the call stack contains main. If not, then some ctor is initializing the OpenMP thread pool.

Jim Dempsey

unrue
Beginner

jimdempseyatthecove wrote:

Can you start your debug session at the pre-CRTL initialization point (in other words, prior to main), and set breakpoints at

#0  0x000000000049ddf4 in init_resource ()
#1  0x000000000049dd5a in reentrancy_init ()
#2  0x000000000049dc78 in for__reentrancy_init ()
#3  0x00002aaaab8e1055 in for_rtl_init_ () from /gpfs/scratch1/ddp_crs/ddp_install/lib/libddp.so

Then see if the call stack contains main. If not, then some ctor is initializing the OpenMP thread pool.

Jim Dempsey

Hi Jim, what do you mean by "ctor"? A class constructor? Part of the code is in C, not C++, so no class constructor should be included.

And how can I set a breakpoint before main?

EDIT:

Using TotalView, I set a break before main and then hit the segfault:

main.jpg

sigfault.jpg

During TotalView debugging, I saw a routine called "intel_new_feature_proc_init". What is that?

Thanks.

jimdempseyatthecove
Black Belt

Read https://software.intel.com/en-us/forums/topic/494782 post #5 (I don't know if #7 applies too).

Most debuggers by default initially break at the first statement in main (for C/C++ programs). However, they usually have an alternate method to initially break at the actual program entry point, which calls the C runtime library (CRTL) init routines, then calls other init code (ctors, i.e. constructors), then finally calls main. Sometimes the technique is obscure until you've done it.

For some debuggers, after you specify the program and command-line arguments, there is a check box or other property page setting that you can set to debug the CRTL init routines. On others, in lieu of clicking on something like "Start Debugging", you instead click on something like "Step Into". Also note that, prior to doing this, you may need to open a Disassembly window, as many debuggers will not let you step into a function that does not have debug information (they will step over it).

If you can get the breakpoints set prior to the initial Start Debugging (prior to running the CRTL init routines that call main, and any code they may issue), then you might see whether something has initialized the (or an) OpenMP thread pool context. And this may give you a clue as to what is going on.

You are linking in libcilkrts and libstdc++. These are both C++ libraries. So, either something is using them (and they contain ctors/constructors), or you are blindly linking in stuff you do not need.
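Note that start-up code does not require a class. As a minimal sketch, assuming a compiler that accepts the GCC-style `__attribute__((constructor))` extension, a plain C-style function can also be registered to run before main. The names `hypothetical_lib_init` and `hook_ran_before_main` are made up for illustration:

```cpp
// Plain C-style state, no classes involved. It is zero-initialized
// statically, before any start-up hook can run.
static int g_hook_ran = 0;

// GCC-style extension (an assumption about the toolchain in use): the
// function is registered to run during start-up, before main, or when
// the shared library that contains it is loaded.
extern "C" __attribute__((constructor)) void hypothetical_lib_init(void) {
    g_hook_ran = 1;  // a real library might initialize a runtime here
}

// Returns 1 if the start-up hook has already executed.
extern "C" int hook_ran_before_main(void) {
    return g_hook_ran;
}
```

So a library written in C (or a runtime it links against) can still execute initialization code ahead of main, which is why breaking before main is worth trying even if the application itself has no C++ classes.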

Jim Dempsey
