Intel® Fortran Compiler
Build applications that can scale for the future with optimized code designed for Intel® Xeon® and compatible processors.

OpenMP forrtl: severe (174): SIGSEGV, segmentation fault occurred

Sampson__Andrew
Beginner
1,867 Views

My problem appears to be very similar to what others have found and fixed by increasing the KMP_STACKSIZE/OMP_STACKSIZE variables, but that fix has not worked for me. A couple of notes:

  1. No issues when using the gfortran compiler with OpenMP enabled.
  2. No issues when using the ifort compiler with OpenMP enabled and limited to 1 thread.
  3. The seg fault occurs when using the ifort compiler, version 19.1.1.217, with OpenMP enabled and more than 1 thread.
  4. The seg fault occurs at OpenMP PARALLEL construct declarations.
  5. The errors below are produced when compiling with:

    -g -check all -traceback -fpe0 -mcmodel=medium -debug extended -heap-arrays -p -qopenmp

  6. It also happens with the following optimized compiler options:

    -O3 -xHost -ipo -qopenmp

Code where this happens:

Code Example.png

It happens at line 200, which is the first instance of PARALLEL used in the program. Following some setup, it breaks into an OMP DO that houses the main "meat" of the program.

It runs just fine until it hits line 200 in the code and outputs the following error: 

Error Printout.png

phenx is the name of the code and phenx_v1.0.1.f is the filename the above code is placed within.

I'm using OMP_STACKSIZE = 2G (Single processor requires around 1.2-1.5 G). This is on Ubuntu Server 16.04.

Ideas? I'm out of them. I've encountered similar scenarios before and fixed them the same way I've helped others fix theirs: increase OMP_STACKSIZE/KMP_STACKSIZE. It doesn't work here. Sometimes it even gives a whole memory backtrace that makes no sense to me (I can provide it if needed).

Please forgive me if I don't include all the code specifically. It's tens of thousands of lines across multiple files. I can provide the file structure with the associated makefile if needed. Please forgive me for the gobbledygook.

Any help is appreciated,

Andrew

10 Replies
Barbara_P_Intel
Moderator

Besides setting thread stack with OMP_STACKSIZE, there's an OS limit.

On Linux, try raising the shell's stack limit. The typical OS default is about 8MB for the whole program (shared and local data). For Bash, use “ulimit -s unlimited”. For csh, use “limit stacksize unlimited”. Unlimiting the stack size doesn't seem to impact performance.

On Windows, relinking the program with /F:100000000 sets the whole program stack to 100,000,000 bytes, 100MB.
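
To confirm that the stack settings actually reach the program, a small check along these lines can help. This is only a minimal sketch in free-form Fortran and is not part of the original code; the program name `show_stack_env` and the helper `report` are made up for illustration. Keep in mind that OMP_STACKSIZE (and Intel's KMP_STACKSIZE) size the stacks of the threads created by the OpenMP runtime, while the initial thread's stack is governed by the shell limit.

```fortran
program show_stack_env
  ! Hypothetical helper program: prints the stack-related environment
  ! variables as seen by the running process.
  use omp_lib, only: omp_get_max_threads
  implicit none

  call report('OMP_STACKSIZE')
  call report('KMP_STACKSIZE')
  print '(a,i0)', 'max OpenMP threads = ', omp_get_max_threads()

contains

  subroutine report(name)
    character(len=*), intent(in) :: name
    character(len=64) :: val
    integer :: stat

    ! status = 1 means the variable does not exist in the environment
    call get_environment_variable(name, val, status=stat)
    if (stat == 1) then
      print '(2a)', name, ' is not set'
    else
      print '(3a)', name, ' = ', trim(val)
    end if
  end subroutine report

end program show_stack_env
```

Build it with OpenMP enabled (for example -qopenmp with ifort or -fopenmp with gfortran) and run it from the same shell or job script that launches the real application.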

 

Sampson__Andrew
Beginner

Thank you for responding. I forgot to say in my other post that I've done that already. Again, this runs just fine with gfortran. I just can't get it to work with ifort, my preferred compiler.

jimdempseyatthecove
Honored Contributor III

Andrew,

When you post code snippets, please use the {...} code button, with Fortran format selected.

In this manner the readers can copy and paste the text for reply (or testing). An image is not suitable for copy and paste.

Can you show the declarations (and !dir$...) for each of the COPYIN arguments?

Do you crash if you:

!$omp parallel
!$omp &...
!dir$ if(.false.)
.... your parallel region here
!dir$ endif
!$omp end parallel

If you do not crash, then relocate the !dir$ if(.false.)...!dir$ endif further into your parallel region to try to locate where the crash occurs.

Jim Dempsey

 

jimdempseyatthecove
Honored Contributor III

Also, the __ssse3_rep... functions, in particular memcpy, require 16-byte alignment (32-byte for AVX, 64-byte for AVX-512). Alignment can be requested with:


    !DIR$ ATTRIBUTES ALIGN: n:: object

Jim Dempsey
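
As an illustration of the alignment directive, here is a minimal sketch in free-form Fortran. The program name `align_demo` and the array `buf` are hypothetical, and the 64-byte value is just one possible choice. The `!DIR$` line is an Intel-specific directive; other compilers are expected to treat it as a comment, so the printed remainder is only meaningful as a check when the directive is honored.

```fortran
program align_demo
  use, intrinsic :: iso_c_binding, only: c_loc, c_intptr_t
  implicit none

  ! Hypothetical buffer; TARGET is needed so that c_loc may be applied.
  real(8), target :: buf(1024)
  !DIR$ ATTRIBUTES ALIGN: 64 :: buf
  integer(c_intptr_t) :: addr

  buf = 0.0d0

  ! Reinterpret the C address as an integer to inspect its alignment.
  addr = transfer(c_loc(buf), addr)
  print '(a,i0)', 'address mod 64 = ', mod(addr, 64_c_intptr_t)
end program align_demo
```

A remainder of 0 indicates that the array starts on a 64-byte boundary.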

Sampson__Andrew
Beginner

Jim,

Thank you for your help. I tried placing the "false if" construct around the code following the parallel declaration and it still failed. It fails at the parallel declaration.

Other things you requested - I'll give the "code" option a try. Thank you for pointing that out to me. 

Variable declarations for those in the COPYIN clause:

 -----------------------------------------------------------------------------------------------------------------------   
      type doseAccumulators
        sequence
        real(8) :: q
        real(8) :: q2
        real(8) :: qTemp
        integer :: lastHist
        integer :: count
        real(8) :: score
        real(8) :: stdDev
        integer(KIND=4) :: dummy
        integer(KIND=4) :: dummy2
      end type doseAccumulators
      type (doseAccumulators),allocatable,dimension(:) ::
     &       doseAccumPriST1,doseAccumSecST1,doseAccumTotSt1
      type (doseAccumulators),allocatable,dimension(:) ::
     &       imgProjAccumPri,
     &       imgProjAccumSec,
     &       imgProjAccumTot
-----------------------------------------------------------------------------------------------------------------------
      type particleState
        !Position
        real(8) :: x
        real(8) :: y
        real(8) :: z
        !Particle Trajectory
        real(8) :: xTraj
        real(8) :: yTraj
        real(8) :: zTraj
        
        real(8) :: Energy !particle Energy (electron rest mass units - keV/511 keV)
        !AJS 26Aug2019 iEnergy
        integer :: iEnergy !result returned by feeding Energy into findValueIndex
        real(8) :: Weight !particle Weight (initial weight 1.d4)

        integer :: pType  !particle type
      !                    Value 1 :: Photon
      !                    Value 2 :: Electron
      !                    Value 3 :: Neutron
      !                    Value 4 :: Proton
        integer :: collOrder !Order of collision. Value 0 corresponds
      !                       Primary photonxs
        integer :: parentSource !Parent interaction of particle. This basically
      !                         tells me where the particle originated from
      !                         Value 1  :: Primary (All)
      !                         Value 2  :: Characteristic X-ray (Photon)
      !                         Value 3  :: Pair Production (Photon,Positron)
      !                         Value 4  :: Incoherent Scattered Electron (Electron)
      !                         Value 5  :: Photoelectric Electron (Electron)
        integer :: collType !The collision type of the most previous 
      !                         interaction
      !                         Value 1  :: Primary (All)
      !                         Value 2  :: Coherent (Photon)
      !                         Value 3  :: Incoherent (Photon,Electron)
      !                         Value 4  :: Photoelectric (Photon,Electron)
      !                         Value 5  :: Pair Production (Photon,Positron)
        integer :: collObj !The geometric, analytical object within which the
      !                         previous event occurred
        integer :: collObjType !Did the previous event happen inside an analytical
      !                         object or a voxel?
      !                         Value 0 - phantom voxel
      !                         value 1 - simulation Universe
      !                         value 2 - analytical object

        !AJS 03Dec2019 !Added vox1DInd
        integer :: vox1DInd !The 1D voxel index for the 
                            ! corresponding voxel tracklength

        integer :: collMedia !In what media did the event occur? This 
      !                       state variable represents the geomMedia
      !                       number of the media
        !AJS 28Nov2017
        real(8) :: energyScoreXRay !The energy deposited from the collision
                             !This variable is filled in particleCurrent after
                             ! the interaction is processed
        

      end type particleState

      !AJS 07Mar2017
      !I will have one state variable for the "Current Particle" , the one 
      ! being transported. And another one for the particle's state following
      ! the collision. 
      type (particleState) :: particleCurrent
      type (particleState) :: particleAfterColl      

      !AJS 07Mar2017
      ! remove the allocatable array for the stack and replace it with 
      !  the user defined variable
      !I will also have a dimensioned variable for the particleStack
      type (particleState),allocatable,dimension(:) :: particleStack
------------------------------------------------------------------------------------------------------------------
      !voxelIndicies - The voxel indicies that correspond to 
      ! the voxel part of trackLen
      integer,allocatable,dimension(:,:) :: voxelIndicies

      !trackLenNum - The number of tracklengths in trackLen and 
      ! trackMed
      integer :: trackLenNum
------------------------------------------------------------------------------------------------------------------
      type trackLenType
        sequence

        !Variables pertaining to the geometric ray-trace
        real(8) :: trackLen !The distance between each media interface along a 
                            ! particle's projected track through the simulation
                            ! universe.
        real(8) :: dist2Inter !The distance from origin to media interface

        !Variables pertaining the cross sections
        real(8) :: gT_CS  !Total Cross Section (gamma)
        real(8) :: gPE_CS !PhotoElectric Cross Section (gamma)
        real(8) :: gIS_CS !Incoherent Scattering Cross Section (gamma)
        real(8) :: gCS_CS !Coherent Scattering Cross Section (gamma) 
        real(8) :: gPP_CS !Pair Production Cross Section (gamma)
        real(8) :: gCPDF_CS(4) !The CPDF for sampling collision type
                       !Analog Sampling
                          !gCPDF_CS(1) = gPE_CS/gT_CS
                          !gCPDF_CS(2) = (gPE_CS+gIS_CS)/gT_CS
                          !gCPDF_CS(3) = (gPE_CS+gIS_CS+gCS_CS)/gT_CS
                          !gCPDF_CS(4) = (gPE_CS+gIS_CS+gCS_CS+gPP_CS)/gT_CS
        !AJS 03Dec2019
        !Adding in the Mass-Energy Absorption Coefficient
        real(8) :: gMEAC !Mass Energy Absorption Coefficient

        !Variables for inter-collision distance sampling
        real(8) :: muProdL !The product between the total attenuation coefficient
                           ! and the track-length between media intersections.
                           ! This is the value that is sampled when sampling
                           ! intercollision distances.
                           ! It is a cumulative sum of the product
        integer :: anaVox ! value = 0 -> Voxel
                          !       > 0 -> Analytical geometric object 
        !AJS 03Dec2019
        integer :: trackMed !The media corresponding to trackLen
        
        !AJS 03Dec2019
        integer :: vox1DInd !The 1D voxel index for the 
                            ! corresponding voxel tracklength
        !AJS 12Dec2019
        integer :: dummy !Designed to align the fields

        !Variables for Coherent Scatter Modeling
        real(8) :: RASF !Real Anomalous Scatter Factor (Coherent Scatter)
        real(8) :: IASF !Imaginary Anomalous Scatter Factor (Coherent Scatter)
        

      end type tracklenType

      !Now that I have the above variable defined, I need to declare
      ! the variable to be used in the simulation
      type(trackLenType),allocatable,dimension(:) :: trackInfo
-------------------------------------------------------------------------------------------------------------------

      real(8) :: doseConvert !Used for conversion to correct units
      integer(KIND=4),allocatable,dimension(:) :: PRNG_SEED
-------------------------------------------------------------------------------------------------------------------

I hope that helps. All these variables are also declared in modules and included in THREADPRIVATE statements in their associated module.

Thanks again for your help. As I put this together, I noticed that most of the variables were user-defined... Maybe that has something to do with it?

Andrew

PS - I probably "revealed" too much of my own code here... please forgive me. Nothing like opening yourself right up!

jimdempseyatthecove
Honored Contributor III

This is your problem:

The OpenMP clause COPYIN(list):

2.19.6.1 copyin Clause
Summary
The copyin clause provides a mechanism to copy the value of a threadprivate variable of the master thread to the threadprivate variable of each other member of the team that is executing the parallel region.

2.19.2 threadprivate Directive
Summary
The threadprivate directive specifies that variables are replicated, with each thread having its own copy. The threadprivate directive is a declarative directive.
...
The syntax of the threadprivate directive is as follows:
!$omp threadprivate(list)
where list is a comma-separated list of named variables and named common blocks. Common block names must appear between slashes.

The variables/arrays in your COPYIN(list) are not explicitly declared as !$omp threadprivate(...).
The OpenMP API Specification is very specific in its use of "threadprivate".

The Intel compiler (and possibly others) chose to make thread private (space between thread and private) copies (on the stack) for variables/arrays listed in the COPYIN(list) that are not explicitly specified as !$omp threadprivate(...). In other words, this is an implementation preference.

To correct for this:

      type (doseAccumulators),allocatable,dimension(:) ::
     &       imgProjAccumPri,
     &       imgProjAccumSec,
     &       imgProjAccumTot
!$omp threadprivate(
!$omp&       imgProjAccumPri,
!$omp&       imgProjAccumSec,
!$omp&       imgProjAccumTot)
... same with others in your COPYIN(list)

That said, are all/each of these arrays modified within the parallel region?
For those that are not, these should be shared and not in the COPYIN(list)

Jim Dempsey
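
For reference, the pairing of a threadprivate declaration with a COPYIN clause can be sketched as follows. This is a minimal, self-contained illustration in free-form Fortran, not the code from this thread: the module `tp_state` and the variables `trackCount` and `weights` are hypothetical names, and it deliberately uses a scalar and a fixed-size array rather than allocatable arrays.

```fortran
module tp_state
  implicit none
  ! Hypothetical per-thread state; initialized module variables are saved.
  integer :: trackCount = 0
  real(8) :: weights(3) = 0.0d0
  !$omp threadprivate(trackCount, weights)
end module tp_state

program copyin_demo
  use omp_lib, only: omp_get_thread_num
  use tp_state
  implicit none
  integer :: nbad

  ! Set the initial thread's copies before the parallel region.
  trackCount = 42
  weights = [1.0d0, 2.0d0, 3.0d0]
  nbad = 0

  !$omp parallel copyin(trackCount, weights) reduction(+:nbad)
  ! Every thread should start from the initial thread's values.
  if (trackCount /= 42 .or. any(weights /= [1.0d0, 2.0d0, 3.0d0])) then
    nbad = nbad + 1
  end if
  ! Each thread then modifies only its own copy.
  trackCount = trackCount + omp_get_thread_num()
  !$omp end parallel

  if (nbad /= 0) error stop 'copyin did not propagate the values'
  print '(a,i0)', 'initial thread copy after region: ', trackCount
end program copyin_demo
```

Variables that are only read inside the region do not need this treatment and can simply stay shared, as noted above.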

jimdempseyatthecove
Honored Contributor III

One additional note regarding threadprivate.

These variables should be declared in a persistent scope:

COMMON
module
or as SAVE in the scope of a procedure

Jim Dempsey

Sampson__Andrew
Beginner

Jim, 

Thank you again for responding and taking the time to look over my code. To answer a few of your questions:

  1. Every variable listed in the COPYIN clause is inside a module - for persistence and portability.
  2. Every variable listed in the COPYIN clause is also listed in a corresponding THREADPRIVATE directive at the end of the associated module.
  3. Every variable listed in the COPYIN clause is unique to the thread and will be altered by the associated thread.

I'm currently working to re-initialize my threadprivate, user-declared variables inside the parallel loop. I believe this to be the cause - I don't think it likes them. I'll report back any progress.

Best Regards,

Andrew

 

jimdempseyatthecove
Honored Contributor III

One other point/observation.

The COPYIN clause (as implemented in your version of IVF) might not perform a reallocation of the left-hand side (realloc lhs) if the threadprivate array is not already allocated to the size required by the copyin.

In your startup thread where you allocate the initial arrays, encapsulate that with a parallel region.

! initialization in startup thread
allocate(yourThreadPrivateArray(N))

becomes

!$omp parallel
allocate(yourThreadPrivateArray(N))
!$omp end parallel

If that works, you can ask Intel support whether this is a bug.

Jim Dempsey

Sampson__Andrew
Beginner

Thank you for all your help. It was my "voxelIndicies" variable. I took it out of the COPYIN, then allocated and initialized it inside the parallel region, and it is now working.
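
The shape of that fix can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual phenx code: the module `voxel_state`, the array `voxelIdx` (a stand-in for `voxelIndicies`), and the size `n` are all hypothetical. The threadprivate allocatable array is not listed in COPYIN; each thread allocates and initializes its own copy inside the parallel region.

```fortran
module voxel_state
  implicit none
  ! Hypothetical stand-in for the threadprivate voxelIndicies array.
  integer, allocatable, save :: voxelIdx(:,:)
  !$omp threadprivate(voxelIdx)
end module voxel_state

program alloc_in_region
  use omp_lib, only: omp_get_thread_num
  use voxel_state
  implicit none
  integer, parameter :: n = 1000   ! assumed size, for illustration only
  integer :: nbad

  nbad = 0

  !$omp parallel reduction(+:nbad)
  ! No COPYIN here: every thread allocates its own copy on the heap.
  allocate(voxelIdx(3, n))
  voxelIdx = omp_get_thread_num()

  ! Wait until all threads have written, then confirm that no thread
  ! overwrote another thread's copy.
  !$omp barrier
  if (any(voxelIdx /= omp_get_thread_num())) nbad = nbad + 1

  deallocate(voxelIdx)
  !$omp end parallel

  if (nbad /= 0) error stop 'per-thread array was not private'
  print '(a)', 'each thread allocated and initialized its own copy'
end program alloc_in_region
```

If the per-thread arrays need starting values from the initial thread, one option is to copy them from a shared source array inside the region after the allocate.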

 
