My problem appears to be very similar to what others have found and fixed by increasing the KMP_STACKSIZE/OMP_STACKSIZE variables, but that fix has not worked for me. A couple of notes:
- No issues when using gfortran compiler with openmp enabled
- No issues when using ifort compiler with openmp enabled and limit to 1 thread
- Seg fault occurs when using ifort compiler, version 19.1.1.217 with openmp enabled and threads greater than 1
- Seg fault occurs at openmp PARALLEL construct declarations.
- The errors below occur when the code is compiled with:
-g -check all -traceback -fpe0 -mcmodel=medium -debug extended -heap-arrays -p -qopenmp
- It also happens with the following optimized compiler options:
-O3 -xHost -ipo -qopenmp
Code where this happens:
It happens at line 200, which is the first instance of PARALLEL used in the program. Following some setup, it breaks into an OMP DO that houses the main "meat" of the program.
It runs just fine until it hits line 200 in the code and outputs the following error:
phenx is the name of the code and phenx_v1.0.1.f is the filename the above code is placed within.
I'm using OMP_STACKSIZE = 2G (Single processor requires around 1.2-1.5 G). This is on Ubuntu Server 16.04.
Ideas? I'm out of them. I've encountered similar scenarios before and fixed them the same way I've helped others fix theirs: increase OMP_STACKSIZE/KMP_STACKSIZE. It doesn't work here. Sometimes it even gives a whole memory backtrace that makes no sense to me (I can provide it if needed).
Please forgive me if I don't include all the code specifically. It's tens of thousands of lines across multiple files. I can provide the file structure with the associated makefile if needed. Please forgive me for the gobbledygook.
Any help is appreciated,
Andrew
Besides setting thread stack with OMP_STACKSIZE, there's an OS limit.
On Linux, try adjusting the shell limit. The typical OS default is about 8MB for the whole program (shared and local data). To increase it, raise the shell limits. For Bash, use "ulimit -s unlimited". For csh, use "limit stacksize unlimited". Unlimiting the stacksize doesn't seem to impact performance.
On Windows, relinking the program with /F:100000000 sets the whole program stack to 100,000,000 bytes, 100MB.
Thank you for responding. I forgot to mention in my other post that I've done that already. Again, this runs just fine with gfortran. I just can't get it to work with ifort, my preferred compiler.
Andrew,
When you post code snippets, please use the {...} code button, with Fortran format selected.
In this manner the readers can copy and paste the text for reply (or testing). The .jpg image is not suitable for copy and paste.
Can you show the declarations (and !dir$...) for each of the COPYIN arguments?
Do you crash if you:
!$omp parallel
!$omp &...
!dir$ if(.false.)
.... your parallel region here
!dir$ endif
!$omp end parallel
If you do not crash, then relocate the !dir$ if(.false.)...!dir$ endif further into your parallel region to try to locate where the crash occurs.
Jim Dempsey
Also, the __ssse3_rep... functions, in particular memcpy, require 16-byte alignment (32-byte for AVX, 64-byte for AVX512).
!DIR$ ATTRIBUTES ALIGN: n:: object
Jim Dempsey
Jim,
Thank you for your help. I tried placing the "false if" construct around the code following the parallel declaration and it still failed. It fails at the parallel declaration.
Other things you requested - I'll give the "code" option a try. Thank you for pointing that out to me.
Variable Declarations for those in the COPYIN
-----------------------------------------------------------------------------------------------------------------------
      type doseAccumulators
        sequence
        real(8) :: q
        real(8) :: q2
        real(8) :: qTemp
        integer :: lastHist
        integer :: count
        real(8) :: score
        real(8) :: stdDev
        integer(KIND=4) :: dummy
        integer(KIND=4) :: dummy2
      end type doseAccumulators

      type (doseAccumulators),allocatable,dimension(:) ::
     &   doseAccumPriST1,doseAccumSecST1,doseAccumTotSt1

      type (doseAccumulators),allocatable,dimension(:) ::
     &   imgProjAccumPri,
     &   imgProjAccumSec,
     &   imgProjAccumTot
-----------------------------------------------------------------------------------------------------------------------
      type particleState
        !Position
        real(8) :: x
        real(8) :: y
        real(8) :: z
        !Particle Trajectory
        real(8) :: xTraj
        real(8) :: yTraj
        real(8) :: zTraj
        real(8) :: Energy          !particle Energy (electron rest mass units - keV/511 keV)
        !AJS 26Aug2019 iEnergy
        integer :: iEnergy         !result returned by feeding Energy into findValueIndex
        real(8) :: Weight          !particle Weight (initial weight 1.d4)
        integer :: pType           !particle type
                                   ! Value 1 :: Photon
                                   ! Value 2 :: Electron
                                   ! Value 3 :: Neutron
                                   ! Value 4 :: Proton
        integer :: collOrder       !Order of collision. Value 0 corresponds
                                   ! Primary photonxs
        integer :: parentSource    !Parent interaction of particle. This basically
                                   ! tells me where the particle originated from
                                   ! Value 1 :: Primary (All)
                                   ! Value 2 :: Characteristic X-ray (Photon)
                                   ! Value 3 :: Pair Production (Photon,Positron)
                                   ! Value 4 :: Incoherent Scattered Electron (Electron)
                                   ! Value 5 :: Photoelectric Electron (Electron)
        integer :: collType        !The collision type of the most previous
                                   ! interaction
                                   ! Value 1 :: Primary (All)
                                   ! Value 2 :: Coherent (Photon)
                                   ! Value 3 :: Incoherent (Photon,Electron)
                                   ! Value 4 :: Photoelectic (Photon,Electron)
                                   ! Value 5 :: Pair Production (Photon,Positron)
        integer :: collObj         !The geometric, analytical object within which the
                                   ! previous event occured
        integer :: collObjType     !Did the previous event happen inside an analytical
                                   ! object or a voxel?
                                   ! Value 0 - phantom voxel
                                   ! value 1 - simulation Universe
                                   ! value 2 - analytical object
        !AJS 03Dec2019
        !Added vox1DInd
        integer :: vox1DInd        !The 1D voxel indicie corresponding to the
                                   ! corresponding voxel tracklength
        integer :: collMedia       !In what media did the event occur? This
                                   ! state variable represents the geomMedia
                                   ! number of the media
        !AJS 28Nov2017
        real(8) :: energyScoreXRay !The energy deposited from the collision
                                   !This variable is filled in particleCurrrent after
                                   ! the interaction is processed
      end type particleState

      !AJS 07Mar2017
      !I will have one state variable for the "Current Particle" , the one
      ! being transported. And another one for the particle's state following
      ! the collision.
      type (particleState) :: particleCurrent
      type (particleState) :: particleAfterColl

      !AJS 07Mar2017
      ! remove the allocatable array for the stack and replace it with
      ! the user defined variable
      !I will also have a dimensioned variable for the particleStack
      type (particleState),allocatable,dimension(:) :: particleStack
------------------------------------------------------------------------------------------------------------------
      !voxelIndicies - The voxel indicies that correspond to
      ! the voxel part of trackLen
      integer,allocatable,dimension(:,:) :: voxelIndicies

      !trackLenNum - The number of tracklengths in trackLen and
      ! trackMed
      integer :: trackLenNum
------------------------------------------------------------------------------------------------------------------
      type trackLenType
        sequence
        !Variables pertaining to the geometric ray-trace
        real(8) :: trackLen     !The distance between each media interface along a
                                ! particle's projected track through the simulation
                                ! universe.
        real(8) :: dist2Inter   !The distance from origin to media interface

        !Variables pertaining the cross sections
        real(8) :: gT_CS        !Total Cross Section (gamma)
        real(8) :: gPE_CS       !PhotoElectric Cross Section (gamma)
        real(8) :: gIS_CS       !Incoherent Scattering Cross Section (gamma)
        real(8) :: gCS_CS       !Coherent Scattering Cross Section (gamma)
        real(8) :: gPP_CS       !Pair Production Cross Section (gamma)
        real(8) :: gCPDF_CS(4)  !The CPDF for sampling collision type
                                !Analog Sampling
                                !gCPDF_CS(1) = gPE_CS/gT_CS
                                !gCPDF_CS(2) = (gPE_CS+gIS_CS)/gT_CS
                                !gCPDF_CS(3) = (gPE_CS+gIS_CS+gCS_CS)/gT_CS
                                !gCPDF_CS(4) = (gPE_CS+gIS_CS+gCS_CS+gPP_CS)/gT_CS
        !AJS 03Dec2019
        !Adding in the Mass-Energy Absorption Coefficient
        real(8) :: gMEAC        !Mass Energy Absorption Coefficient

        !Variables for inter-collision distance sampling
        real(8) :: muProdL      !The product between the total attenuation coefficient
                                ! and the track-length between media intersections.
                                ! This is the value that is sampled when sampling
                                ! intercollision distances.
                                ! It is a cummulative sum of the product
        integer :: anaVox       ! value = 0 -> Voxel
                                !       > 0 -> Analytical geometric object
        !AJS 03Dec2019
        integer :: trackMed     !The media corresponding to trackLen
        !AJS 03Dec2019
        integer :: vox1DInd     !The 1D voxel indicie corresponding to the
                                ! corresponding voxel tracklength
        !AJS 12Dec2019
        integer :: dummy        !Designed to align the fields

        !Variables for Coherent Scatter Modeling
        real(8) :: RASF         !Real Anomalous Scatter Factor (Coherent Scatter)
        real(8) :: IASF         !Imaginary Anomalous Scatter Factor (Coherent Scatter)
      end type tracklenType

      !Now that I have the above variable defined, I need to declare
      ! the variable to be used in the simulation
      type(trackLenType),allocatable,dimension(:) :: trackInfo
-------------------------------------------------------------------------------------------------------------------
      real(8) :: doseConvert    !Used for conversion to correct units
      integer(KIND=4),allocatable,dimension(:) :: PRNG_SEED
-------------------------------------------------------------------------------------------------------------------
I hope that helps. All these variables are also declared in modules and included in THREADPRIVATE statements in their associated module.
Thanks again for your help. As I put this together, I noticed that most of the variables are user-defined types. Maybe that has something to do with it?
Andrew
PS - I probably "revealed" too much of my own code here... please forgive me. Nothing like opening yourself right up!
This is your problem:
The OpenMP clause COPYIN(list):
2.19.6.1 copyin Clause
Summary: The copyin clause provides a mechanism to copy the value of a threadprivate variable of the master thread to the threadprivate variable of each other member of the team that is executing the parallel region.
2.19.2 threadprivate Directive
Summary: The threadprivate directive specifies that variables are replicated, with each thread having its own copy. The threadprivate directive is a declarative directive. ...
The syntax of the threadprivate directive is as follows:
!$omp threadprivate(list)
where list is a comma-separated list of named variables and named common blocks. Common block names must appear between slashes.
The variables/arrays in your COPYIN(list) are not explicitly declared as !$omp threadprivate(...)
The OpenMP API Specification is very specific in use of "threadprivate".
The Intel compiler (and possibly others) chose to make thread private (space between thread and private) copies on the stack for variables/arrays listed in the COPYIN(list) that are not explicitly specified as !$omp threadprivate(...). In other words, it is an implementation preference.
To correct for this:
      type (doseAccumulators),allocatable,dimension(:) ::
     &   imgProjAccumPri,
     &   imgProjAccumSec,
     &   imgProjAccumTot
!$omp threadprivate(
!$omp & imgProjAccumPri,
!$omp & imgProjAccumSec,
!$omp & imgProjAccumTot)
... same with others in your COPYIN(list)
That said, are all/each of these arrays modified within the parallel region?
For those that are not, these should be shared and not in the COPYIN(list)
Jim Dempsey
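To illustrate the threadprivate/COPYIN pairing quoted from the specification, here is a minimal free-form sketch. The names `tally_mod`, `histCount`, and `weights` are hypothetical and are not taken from the posted code. It assumes compilation with OpenMP enabled (for example `-qopenmp` with ifort or `-fopenmp` with gfortran) and deliberately uses non-allocatable variables to keep the pattern simple.

```fortran
module tally_mod
  implicit none
  ! Hypothetical module variables; initialized module data is implicitly SAVEd.
  integer :: histCount = 0
  real(8) :: weights(4) = 0.0d0
  ! Each thread gets its own persistent copy of these variables.
  !$omp threadprivate(histCount, weights)
end module tally_mod

program copyin_demo
  use omp_lib
  use tally_mod
  implicit none
  integer :: nBad

  ! Set the master thread's copies before the parallel region.
  histCount = 42
  weights = 1.5d0
  nBad = 0

  ! copyin copies the master's values into every other thread's copy.
  !$omp parallel copyin(histCount, weights) reduction(+:nBad)
  if (histCount /= 42 .or. any(weights /= 1.5d0)) nBad = nBad + 1
  ! Each thread may now change its own copy without affecting the others.
  histCount = histCount + omp_get_thread_num()
  !$omp end parallel

  if (nBad /= 0) error stop "copyin did not initialize a thread copy"
  print *, "every thread started from the master's values"
end program copyin_demo
```

The program stops with an error if any thread fails to see the master's values on entry to the region, so it doubles as a quick check of a compiler's COPYIN behavior for simple variables.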
One additional note regarding threadprivate.
These variables should be declared in a persistent scope:
COMMON
module
or as SAVE in the scope of a procedure
Jim Dempsey
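The module case is the one used in this thread. The SAVE case from the list above can be sketched as follows, with a hypothetical per-thread call counter (`counter_mod`, `bump`, and `calls` are illustrative names, not from the posted code). It assumes free-form source compiled with OpenMP enabled.

```fortran
module counter_mod
  implicit none
contains
  integer function bump()
    ! SAVE gives the local a lifetime beyond a single call, which is
    ! what makes it eligible for threadprivate.
    integer, save :: calls = 0
    !$omp threadprivate(calls)
    calls = calls + 1
    bump = calls
  end function bump
end module counter_mod

program save_threadprivate_demo
  use counter_mod
  implicit none
  integer :: nBad, first, second

  nBad = 0
  !$omp parallel private(first, second) reduction(+:nBad)
  ! Each thread counts its own calls, independent of the other threads.
  first = bump()
  second = bump()
  if (first /= 1 .or. second /= 2) nBad = nBad + 1
  !$omp end parallel

  if (nBad /= 0) error stop "per-thread counter was not private"
  print *, "each thread kept its own call count"
end program save_threadprivate_demo
```

If `calls` were shared instead of threadprivate, concurrent calls would race on a single counter and the per-thread check would fail for teams larger than one.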
Jim,
Thank you again for responding and taking the time to look over my code. To answer a few of your questions:
- Every variable listed in the COPYIN clause is inside a module - for persistence and portability.
- Every variable listed in the COPYIN clause is also listed in a corresponding THREADPRIVATE directive at the end of the associated module.
- Every variable listed in the COPYIN clause is unique to the thread and will be altered by the associated thread.
I'm currently working to re-initialize my threadprivate, user-declared variables inside the parallel loop. I believe this to be the cause - I don't think it likes them. I'll report back any progress.
Best Regards,
Andrew
One other point/observation.
The COPYIN clause (as implemented in your version of IVF) might not perform a realloc lhs should the threadprivate array not be allocated to the size required by the copyin.
In your startup thread, where you allocate the initial arrays, encapsulate the allocation in a parallel region.
! initialization in startup thread
allocate(yourThreadPrivateArray(N))
becomes
!$omp parallel
allocate(yourThreadPrivateArray(N))
!$omp end parallel
If that works, you can ask Intel support whether this is a bug.
Jim Dempsey
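A minimal sketch of that suggestion, assuming free-form source, OpenMP enabled at compile time, and hypothetical names (`track_mod`, `workIdx`, `expected`): every thread allocates its own copy first, and a later region uses COPYIN on arrays that already have matching bounds. Dynamic thread adjustment is switched off so the team size stays the same and the threadprivate data persists between the two regions.

```fortran
module track_mod
  implicit none
  ! Hypothetical stand-in for a threadprivate allocatable work array.
  integer, allocatable :: workIdx(:)
  !$omp threadprivate(workIdx)
end module track_mod

program alloc_per_thread_demo
  use omp_lib
  use track_mod
  implicit none
  integer, parameter :: n = 8
  integer :: expected(n), nBad, i

  ! Keep the team size fixed so threadprivate copies persist across regions.
  call omp_set_dynamic(.false.)

  ! Allocate every thread's copy, not only the master's.
  !$omp parallel
  allocate(workIdx(n))
  workIdx = 0
  !$omp end parallel

  ! Serial part: only the master's copy is filled here.
  do i = 1, n
    expected(i) = i
  end do
  workIdx = expected

  nBad = 0
  ! The copies already have the right bounds, so copyin only copies values.
  !$omp parallel copyin(workIdx) reduction(+:nBad)
  if (.not. allocated(workIdx)) then
    nBad = nBad + 1
  else if (size(workIdx) /= n) then
    nBad = nBad + 1
  else if (any(workIdx /= expected)) then
    nBad = nBad + 1
  end if
  deallocate(workIdx)
  !$omp end parallel

  if (nBad /= 0) error stop "a thread copy was missing or wrong"
  print *, "all thread copies allocated and filled"
end program alloc_per_thread_demo
```

Whether a given compiler version would also allocate the other threads' copies on its own during COPYIN is exactly the open question in the post above; this sketch avoids depending on it.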
Thank you for all your help. It was my "voxelIndicies" variable. I took it out of the COPYIN clause, then allocated and initialized it inside the parallel region, and it is now working.
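The shape of that fix can be sketched as below. This is not the actual phenx code: the module name `voxel_mod`, the array name `voxelIdx`, and the dimensions are hypothetical, and free-form source with OpenMP enabled is assumed. The threadprivate allocatable array is left out of any COPYIN clause, and each thread allocates and initializes its own copy inside the parallel region.

```fortran
module voxel_mod
  implicit none
  ! Hypothetical 2D index array, threadprivate like the one in the thread.
  integer, allocatable :: voxelIdx(:,:)
  !$omp threadprivate(voxelIdx)
end module voxel_mod

program init_in_region_demo
  use voxel_mod
  implicit none
  integer, parameter :: nDim = 3, nTrack = 100   ! assumed sizes
  integer :: nBad

  nBad = 0
  ! No copyin here: each thread builds its own copy from scratch.
  !$omp parallel reduction(+:nBad)
  allocate(voxelIdx(nDim, nTrack))
  voxelIdx = 0

  ! Placeholder for the real per-thread work on the array.
  voxelIdx(:, 1) = [1, 2, 3]

  if (any(shape(voxelIdx) /= [nDim, nTrack])) nBad = nBad + 1
  if (sum(voxelIdx) /= 6) nBad = nBad + 1
  deallocate(voxelIdx)
  !$omp end parallel

  if (nBad /= 0) error stop "per-thread array was not set up correctly"
  print *, "each thread allocated and initialized its own array"
end program init_in_region_demo
```

This approach only fits arrays whose initial contents can be rebuilt per thread; data that genuinely has to come from the master thread still needs COPYIN or an explicit copy from a shared variable.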
