Dear all,
Today I found that ifort 18 and 19 introduce a bug in my code related to automatic arrays (containing allocatable components) in an OpenMP parallel region. Aren't automatic arrays supposed to behave like "private" variables on the run-time stack?
Example:
module m
  implicit none
  type :: element
    integer, allocatable :: x
  end type
contains
  subroutine test(n)
    integer, intent(in) :: n
    type(element), dimension(n) :: t
  end subroutine
end module

program p
  use OMP_LIB
  use m
  implicit none
  integer :: i

  !$OMP PARALLEL DO NUM_THREADS(2)
  do i=1,99999
    call test(2 * omp_get_thread_num() + 1)
  end do
  !$OMP END PARALLEL DO
end program
Compiling:
ifort -qopenmp test.f90
Output: (with -trace -g options)
forrtl: severe (174): SIGSEGV, segmentation fault occurred
Image              PC                Routine            Line     Source
a.out              0000000000413423  Unknown            Unknown  Unknown
libpthread-2.27.s  00007F5450D3E890  Unknown            Unknown  Unknown
libiomp5.so        00007F545107C72A  Unknown            Unknown  Unknown
libiomp5.so        00007F545107E431  Unknown            Unknown  Unknown
a.out              000000000042678D  Unknown            Unknown  Unknown
a.out              00000000004085A0  Unknown            Unknown  Unknown
a.out              0000000000408BFC  Unknown            Unknown  Unknown
a.out              00000000004032DD  m_mp_test_         10       test.f90
a.out              0000000000403686  MAIN__             21       test.f90
libiomp5.so        00007F54510389F3  __kmp_invoke_micr  Unknown  Unknown
libiomp5.so        00007F5450FF8F96  Unknown            Unknown  Unknown
libiomp5.so        00007F5450FFA88B  __kmp_fork_call    Unknown  Unknown
libiomp5.so        00007F5450FB9D60  __kmpc_fork_call   Unknown  Unknown
a.out              0000000000403449  MAIN__             19       test.f90
a.out              0000000000402F42  Unknown            Unknown  Unknown
libc-2.27.so       00007F5450758B97  __libc_start_main  Unknown  Unknown
a.out              0000000000402E2A  Unknown            Unknown  Unknown
Details:
From a parallel construct, a subprogram is called which features an automatic array of derived types with allocatable component.
The size of the automatic array is determined by a dummy argument whose value comes from an expression that differs among threads. The program crashes non-deterministically at run-time; the likelihood increases with larger automatic array sizes.
The problem appears with all optimization levels -O0 to -O3, and with or without -recursive and -automatic (actually implied by -qopenmp) options. The problem disappears with the -heap-arrays option. [EDIT: -heap-arrays fails with larger n]
The following versions are affected:
ifort -v      _OPENMP   system   crash?
19.0.3.199    201611    I+II     yes
18.0.5        201611    I        yes
17.0.6        201511    I        no
16.0.4        201307    I        no
System I:
OS: Red Hat Enterprise Linux Server release 7.7 (Maipo)
CPU: Intel(R) Xeon(R) CPU E5-2670 0 @ 2.60GHz (2 sockets, 8 cores each)
System II:
OS: Ubuntu 18.04.3 LTS
CPU: Intel(R) Core(TM)2 Quad CPU Q9400 @ 2.66GHz (1 socket, 4 cores)
Kind regards
Ferdinand
The default behavior (IVF 19.0 and earlier) was -auto-scalar.
Causes scalar variables of intrinsic types INTEGER, REAL, COMPLEX, and LOGICAL that do not have the SAVE attribute to be allocated to the run-time stack
Arrays and user defined types are not included in that list.
You should use -auto
This option places local variables (scalars and arrays of all types), except those declared as SAVE, on the run-time stack. It is as if the variables were declared with the AUTOMATIC attribute.
Note, while the description also states:
It does not affect variables that have the SAVE attribute or ALLOCATABLE attribute...
That refers to the allocated data, not the array descriptor. With -auto-scalar (V19.0 and earlier), the array descriptors were SAVE (meaning shared).
The option -openmp was thought to imply a change from -auto-scalar to -auto, but this was not always the case.
Note, the default behavior changed with the newer language extensions. I am not sure when this occurred.
Try adding -auto as a compiler option.
Jim Dempsey
Jim,
thank you for the explanation. I tried -auto as you suggested, but to no avail (in my original tests, I misspelled -automatic, actually).
What is really confusing me now is that, even though I need the array contents as well as the descriptor to be thread-private, compiling with -save prevents the crash (including when I declare the array "t" explicitly as AUTOMATIC).
Best regards
Ferdinand
Enabling OpenMP implies -auto.
Steve>>Enabling OpenMP implies -auto.
On the older compilers (starting with 7.nn), my direct experience was that it did not. I cannot recall when this was corrected. I was able to work around it by attributing with AUTOMATIC. At the time I did not experiment with -auto.
Also, the IVF documentation needs to be clarified as to what is and what is not automatic. V19.0 does not clarify this. IOW, it is ambiguous.
Ferdinand>> Do not compile your OpenMP programs with -save, and try not to attribute with AUTOMATIC. The newer compilers should have this corrected. Lack of program crash is not indicative of correctness (with -save, reentrant calls no longer have independent local UDTs and/or arrays).
You need to find the cause of the crash while using -openmp. If you are savvy enough, you can examine the disassembly to confirm that the variable t is placed on the stack. Note, the LOC of SomeArray(1) is not the same as the location of the array descriptor. The descriptors need to be on the stack for procedures that are reentered (that is, whenever they are required to be private).
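For those who would rather not read disassembly, a rough run-time check is possible. The sketch below is not from the thread: the names m_diag, show_address and diag are made up for illustration, and loc() is a vendor extension (accepted by ifort and gfortran) rather than standard Fortran. It prints the address of the automatic array's element storage from each thread; distinct addresses suggest each thread has its own copy. It does not show where any compiler-internal descriptor lives, so it only complements the disassembly check.

```fortran
module m_diag
  use omp_lib
  implicit none
  type :: element
    integer, allocatable :: x
  end type
contains
  subroutine show_address(n)
    integer, intent(in) :: n
    type(element), dimension(n) :: t   ! automatic array under test
    ! Serialize the output so lines from different threads do not interleave.
    ! loc() is a non-standard extension returning the address of its argument.
    !$omp critical
    print '(a,i0,a,z16.16)', 'thread ', omp_get_thread_num(), &
          ': loc(t) = ', loc(t)
    !$omp end critical
  end subroutine
end module

program diag
  use m_diag
  implicit none
  ! Each of the two threads calls the routine once.
  !$omp parallel num_threads(2)
  call show_address(4)
  !$omp end parallel
end program
```

If both threads report the same address, the array storage is being shared (as it would be with -save); if the addresses differ, the element storage at least is per-thread.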
One of the old problems with IVF was with user defined types containing allocatables; in your case you have an array of UDTs with allocatables. The problem was with the initializer (not initializing properly) and/or the finalizer (not finalizing properly). Your 19.0.3.199 version should have this old problem fixed, though your test example is a peculiar case where the allocatable is a scalar of INTEGER type (not an array of integers).
As a quick test:
Restore your program to that which fails (post #1) and verify it fails
Then, on line 4 where x is declared, make it x(:)
Then see whether it still crashes. If this fixes the problem in your real program, decide how you wish to correct it.
Note, having an allocatable scalar, while legal, is inefficient. The smallest allocation unit from heap is at least two pointer sized words plus the data. There may be a smallest granularity involved. On a 32-bit application, the smallest header is 2x4 bytes, and smallest heap node may be 16 bytes. For 64-bit program 2x8 bytes, smallest heap node may be 32 bytes (16 bytes could theoretically be used for an allocation of size 0).
Jim Dempsey
7.nn is before my time. Starting with 8 (2004), that linkage has been there.
Jim, thanks again:
jimdempseyatthecove (Blackbelt) wrote: Do not compile your OpenMP programs with -save [...] Lack of program crash is not indicative of correctness [...]
Yes, clearly!
jimdempseyatthecove (Blackbelt) wrote: If you are savvy enough, you can examine the disassembly [...]
I wish I could...
jimdempseyatthecove (Blackbelt) wrote: As a quick test:
Restore your program to that which fails (post #1) and verify it fails
Then, on line 4 where x is declared, make it x(:). Then see if crash.
The crash still occurred (with ifort 19.0.3.199) for all optimization levels, although at -O0 I had to increase n a little to trigger it.
The only fix I currently can think of is to manually identify all affected automatic arrays in the parallel region and make them ALLOCATABLE. Same for array-valued functions which are also affected.
Thanks for looking into this.
To add to Jim's suspicion, there really seems to be something broken again with allocatable components in OpenMP since ifort 18.
Below is another example problem, which I managed to isolate and which forces me to enable optimizations (no more debug flags...):
module m
  implicit none
  type :: element
    integer, allocatable :: x
  end type
contains
  subroutine test()
    type(element), allocatable :: a ! or pointer
    allocate(a)
    deallocate(a)
  end subroutine
end module

program p
  use m
  implicit none
  integer :: nn
  !$OMP PARALLEL DO NUM_THREADS(2)
  do nn = 1,999999
    call test()
  end do
  !$OMP END PARALLEL DO
end program
In ifort 16 and 17 (version and system details as in original post) it ran perfectly, but 18 and 19 give me a crash when compiling with -O0:
$ ifort test.f90 -qopenmp -O0 && ./a.out
forrtl: severe (153): allocatable array or pointer is not allocated
Image              PC                Routine            Line     Source
a.out              0000000000409273  Unknown            Unknown  Unknown
a.out              0000000000403DBE  Unknown            Unknown  Unknown
a.out              000000000040409E  Unknown            Unknown  Unknown
libiomp5.so        00002B77406139F3  __kmp_invoke_micr  Unknown  Unknown
libiomp5.so        00002B77405D3F96  Unknown            Unknown  Unknown
libiomp5.so        00002B77405D588B  __kmp_fork_call    Unknown  Unknown
libiomp5.so        00002B7740594D60  __kmpc_fork_call   Unknown  Unknown
a.out              0000000000403F09  Unknown            Unknown  Unknown
a.out              0000000000403BA2  Unknown            Unknown  Unknown
libc-2.17.so       00002B7740B59545  __libc_start_main  Unknown  Unknown
a.out              0000000000403AA9  Unknown
With regards
Ferdinand
>>The only fix I currently can think of is to manually identify all affected automatic arrays in the parallel region and make them ALLOCATABLE
This is not what your problem in #1 is presenting (user defined type with allocatable scalar). If you can, please show a complete procedure that is misbehaving.
Jim Dempsey
Jim, what I meant is just that, for example, line 09 in the original post (#1) could be replaced by
type(element), dimension(:), allocatable :: t
allocate(t(n))
as a workaround.
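Spelled out in full, the workaround might look like the sketch below. It is the reproducer from post #1 with only the declaration of t changed; whether it avoids the crash on a given compiler version is what was reported here, not something the code itself guarantees.

```fortran
module m
  implicit none
  type :: element
    integer, allocatable :: x
  end type
contains
  subroutine test(n)
    integer, intent(in) :: n
    ! Workaround: an explicitly allocated array replaces the automatic
    ! array "type(element), dimension(n) :: t" from post #1.
    type(element), dimension(:), allocatable :: t
    allocate(t(n))
    ! No explicit deallocate is needed: an unsaved local allocatable
    ! is deallocated automatically when the subroutine returns.
  end subroutine
end module

program p
  use omp_lib
  use m
  implicit none
  integer :: i
  !$omp parallel do num_threads(2)
  do i = 1, 99999
    ! The requested size depends on the calling thread (1 or 3).
    call test(2 * omp_get_thread_num() + 1)
  end do
  !$omp end parallel do
end program
```

It is compiled as before with ifort -qopenmp test.f90. Keeping the automatic-array declaration in a comment next to the allocatable one makes it easy to switch between the failing and the working variant.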
OK
Note though, while each method should be valid, the (implicit) array element initialization may differ. The former is initialized in place (like placement new) and the latter goes through the post-allocate process. While both are expected to result in the same behavior, apparently they do not.
If you feel you have a suitable reproducer, please submit it as a bug report. Include the working way and the misbehaving way together with sufficient information (comments in code) for the support people to observe the different behaviors.
Jim Dempsey
