
Automatic array in openMP parallel region

Ferdinand_T_
New Contributor II

Dear all,

Today I found that ifort 18 and 19 introduce a bug in my code related to automatic arrays (containing allocatable components) in an OpenMP parallel region. Aren't automatic arrays supposed to work and behave like "private" variables on the run-time stack?

Example:

module m
    implicit none
    type :: element
        integer, allocatable :: x
    end type
contains
    subroutine test(n)
        integer, intent(in) :: n
        type(element), dimension(n) :: t
    end subroutine
end module

program p
    use OMP_LIB
    use m
    implicit none
    integer :: i

    !$OMP PARALLEL DO NUM_THREADS(2)
    do i=1,99999
        call test(2 * omp_get_thread_num() + 1)
    end do
    !$OMP END PARALLEL DO
end program


Compiling:

ifort -qopenmp test.f90


Output: (with -trace -g options)

forrtl: severe (174): SIGSEGV, segmentation fault occurred
Image              PC                Routine            Line        Source             
a.out              0000000000413423  Unknown               Unknown  Unknown
libpthread-2.27.s  00007F5450D3E890  Unknown               Unknown  Unknown
libiomp5.so        00007F545107C72A  Unknown               Unknown  Unknown
libiomp5.so        00007F545107E431  Unknown               Unknown  Unknown
a.out              000000000042678D  Unknown               Unknown  Unknown
a.out              00000000004085A0  Unknown               Unknown  Unknown
a.out              0000000000408BFC  Unknown               Unknown  Unknown
a.out              00000000004032DD  m_mp_test_                 10  test.f90
a.out              0000000000403686  MAIN__                     21  test.f90
libiomp5.so        00007F54510389F3  __kmp_invoke_micr     Unknown  Unknown
libiomp5.so        00007F5450FF8F96  Unknown               Unknown  Unknown
libiomp5.so        00007F5450FFA88B  __kmp_fork_call       Unknown  Unknown
libiomp5.so        00007F5450FB9D60  __kmpc_fork_call      Unknown  Unknown
a.out              0000000000403449  MAIN__                     19  test.f90
a.out              0000000000402F42  Unknown               Unknown  Unknown
libc-2.27.so       00007F5450758B97  __libc_start_main     Unknown  Unknown
a.out              0000000000402E2A  Unknown               Unknown  Unknown


Details:

From a parallel construct, a subprogram is called which features an automatic array of derived types with allocatable component.
The size of the automatic array is determined by a dummy argument which is initialized with the result of an expression that differs among threads. The program crashes non-deterministically at run-time; the likelihood increases with larger automatic array sizes.

The problem appears with all optimization levels -O0 to -O3, and with or without -recursive and -automatic (actually implied by -qopenmp) options. The problem disappears with the -heap-arrays option. [EDIT: -heap-arrays fails with larger n]

The following versions are affected:

ifort -v      _OPENMP     system    crash?
19.0.3.199    201611      I+II      yes
18.0.5        201611      I         yes
17.0.6        201511      I         no
16.0.4        201307      I         no

System I:
OS:  Red Hat Enterprise Linux Server release 7.7 (Maipo)
CPU: Intel(R) Xeon(R) CPU E5-2670 0 @ 2.60GHz (2 sockets, 8 cores each)

System II:
OS:  Ubuntu 18.04.3 LTS
CPU: Intel(R) Core(TM)2 Quad CPU    Q9400  @ 2.66GHz (1 socket, 4 cores)


Kind regards
Ferdinand

jimdempseyatthecove
Honored Contributor III

The default behavior (IVF 19.0 and earlier) was -auto-scalar.

Causes scalar variables of intrinsic types INTEGER, REAL, COMPLEX, and LOGICAL that do not have the SAVE attribute to be allocated to the run-time stack

Arrays and user defined types are not included in that list.

You should use -auto

This option places local variables (scalars and arrays of all types), except those declared as SAVE, on the run-time stack. It is as if the variables were declared with the AUTOMATIC attribute.

Note, while the description also states:

It does not affect variables that have the SAVE attribute or ALLOCATABLE attribute...

That is referring to the allocated data and not the array descriptor. With -auto-scalar, (V19.0 and earlier) the array descriptors were SAVE (meaning shared).

The -openmp option was thought to imply a change from -auto-scalar to -auto, but this was not always the case.

Note, the default behavior changed with the newer language extensions. I am not sure when this occurred.

Try adding -auto as a compiler option.

Jim Dempsey

 

Ferdinand_T_
New Contributor II

Jim,

thank you for the explanation. I tried -auto as you suggested, but to no avail (in my original tests, I misspelled -automatic, actually).

What is really confusing me now is that, even though I need the array contents as well as the descriptor to be thread-private, compiling with -save prevents the crash (including when I declare the array "t" explicitly as AUTOMATIC).

Best regards
Ferdinand

Steve_Lionel
Honored Contributor III

Enabling OpenMP implies -auto.

jimdempseyatthecove
Honored Contributor III

Steve>>Enabling OpenMP implies -auto.

On the older compilers (starting with 7.nn), my direct experience was that it did not. I cannot recall when this was corrected. I was able to work around it by attributing with AUTOMATIC. At the time I did not experiment with -auto.

Also, the IVF documentation needs to be clarified as to what is and what is not automatic. V19.0 does not clarify this. IOW, it is ambiguous.

Ferdinand>> Do not compile your OpenMP programs with -save, and try not to attribute with AUTOMATIC. The newer compilers should have this corrected. Lack of a program crash is not indicative of correctness (that is, of reentrant calls each having independent local UDTs and/or arrays).

You need to find the cause of the crash while using -openmp. If you are savvy enough, you can examine the disassembly to confirm that the variable t is placed on the stack. Note, the LOC of SomeArray(1) is not the same as the location of the array descriptor. The descriptors need to be on the stack for procedures being reentered (when they are required to be private).
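
For readers who would rather not read disassembly, a rough alternative is to print the address of the local array from each thread. This is only a sketch under several assumptions: LOC is a compiler extension (ifort and gfortran both provide it), the names m_diag and p_diag are made up for illustration, and, as noted above, the address of t(1) says nothing about where a descriptor lives. Different addresses per thread merely suggest that each thread got its own storage for t.

```fortran
module m_diag
    implicit none
    type :: element
        integer, allocatable :: x
    end type
contains
    subroutine test(n)
        use omp_lib
        integer, intent(in) :: n
        type(element), dimension(n) :: t   ! automatic array, as in post #1

        ! Serialize output so lines from different threads do not interleave.
        !$omp critical
        print '(a,i0,a,z16.16)', 'thread ', omp_get_thread_num(), &
              ': loc(t(1)) = ', loc(t(1))   ! LOC is a non-standard extension
        !$omp end critical
    end subroutine
end module

program p_diag
    use omp_lib
    use m_diag
    implicit none

    ! One call per thread is enough to compare addresses.
    !$omp parallel num_threads(2)
    call test(2 * omp_get_thread_num() + 1)
    !$omp end parallel
end program
```

Compile it the same way as the reproducer (ifort -qopenmp). If the compiler bug is triggered, this program may of course crash as well.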

One of the old problems with IVF was with user defined types containing allocatables; in your case you have an array of UDTs with allocatables. The problem was with either the initializer (not initializing properly) and/or the finalizer (not finalizing properly). Your 19.0.3.199 version should have this old problem fixed, though your test example is a peculiar case where the allocatable is a scalar of INTEGER type (not an array of integers).

As a quick test:

Restore your program to that which fails (post #1) and verify it fails
Then, on line 4 where x is declared, make it x(:)

Then see if it still crashes. If this fixes your real program's problem, then decide how you wish to correct this.

Note, having an allocatable scalar, while legal, is inefficient. The smallest allocation unit from heap is at least two pointer sized words plus the data. There may be a smallest granularity involved. On a 32-bit application, the smallest header is 2x4 bytes, and smallest heap node may be 16 bytes. For 64-bit program 2x8 bytes, smallest heap node may be 32 bytes (16 bytes could theoretically be used for an allocation of size 0).

Jim Dempsey

Steve_Lionel
Honored Contributor III

7.nn is before my time. Starting with 8 (2004), that linkage has been there.

Ferdinand_T_
New Contributor II

Jim, thanks again:

jimdempseyatthecove (Blackbelt) wrote:

Do not compile your OpenMP programs with -save [...] Lack of program crash is not indicative of correctness [...]

Yes, clearly!

jimdempseyatthecove (Blackbelt) wrote:

If you are savvy enough, you can examine the disassembly [...]

I wish I could...

jimdempseyatthecove (Blackbelt) wrote:

As a quick test:

Restore your program to that which fails (post #1) and verify it fails
Then, on line 4 where x is declared, make it x(:)

Then see if crash.

The crash still occurred (with ifort 19.0.3.199) for all optimization levels, although at -O0 I had to increase n a little to trigger it.

The only fix I currently can think of is to manually identify all affected automatic arrays in the parallel region and make them ALLOCATABLE. Same for array-valued functions which are also affected.

Thanks for looking into this.

Ferdinand_T_
New Contributor II

To add to Jim's suspicion, there really seems to be something messed up with allocatable components in openMP since ifort 18 again.

Below is another example problem, which I managed to isolate and which forces me to enable optimizations (no more debug flags...):

module m
    implicit none
    type :: element
        integer, allocatable :: x
    end type
contains
    subroutine test()
        type(element), allocatable :: a ! or pointer
        allocate(a)
        deallocate(a)
    end subroutine
end module

program p
    use m
    implicit none
    integer :: nn

    !$OMP PARALLEL DO NUM_THREADS(2)
    do nn = 1,999999
        call test()
    end do
    !$OMP END PARALLEL DO
end program

In ifort 16 and 17 (version and system details as in original post) it ran perfectly, but 18 and 19 give me a crash when compiling with -O0:

$ ifort test.f90 -qopenmp -O0 && ./a.out
forrtl: severe (153): allocatable array or pointer is not allocated
Image              PC                Routine            Line        Source             
a.out              0000000000409273  Unknown               Unknown  Unknown
a.out              0000000000403DBE  Unknown               Unknown  Unknown
a.out              000000000040409E  Unknown               Unknown  Unknown
libiomp5.so        00002B77406139F3  __kmp_invoke_micr     Unknown  Unknown
libiomp5.so        00002B77405D3F96  Unknown               Unknown  Unknown
libiomp5.so        00002B77405D588B  __kmp_fork_call       Unknown  Unknown
libiomp5.so        00002B7740594D60  __kmpc_fork_call      Unknown  Unknown
a.out              0000000000403F09  Unknown               Unknown  Unknown
a.out              0000000000403BA2  Unknown               Unknown  Unknown
libc-2.17.so       00002B7740B59545  __libc_start_main     Unknown  Unknown
a.out              0000000000403AA9  Unknown     

With regards
Ferdinand

jimdempseyatthecove
Honored Contributor III

>>The only fix I currently can think of is to manually identify all affected automatic arrays in the parallel region and make them ALLOCATABLE

This is not what your problem in #1 is presenting (user defined type with allocatable scalar). If you can, please show a complete procedure that is misbehaving.

Jim Dempsey

Ferdinand_T_
New Contributor II

Jim, what I meant is just that, for example, line 09 in the original post (#1) could be replaced by

        type(element), dimension(:), allocatable :: t
        allocate(t(n))

as a workaround.
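
Spelled out as a complete program, the workaround might look like the sketch below. It is the reproducer from post #1 with only the declaration of t changed; the module name m_workaround is made up here. Whether this avoids the crash on every affected ifort version and optimization level is not established in this thread.

```fortran
module m_workaround
    implicit none
    type :: element
        integer, allocatable :: x
    end type
contains
    subroutine test(n)
        integer, intent(in) :: n
        ! Workaround: a local allocatable array instead of the
        ! automatic array "type(element), dimension(n) :: t".
        type(element), dimension(:), allocatable :: t

        allocate(t(n))
        ! ... work with t here ...
        ! No explicit deallocate is needed: an unsaved local allocatable
        ! (and its allocatable components) is deallocated on return.
    end subroutine
end module

program p
    use omp_lib
    use m_workaround
    implicit none
    integer :: i

    !$omp parallel do num_threads(2)
    do i = 1, 99999
        call test(2 * omp_get_thread_num() + 1)
    end do
    !$omp end parallel do
end program
```

The array size still differs between threads, as in the original, so the only thing that changes is how storage for t is obtained.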

jimdempseyatthecove
Honored Contributor III

OK

Note though, while each method should be valid, the (implicit) array element initialization may differ. The former is initialized in place (like placement new), while the latter is initialized as part of the allocate process. While both are expected to result in the same behavior, apparently they do not.

If you feel you have a suitable reproducer, please submit it as a bug report. Include the working way and the misbehaving way together, with sufficient information (comments in code) for the support people to observe the different behaviors.

Jim Dempsey
