Why oh why does my OMP program die?

IanH
I've been experimenting with OMP lately, but recent efforts have hit a bit of a brick wall, and I would appreciate any advice.

The following reduced example crashes with an access violation when compiled with 11.1.38 with /check:all /warn:all /Qopenmp. It strikes me as strange, because the program doesn't actually do very much. What have I missed?

(It works "as expected" using 10.1 and 11.0, or with no /check:all, or if I compile while facing north, or...)

Thanks,

IanH

[cpp]MODULE AMod
  IMPLICIT NONE
  INTEGER, PARAMETER :: na = 10
CONTAINS 
  SUBROUTINE ASub(n1, n2, n3, n4, array_in)
    INTEGER, INTENT(IN) :: n1, n2, n3, n4
    REAL, INTENT(IN) :: array_in(na,n4,n3,n2,n1)
    ! Structure indices          
    INTEGER i1,i2,i3,i4    
    REAL slice(na)
    !***************************************************************************
    write (*,*) 'start'
    !$OMP PARALLEL DO NUM_THREADS(1), DEFAULT(NONE),  &
    !$OMP     PRIVATE(i2,i3,i4,slice),  &
    !$OMP     SHARED(n1, n2, n3, n4, array_in)
    Loop1: DO i1 = 1, n1
      Loop2: DO i2 = 1, n2
        Loop3: DO i3 = 1, n3
          Loop4: DO i4 = 1, n4
            slice = array_in(:,i4,i3,i2,i1)
            CALL BSub(slice, array_in(:,1,1,1,i1))
          END DO Loop4
        END DO Loop3
      END DO Loop2
    END DO Loop1
    WRITE (*,*) 'finish'
  END SUBROUTINE ASub
  
  SUBROUTINE BSub(array1, array2)
    ! Arguments
    REAL, INTENT(IN) :: array1(:)   
    REAL, INTENT(IN) :: array2(:) 
    !***************************************************************************
  END SUBROUTINE BSub
END MODULE AMod

PROGRAM omp
  USE AMod
  IMPLICIT NONE
  INTEGER n1, n2, n3, n4
  REAL, ALLOCATABLE :: in(:,:,:,:,:)
  n1 = 1; n2 = 2; n3 = 3; n4 = 4
  ALLOCATE(in(na,n4,n3,n2,n1))
  in = 0.0
  CALL ASub(n1, n2, n3, n4, in)
END PROGRAM omp[/cpp]
jimdempseyatthecove

Try this
...
CALL BSub(na, slice, array_in(:,1,1,1,i1))
...
SUBROUTINE BSub(n,array1, array2)
! Arguments
INTEGER :: n
REAL, INTENT(IN) :: array1(n)
REAL, INTENT(IN) :: array2(n)

or try this

CALL BSub(slice, array_in(:,1,1,1,i1)) ! as in original sample code
...
SUBROUTINE BSub(array1, array2)
! Arguments
REAL, INTENT(IN) :: array1(:)
REAL, INTENT(IN) :: array2(:,:,:,:,:) ! and fixup references in your code

My assumption is that the original code was creating the rank-1 array descriptor either

a) as static (SAVE) storage, rather than on the stack within the scope of the parallel region, or
b) on the stack in a scope _prior to_ that of the parallel region, which makes the descriptor shared.

The first suggestion may run faster.

Jim Dempsey
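
For reference, here is a minimal sketch of the first suggestion applied to the whole of the original sample, so it can be compiled as one file. It assumes nothing beyond the posted code: BSub takes the extent n and declares explicit-shape dummies, and the call passes na. The program name omp_explicit is only illustrative, and whether this avoids the crash on a given compiler version is something to check locally.

```fortran
MODULE AMod
  IMPLICIT NONE
  INTEGER, PARAMETER :: na = 10
CONTAINS
  SUBROUTINE ASub(n1, n2, n3, n4, array_in)
    INTEGER, INTENT(IN) :: n1, n2, n3, n4
    REAL, INTENT(IN) :: array_in(na,n4,n3,n2,n1)
    INTEGER :: i1, i2, i3, i4
    REAL :: slice(na)
    WRITE (*,*) 'start'
    !$OMP PARALLEL DO NUM_THREADS(1), DEFAULT(NONE),  &
    !$OMP     PRIVATE(i2,i3,i4,slice),  &
    !$OMP     SHARED(n1, n2, n3, n4, array_in)
    Loop1: DO i1 = 1, n1
      Loop2: DO i2 = 1, n2
        Loop3: DO i3 = 1, n3
          Loop4: DO i4 = 1, n4
            slice = array_in(:,i4,i3,i2,i1)
            ! The extent is passed explicitly, so BSub's dummies are
            ! explicit-shape and no assumed-shape descriptor is required.
            CALL BSub(na, slice, array_in(:,1,1,1,i1))
          END DO Loop4
        END DO Loop3
      END DO Loop2
    END DO Loop1
    !$OMP END PARALLEL DO
    WRITE (*,*) 'finish'
  END SUBROUTINE ASub

  SUBROUTINE BSub(n, array1, array2)
    ! Arguments
    INTEGER, INTENT(IN) :: n
    REAL, INTENT(IN) :: array1(n)
    REAL, INTENT(IN) :: array2(n)
    ! Body intentionally empty, as in the original sample.
  END SUBROUTINE BSub
END MODULE AMod

PROGRAM omp_explicit
  USE AMod
  IMPLICIT NONE
  INTEGER :: n1, n2, n3, n4
  REAL, ALLOCATABLE :: in(:,:,:,:,:)
  n1 = 1; n2 = 2; n3 = 3; n4 = 4
  ALLOCATE(in(na,n4,n3,n2,n1))
  in = 0.0
  CALL ASub(n1, n2, n3, n4, in)
END PROGRAM omp_explicit
```

Compile with the same options as the original (/check:all /warn:all /Qopenmp) to compare behaviour; /warn:all will likely flag BSub's unused dummy arguments, which is harmless here.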



bmchenry

A quick look reveals ALLOCATE(in(na,n4,n3,n2,n1)), where 'na' is undefined.

Oops! On a second check I see it's defined in the module.
IanH
Thanks muchly - both those solutions work well. The descriptor explanation also nicely explains some other problems that I was having.

What is your understanding: should passing descriptors as in the original code work? Or, for the code to be correct under OpenMP, must I always use one of the two alternatives you posted when I'm passing arrays around?

IanH

jimdempseyatthecove

IanH,

I am glad both methods worked (didn't try them myself :~).

There is nothing wrong with passing descriptors, as long as the descriptor that gets passed is the one you intended to pass.

In your original example you were passing a rank-1 section of a rank-5 array to a subroutine that required a descriptor for a rank-1 array. The compiler had to generate that rank-1 array descriptor, and it chose to place it in stack (or static) storage outside the scope of the parallel region. IMHO this is a bug: because you had DEFAULT(NONE), the compiler should have errored out.

Had you had DEFAULT(SHARED), it would raise an interesting paradox (for you): the new rank-1 array descriptor would implicitly be shared, yet it is auto-generated using indices that are PRIVATE. The intentions are not so clear when viewed in this respect.

Because of this problem, I suspect there may also be a problem with TRANSFER, assuming it is legal to have SomeFunc(TRANSFER(source,mold[,size])). In other words, search your solution for TRANSFER inside parallel regions.

In the two solutions I proposed, the creation of the temporary rank 1 array descriptor is eliminated.

Passing the extent in, for rank 1 arrays, will (in this example) generate faster code. But this looks F77-ish (regardless of being more efficient).

Jim Dempsey
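
The second alternative leaves "fixup references" open. One possible reading, sketched below, is to pass the whole rank-5 array and do the indexing inside BSub. The extra i1 argument, the diff variable, and the program name omp_whole_array are assumptions made for illustration and are not from the posts above. Whether this sidesteps the descriptor problem on a given compiler version needs to be verified locally.

```fortran
MODULE AMod
  IMPLICIT NONE
  INTEGER, PARAMETER :: na = 10
CONTAINS
  SUBROUTINE ASub(n1, n2, n3, n4, array_in)
    INTEGER, INTENT(IN) :: n1, n2, n3, n4
    REAL, INTENT(IN) :: array_in(na,n4,n3,n2,n1)
    INTEGER :: i1, i2, i3, i4
    REAL :: slice(na)
    WRITE (*,*) 'start'
    !$OMP PARALLEL DO NUM_THREADS(1), DEFAULT(NONE),  &
    !$OMP     PRIVATE(i2,i3,i4,slice),  &
    !$OMP     SHARED(n1, n2, n3, n4, array_in)
    DO i1 = 1, n1
      DO i2 = 1, n2
        DO i3 = 1, n3
          DO i4 = 1, n4
            slice = array_in(:,i4,i3,i2,i1)
            ! Pass the whole array plus the index (hypothetical extra
            ! argument) instead of a rank-1 section built from i1.
            CALL BSub(slice, array_in, i1)
          END DO
        END DO
      END DO
    END DO
    !$OMP END PARALLEL DO
    WRITE (*,*) 'finish'
  END SUBROUTINE ASub

  SUBROUTINE BSub(array1, array2, i1)
    ! Arguments
    REAL, INTENT(IN) :: array1(:)
    REAL, INTENT(IN) :: array2(:,:,:,:,:)
    INTEGER, INTENT(IN) :: i1
    REAL :: diff
    ! References that used array2(:) now index the rank-5 array directly.
    diff = SUM(array1 - array2(:,1,1,1,i1))
  END SUBROUTINE BSub
END MODULE AMod

PROGRAM omp_whole_array
  USE AMod
  IMPLICIT NONE
  INTEGER :: n1, n2, n3, n4
  REAL, ALLOCATABLE :: in(:,:,:,:,:)
  n1 = 1; n2 = 2; n3 = 3; n4 = 4
  ALLOCATE(in(na,n4,n3,n2,n1))
  in = 0.0
  CALL ASub(n1, n2, n3, n4, in)
END PROGRAM omp_whole_array
```

This keeps the F90-style assumed-shape interface, at the cost of BSub having to know which slice of the larger array it is meant to read.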

