Intel® Fortran Compiler
Build applications that can scale for the future with optimized code designed for Intel® Xeon® and compatible processors.

workaround for using derived type with allocatable components within OpenMP regions

Crni_G_
Beginner

I'm working on a CFD code that operates on several large 3D arrays.  Coming from a C background, I decided to encapsulate these arrays in a derived type:

type variables
  integer :: nx, ny, nz ! Grid size.
  real, allocatable, dimension(:, :, :) :: p_1 ! Pressure at t = t.
  real, allocatable, dimension(:, :, :) :: p_2 ! Pressure at t = t + dt.
  ! etc.
end type

At the end of each simulation time step, I'm updating arrays to prepare for the next time step:

type(variables) :: vars
! ...
!$OMP PARALLEL WORKSHARE
  vars%p_1 = vars%p_2
  ! etc.
!$OMP END PARALLEL WORKSHARE

That worked fine with GNU Fortran (version 4.4.7), but with Intel Fortran (version 13.0.1 20121010) the code crashes inside the OpenMP region.  Looking further into the issue, I learned that using derived types with allocatable components is unsupported with OpenMP, and I am now looking into how to work around that.  I would still prefer to keep the arrays encapsulated in this derived type.  I've noticed that the crash is avoided if I move the OpenMP region above into a subroutine:

subroutine update(nx, ny, nz, p_1, p_2)
  integer, intent(in) :: nx, ny, nz
  real, dimension(nx, ny, nz), intent(out) :: p_1
  real, dimension(nx, ny, nz), intent(in) :: p_2

!$OMP PARALLEL WORKSHARE
  p_1 = p_2
!$OMP END PARALLEL WORKSHARE
end subroutine

and if I just pass arrays from derived type into this subroutine:

  call update(nx, ny, nz, vars%p_1, vars%p_2)
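For reference, the whole workaround in one place looks roughly like this (the grid sizes and the module/program names are placeholders of my own, not part of the real code):

```fortran
module vars_mod
  implicit none

  type variables
    integer :: nx, ny, nz                          ! Grid size.
    real, allocatable, dimension(:, :, :) :: p_1   ! Pressure at t = t.
    real, allocatable, dimension(:, :, :) :: p_2   ! Pressure at t = t + dt.
  end type

contains

  ! The OpenMP region sees only plain array dummies, so no
  ! allocatable component appears inside the parallel construct.
  subroutine update(nx, ny, nz, p_1, p_2)
    integer, intent(in) :: nx, ny, nz
    real, dimension(nx, ny, nz), intent(out) :: p_1
    real, dimension(nx, ny, nz), intent(in) :: p_2

!$OMP PARALLEL WORKSHARE
    p_1 = p_2
!$OMP END PARALLEL WORKSHARE
  end subroutine

end module

program demo
  use vars_mod
  implicit none
  type(variables) :: vars

  vars%nx = 64; vars%ny = 64; vars%nz = 64       ! Placeholder grid size.
  allocate(vars%p_1(vars%nx, vars%ny, vars%nz))
  allocate(vars%p_2(vars%nx, vars%ny, vars%nz))
  vars%p_2 = 1.0

  ! Pass the components as plain array arguments.
  call update(vars%nx, vars%ny, vars%nz, vars%p_1, vars%p_2)
end program
```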

It's a somewhat cumbersome approach, but I'm OK with it.  My question is: is this code now valid, or is there any undefined behavior left here?

Thanks.

4 Replies
jimdempseyatthecove
Honored Contributor III

You might find the following to be a bit more efficient:

subroutine update(nxyz, p_1, p_2)
  integer, intent(in) :: nxyz
  real, dimension(*), intent(out) :: p_1
  real, dimension(*), intent(in) :: p_2

!$OMP PARALLEL WORKSHARE
  p_1(1:nxyz) = p_2(1:nxyz)
!$OMP END PARALLEL WORKSHARE
end subroutine

...

call update(nx*ny*nz, vars%p_1, vars%p_2)

Jim Dempsey

Crni_G_
Beginner

Thanks for the reply and for the performance hint, but do you think the approach I tried to explain in my first message is appropriate for avoiding the issues between OpenMP and allocatable components of derived types?

Thanks.

jimdempseyatthecove
Honored Contributor III

IMHO, the code in your original post should have worked, since vars is shared by default. I suspect this was a compiler bug.

My suggested code is often more efficient, because the optimizer does not necessarily perform the equivalent of unroll(3) on the multi-dimensional copy. V16 update 1 still has code-efficiency issues with otherwise well-vectorized ArrayOut = ArrayIn assignments where the two arrays are multi-dimensional. For example, on Xeon Phi (V16 update 1) I've seen

real(8) :: Array(6,6) ! subroutine local array
...
Array = 0.0_8

producing quite inefficient code. For an allocatable array (known to be contiguous), passing a contiguous multi-dimensional slice (or the whole array) to a dimension(*) dummy as in #2 can yield a significant performance gain for the given operation. If performance is of lesser concern than program elegance, then continue to use vars%p_1 = vars%p_2 without the subroutine... and wait for the compiler optimization to catch up.
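If the copy itself dominates and the old contents of p_1 are not needed afterwards, another option is to swap the two allocations with move_alloc instead of copying. A sketch, assuming both components are allocated with the same shape:

```fortran
real, allocatable :: tmp(:, :, :)

! Three move_alloc calls exchange the two allocations; no data is copied.
call move_alloc(vars%p_1, tmp)
call move_alloc(vars%p_2, vars%p_1)
call move_alloc(tmp, vars%p_2)
```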

Jim Dempsey

Crni_G_
Beginner

Great, thanks again for your replies!
