Intel® Fortran Compiler
Build applications that can scale for the future with optimized code designed for Intel® Xeon® and compatible processors.

OpenMP Declare SIMD alignment

M___Nils

I want to create a vectorized (SIMD) version of a procedure. Here is a minimal working example:

SUBROUTINE simd_alignment(x, y)
!$OMP declare simd (simd_alignment) processor(skylake_avx512) linear(ref(x, y)) aligned(x, y: 64)

  REAL, INTENT(IN) :: x
  REAL, INTENT(OUT) :: y

  y = (x-1.0) * 10.0
END SUBROUTINE simd_alignment

Compiling this with the flags:

-qopenmp-simd -align array64byte -O3 -no-prec-div -xCORE-AVX512 -mtune=skylake -g -traceback -fpp -qopt-report=5 -qopt-report-phase=vec,loop -ip -heap-arrays -r8 -convert big_endian

results in the following optimization report:

Begin optimization report for: EQUIL::SIMD_ALIGNMENT..ZN16R8R8

    Report from: Loop nest & Vector optimizations [loop, vec]

remark #15389: vectorization support: reference y has unaligned access   [ equil.F90(1169,3) ]
remark #15389: vectorization support: reference x has unaligned access   [ equil.F90(1169,3) ]
remark #15381: vectorization support: unaligned access used inside loop body
remark #15347: FUNCTION WAS VECTORIZED with zmm, simdlen=16, unmasked, formal parameter types: (linear_ref:8,linear_ref:8) 
remark #15305: vectorization support: vector length 16
remark #15450: unmasked unaligned unit stride loads: 1 
remark #15451: unmasked unaligned unit stride stores: 1 
===========================================================================

Begin optimization report for: EQUIL::SIMD_ALIGNMENT..ZM16R8R8

    Report from: Loop nest & Vector optimizations [loop, vec]

remark #15389: vectorization support: reference y has unaligned access   [ equil.F90(1169,3) ]
remark #15389: vectorization support: reference x has unaligned access   [ equil.F90(1169,3) ]
remark #15381: vectorization support: unaligned access used inside loop body
remark #15347: FUNCTION WAS VECTORIZED with zmm, simdlen=16, masked, formal parameter types: (linear_ref:8,linear_ref:8) 
remark #15305: vectorization support: vector length 16
remark #15456: masked unaligned unit stride loads: 1 
remark #15457: masked unaligned unit stride stores: 1 
===========================================================================
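For context, the two variants reported above (unmasked and masked) are the ones a caller's SIMD loop would invoke, and with linear(ref(...)) arguments the addresses seen inside the variant come from the caller's arrays. A minimal sketch of such a call site follows. The names simd_call_demo, scale_shift and apply_all are hypothetical and not part of my code, and the processor and aligned clauses are left out so that the sketch does not depend on a particular compiler.

```fortran
module simd_call_demo
  implicit none
contains

  ! Same arithmetic as the MWE; only the linear(ref(...)) clause is kept.
  subroutine scale_shift(x, y)
    !$omp declare simd(scale_shift) linear(ref(x, y))
    real, intent(in)  :: x
    real, intent(out) :: y
    y = (x - 1.0) * 10.0
  end subroutine scale_shift

  ! Hypothetical caller: consecutive elements of a and b are passed by
  ! reference, which is the unit-stride pattern linear(ref(...)) describes.
  subroutine apply_all(n, a, b)
    integer, intent(in) :: n
    real, intent(in)    :: a(n)
    real, intent(out)   :: b(n)
    integer :: i
    !$omp simd
    do i = 1, n
      call scale_shift(a(i), b(i))
    end do
  end subroutine apply_all

end module simd_call_demo
```

Without an OpenMP SIMD option the directives are treated as comments and the loop simply runs the scalar version, so the results are the same either way.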

How can I get rid of the unaligned accesses? For the aligned clause of the declare simd directive, the OpenMP standard says:

The type of list items appearing in the aligned clause must be C_PTR or Cray pointer, or the
list item must have the POINTER or ALLOCATABLE attribute.

This restriction doesn't make sense in my case, because x and y are plain scalar REAL dummy arguments.

I know that I can use a function in combination with bind(C) and the value attribute, but I need to be able to return multiple values.
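A minimal sketch of that bind(C) plus value workaround, for reference. The names simd_value_demo and scale_shift_val are hypothetical, and real(c_float) is an assumption made for the sketch (with -r8 the MWE effectively works on 8-byte reals). Since x is passed by value, the vector variant should receive its input as a vector argument rather than through a memory reference, so no aligned clause is involved, but only a single result can be returned this way.

```fortran
module simd_value_demo
  use, intrinsic :: iso_c_binding, only: c_float
  implicit none
contains

  ! Function form of the MWE: the argument is passed by value and the
  ! single result is returned as the function value.
  function scale_shift_val(x) result(y) bind(C, name="scale_shift_val")
    !$omp declare simd(scale_shift_val)
    real(c_float), value :: x
    real(c_float)        :: y
    y = (x - 1.0_c_float) * 10.0_c_float
  end function scale_shift_val

end module simd_value_demo
```

A caller would use it as b(i) = scale_shift_val(a(i)) inside an !$omp simd loop.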
