Intel® Fortran Compiler

-xmic-avx512 compiler flag disables alignment

dogunter

I'm using Intel Fortran 18.0.1.163 on a system with KNL CPUs on the compute nodes. Here is my test code:

module align_test_module

  implicit none


  contains

    subroutine do_something(nVertLevels)
    integer, pointer :: nVertLevels
    integer :: i
    real, dimension(:), allocatable :: uTemp

    allocate(uTemp(nVertLevels))

    !$omp simd aligned(uTemp:64)
    do i = 1, nVertLevels
      uTemp(i) = 0.0
    end do
    end subroutine do_something

end module align_test_module

And here is my compile line:

$ ifort -c align_test.f90 -convert big_endian -FR -xmic-avx512 -fimf-use-svml -qopt-report-phase=vec -qopt-report=5 -align array64byte

And here is the optimization report (optrpt) output:

Intel(R) Advisor can now assist with vectorization and show optimization
  report messages with your source code.
See "https://software.intel.com/en-us/intel-advisor-xe" for details.

Intel(R) Fortran Intel(R) 64 Compiler for applications running on Intel(R) 64, Version 18.0.1.163 Build 20171018

Compiler options: -c -convert big_endian -FR -xmic-avx512 -fimf-use-svml -qopt-report-phase=vec -qopt-report=5 -align array64byte

Begin optimization report for: ALIGN_TEST_MODULE::DO_SOMETHING

    Report from: Vector optimizations [vec]


LOOP BEGIN at align_test.f90(17,5)
   remark #15542: loop was not vectorized: inner loop was already vectorized

   LOOP BEGIN at align_test.f90(18,7)
      remark #15542: loop was not vectorized: inner loop was already vectorized

      LOOP BEGIN at align_test.f90(18,7)
         remark #15389: vectorization support: reference UTEMP(:) has unaligned access
         remark #15381: vectorization support: unaligned access used inside loop body
         remark #15305: vectorization support: vector length 16
         remark #15309: vectorization support: normalized vectorization overhead 0.600
         remark #15300: LOOP WAS VECTORIZED
         remark #15451: unmasked unaligned unit stride stores: 1 
         remark #15475: --- begin vector cost summary ---
         remark #15476: scalar cost: 3 
         remark #15477: vector cost: 0.310 
         remark #15478: estimated potential speedup: 4.000 
         remark #15488: --- end vector cost summary ---
      LOOP END

      LOOP BEGIN at align_test.f90(18,7)
      <Remainder loop for vectorization>
         remark #15389: vectorization support: reference UTEMP(:) has unaligned access
         remark #15381: vectorization support: unaligned access used inside loop body
         remark #15305: vectorization support: vector length 8
         remark #15309: vectorization support: normalized vectorization overhead 1.250
         remark #15301: REMAINDER LOOP WAS VECTORIZED
      LOOP END
   LOOP END
LOOP END
===========================================================================

If I remove the -xmic-avx512 flag, the "unaligned access" remarks go away, but then I no longer have a binary that is optimized for KNL.
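
As a cross-check that does not depend on the optimization report, a minimal sketch like the one below could print whether an allocated array really starts on a 64-byte boundary at run time. The program name check_alignment and the array size of 100 are hypothetical, chosen only for illustration; the program would need to be built with the same flags as above for the result to be comparable. It only reports the run-time address, not what the vectorizer assumed at compile time.

```fortran
program check_alignment
  use, intrinsic :: iso_c_binding, only: c_loc, c_intptr_t
  implicit none

  ! "target" is required so that c_loc may take the array's address.
  real, dimension(:), allocatable, target :: uTemp
  integer(c_intptr_t) :: addr

  ! Arbitrary size, for illustration only.
  allocate(uTemp(100))

  ! Reinterpret the C pointer to the first element as an integer address.
  addr = transfer(c_loc(uTemp), addr)

  ! A result of 0 means the array starts on a 64-byte boundary.
  print '(a,i0)', 'address mod 64 = ', mod(addr, 64_c_intptr_t)
end program check_alignment
```

Running this once with and once without -xmic-avx512 would show whether the flag changes the actual allocation alignment or only what the report says about it.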
