Intel® Fortran Compiler

Ifort array alignment problem

unrue
Beginner
1,519 Views

Dear Intel Fortran developers, 

 

I'm using ifort 15.0.1 to build my code, and I have a problem with array alignment. This is a piece of my code, taken from a subroutine:

 

real(stnd), dimension(:), allocatable :: dsr_sel
!DIR$ ATTRIBUTES ALIGN: 32 :: dsr_sel


allocate(dsr_sel(ntracesM))

s = 1
e = ntracesM

dsr_sel(s:e) =  off_co(s:e)*off_co(s:e)*inv_off2 +  &
                       (  x_co(s:e)-x0cur )*( x_co(s:e)-x0cur )*inv_ap2



where x_co and off_co are arrays defined in a module and allocated elsewhere, declared with !DIR$ ATTRIBUTES ALIGN : 32 :: x_co, off_co. The problem is that the optimization report gives me:

LOOP BEGIN at project/src/test.f90(646,8)
<Peeled>
LOOP END

LOOP BEGIN at project/src/test.f90(646,8)
   remark #15389: vectorization support: reference DSR_SEL has unaligned access
   remark #15389: vectorization support: reference test_mp_off_co_ has unaligned access
   remark #15389: vectorization support: reference test_mp_off_co_ has unaligned access
   remark #15389: vectorization support: reference test_mp_x_co_ has unaligned access
   remark #15389: vectorization support: reference test_mp_x_co_ has unaligned access
   remark #15381: vectorization support: unaligned access used inside loop body
   remark #15399: vectorization support: unroll factor set to 2
   remark #15300: LOOP WAS VECTORIZED
   remark #15442: entire loop may be executed in remainder
   remark #15450: unmasked unaligned unit stride loads: 2
   remark #15451: unmasked unaligned unit stride stores: 1
   remark #15475: --- begin vector loop cost summary ---
   remark #15476: scalar loop cost: 20
   remark #15477: vector loop cost: 5.000
   remark #15478: estimated potential speedup: 6.120
   remark #15479: lightweight vector operations: 17
   remark #15488: --- end vector loop cost summary ---
LOOP END

LOOP BEGIN at project/src/test.f90(646,8)
<Remainder>
   remark #15389: vectorization support: reference DSR_SEL has unaligned access
   remark #15389: vectorization support: reference test_mp_off_co_ has unaligned access
   remark #15389: vectorization support: reference test_mp_off_co_ has unaligned access
   remark #15389: vectorization support: reference test_mp_x_co_ has unaligned access
   remark #15389: vectorization support: reference test_mp_x_co_ has unaligned access
   remark #15381: vectorization support: unaligned access used inside loop body
   remark #15301: REMAINDER LOOP WAS VECTORIZED
LOOP END

I don't understand why these three arrays have unaligned accesses. I compile with the -xHost and -O3 flags on an Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz. I also tried -align array32byte, with no change.

Thanks.

13 Replies
Lorri_M_Intel
Employee

There is very little to go on here without a compilable program.

What is "stnd" declared as?  What are the declarations of x_co and off_co?

               --Lorri

unrue
Beginner

Hi Lorri,

"stnd" is simply the REAL(4) kind. As for the second question:

 

real(stnd),    dimension( : ), allocatable :: off_co, x_co

 

 

jimdempseyatthecove
Honored Contributor III

Are those arrays attributed for alignment? (32 bytes)

Jim Dempsey

unrue
Beginner

jimdempseyatthecove wrote:

Are those arrays attributed for alignment? (32 bytes)

Jim Dempsey

 

Yes, I used  !DIR$ ATTRIBUTES ALIGN : 32 :: x_co, off_co

Xiaoping_D_Intel
Employee

The compiler can't generate aligned loads for allocatable arrays or pointers declared in another module with the 'ALIGN' attribute, because it doesn't know the lower bound, which is dynamic.

The 17.0 compiler, which is still in beta, offers a solution: use the "ASSUME_ALIGNED" directive before the loop that uses the array or pointer to specify which array element is aligned:

!dir$ assume_aligned x_co(1):32, off_co(1):32

To enroll in the 17.0 beta, please refer to the first sticky post in this forum.

Thanks,

Xiaoping
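
As a rough illustration of the suggestion above, here is a minimal sketch that applies the directive to the loop from the original question. It assumes a compiler that accepts the array-element form of ASSUME_ALIGNED (17.0 or later, per the post above); other compilers treat the directive lines as comments. The module name test_data, the fixed size, the fill values, and the extension of the directive to dsr_sel are all assumptions made for the example and do not come from the original code.

```fortran
module test_data
  implicit none
  integer, parameter :: stnd = kind(1.0)   ! assumption: default (4-byte) REAL
  real(stnd), dimension(:), allocatable :: off_co, x_co
  !DIR$ ATTRIBUTES ALIGN : 32 :: off_co, x_co
end module test_data

program assume_aligned_sketch
  use test_data
  implicit none
  integer, parameter :: ntracesM = 1024    ! hypothetical size
  real(stnd), dimension(:), allocatable :: dsr_sel
  !DIR$ ATTRIBUTES ALIGN : 32 :: dsr_sel
  real(stnd) :: inv_off2, inv_ap2, x0cur   ! hypothetical values below
  integer :: s, e

  allocate(off_co(ntracesM), x_co(ntracesM), dsr_sel(ntracesM))
  off_co   = 1.0_stnd
  x_co     = 2.0_stnd
  inv_off2 = 0.5_stnd
  inv_ap2  = 0.25_stnd
  x0cur    = 1.0_stnd
  s = 1
  e = ntracesM

  ! Tell the compiler that element 1 of each array sits on a 32-byte boundary.
  ! This is a promise, not a check: if it is false, behavior is undefined.
  !DIR$ ASSUME_ALIGNED x_co(1):32, off_co(1):32, dsr_sel(1):32
  dsr_sel(s:e) = off_co(s:e)*off_co(s:e)*inv_off2 + &
                 (x_co(s:e) - x0cur)*(x_co(s:e) - x0cur)*inv_ap2

  print *, dsr_sel(s), dsr_sel(e)
end program assume_aligned_sketch
```

Whether the "unaligned access" remarks actually disappear would need to be confirmed by regenerating the optimization report with the compiler version in use.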

 

 

 

unrue
Beginner

I read this article:

https://software.intel.com/en-us/articles/data-alignment-to-assist-vectorization

which explains that it is possible to align allocatable arrays in modules. Maybe I misunderstood?

TimP
Honored Contributor III

That article explains that static arrays may be recognized as aligned by the method you have chosen, but not allocatable.

I must check whether -align array32byte has any useful effect in a similar situation, so it's helpful to me that you brought it up.

jimdempseyatthecove
Honored Contributor III

xiaoping-duan (Intel) wrote:

The compiler can't generate aligned loads for allocatable arrays or pointers declared in another module with the 'ALIGN' attribute, because it doesn't know the lower bound, which is dynamic.

Wouldn't the lower bound, whatever it is, always be aligned?...

Assume the array descriptor is passed to a subroutine that declares the dummy as allocatable without the alignment requirement, and that subroutine then allocates or resizes the array. However, such a declaration would also require an explicit interface, and interface checking should catch the discrepancy between the actual and dummy arguments.

I can see an issue with a pointer, but not with an allocatable array.

The compiler may (some day) permit a pointer to be declared with an alignment requirement, and then assert that each => is aligned. Most of this could be done at compile time, but where that is not possible a runtime check could be made.

Jim Dempsey

Xiaoping_D_Intel
Employee

The lower bound isn't always aligned if an array gets reallocated or a pointer gets re-associated.

Thanks,

Xiaoping

jimdempseyatthecove
Honored Contributor III

Pardon me...

module mod
  real, dimension(:), allocatable :: array
  !DIR$ ATTRIBUTES ALIGN: 32 :: array
end module mod

program foo
  use mod
  real :: bigger(123)
  !...
  allocate(array(-1:8)) ! array(-1) is aligned
  !...
  array = bigger ! array now is (1:123)
  ! array(1) should be aligned if auto-realloc obeys the attribute
end program foo

Are you saying the auto-reallocation will not obey the alignment attributed to the array???
I cannot fathom that the attribute would go away.

Jim Dempsey

Xiaoping_D_Intel
Employee

jimdempseyatthecove wrote:

......

  array = bigger ! array now is (1:123)

  ! array(1) should be aligned if auto-realloc obeys the attribute
......

Are you saying the auto-reallocation will not obey the alignment attributed to the array???
I cannot fathom that the attribute would go away.

Jim Dempsey

No reallocation happens in the array assignment. It just copies the values, and the memory address does not change. The lower bound of 'array' after the assignment is still -1. An example of array reallocation is a call to the intrinsic subroutine "move_alloc", which does not check the alignment attribute of its actual arguments.

 

Thanks,

Xiaoping
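
To make the move_alloc scenario concrete, here is a minimal sketch of how it could be checked at run time. The names aligned_arr and plain_arr are hypothetical, and loc() is a compiler extension (available in ifort and gfortran), not standard Fortran. The sketch only prints the address remainder before and after the call; it does not assert any particular result.

```fortran
program move_alloc_alignment
  use iso_fortran_env, only: int64
  implicit none
  real, dimension(:), allocatable :: aligned_arr   ! carries the ALIGN attribute
  !DIR$ ATTRIBUTES ALIGN : 32 :: aligned_arr
  real, dimension(:), allocatable :: plain_arr     ! no alignment attribute
  integer(int64) :: addr

  allocate(aligned_arr(-1:8))
  allocate(plain_arr(123))
  plain_arr = 0.0

  ! Address of the first element, modulo 32, before the transfer
  addr = loc(aligned_arr(lbound(aligned_arr, 1)))
  print *, 'before:', lbound(aligned_arr, 1), mod(addr, 32_int64)

  ! move_alloc deallocates aligned_arr and hands it plain_arr's storage
  ! unchanged; no new allocation is made for aligned_arr.
  call move_alloc(from=plain_arr, to=aligned_arr)

  ! Same check after the transfer
  addr = loc(aligned_arr(lbound(aligned_arr, 1)))
  print *, 'after: ', lbound(aligned_arr, 1), mod(addr, 32_int64)
end program move_alloc_alignment
```

A remainder of 0 after the call would not prove that alignment is guaranteed, since an allocator can return a 32-byte-aligned block by chance. A nonzero remainder would show that the storage taken over from plain_arr does not satisfy the attribute.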

jimdempseyatthecove
Honored Contributor III

Look closer at the code. The rhs is larger than the lhs. Under the newer Fortran standard, the assignment automatically reallocates the lhs (auto-realloc-lhs).

move_alloc predates the autorealloc.

With F2003 enabled (default to autorealloclhs):

module mod_foo
  real, dimension(:), allocatable :: array
  !DIR$ ATTRIBUTES ALIGN: 32 :: array
  real, dimension(:), allocatable  :: bigger
end module mod_foo

program foo
  use mod_foo
  !...
  allocate(array(-1:8)) ! array(-1) is aligned
  allocate(bigger(123))
  print *, lbound(array), loc(array(lbound(array))), mod(loc(array(lbound(array))),32), size(array)
  array = bigger ! array now is (1:123)
  print *, lbound(array), loc(array(lbound(array))), mod(loc(array(lbound(array))),32), size(array)
end program foo

Output:
 -1 8093248 0 10
 1 8074560 0 123

In the above program, the assignment array = bigger reallocated array. The lower bound of the array changed to the default of 1 (as it should), and the alignment was maintained (I have not seen any case where alignment was not maintained).

Jim Dempsey

Steven_L_Intel1
Employee

I agree with Jim: reallocation would not ignore the alignment specification, and the value of the lower bound doesn't matter.
