Dear Intel Fortran developers,
I'm using ifort 15.0.1 to build my code, and I have a problem with array alignment. This is a piece of my code, taken from a subroutine:
real(stnd), dimension(:), allocatable :: dsr_sel
!DIR$ ATTRIBUTES ALIGN: 32 :: dsr_sel

allocate(dsr_sel(ntracesM))
s = 1
e = ntracesM
dsr_sel(s:e) = off_co(s:e)*off_co(s:e)*inv_off2 + &
               ( x_co(s:e)-x0cur )*( x_co(s:e)-x0cur )*inv_ap2
where x_co and off_co are arrays defined in a module and allocated elsewhere, with !DIR$ ATTRIBUTES ALIGN : 32 :: x_co, off_co applied to them. The problem is that the optimization report gives me:
LOOP BEGIN at project/src/test.f90(646,8)
<Peeled>
LOOP END
LOOP BEGIN at project/src/test.f90(646,8)
remark #15389: vectorization support: reference DSR_SEL has unaligned access
remark #15389: vectorization support: reference test_mp_off_co_ has unaligned access
remark #15389: vectorization support: reference test_mp_off_co_ has unaligned access
remark #15389: vectorization support: reference test_mp_x_co_ has unaligned access
remark #15389: vectorization support: reference test_mp_x_co_ has unaligned access
remark #15381: vectorization support: unaligned access used inside loop body
remark #15399: vectorization support: unroll factor set to 2
remark #15300: LOOP WAS VECTORIZED
remark #15442: entire loop may be executed in remainder
remark #15450: unmasked unaligned unit stride loads: 2
remark #15451: unmasked unaligned unit stride stores: 1
remark #15475: --- begin vector loop cost summary ---
remark #15476: scalar loop cost: 20
remark #15477: vector loop cost: 5.000
remark #15478: estimated potential speedup: 6.120
remark #15479: lightweight vector operations: 17
remark #15488: --- end vector loop cost summary ---
LOOP END
LOOP BEGIN at project/src/test.f90(646,8)
<Remainder>
remark #15389: vectorization support: reference DSR_SEL has unaligned access
remark #15389: vectorization support: reference test_mp_off_co_ has unaligned access
remark #15389: vectorization support: reference test_mp_off_co_ has unaligned access
remark #15389: vectorization support: reference test_mp_x_co_ has unaligned access
remark #15389: vectorization support: reference test_mp_x_co_ has unaligned access
remark #15381: vectorization support: unaligned access used inside loop body
remark #15301: REMAINDER LOOP WAS VECTORIZED
LOOP END
I don't understand why these three arrays have unaligned accesses. I compile with the -xHost and -O3 flags on an Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz. I also tried -align array32byte, with no effect.
Thanks.
There is very little to go on here without a compilable program.
What is "stnd" declared as? What are the declarations of x_co and off_co?
--Lorri
Hi Lorri,
"stnd" is simply the kind for a 4-byte real (REAL*4). As for the second question, the declarations are:
real(stnd), dimension( : ), allocatable :: off_co, x_co
Are those arrays attributed for alignment? (32 bytes)
Jim Dempsey
jimdempseyatthecove wrote:
Are those arrays attributed for alignment? (32 bytes)
Jim Dempsey
Yes, I used !DIR$ ATTRIBUTES ALIGN : 32 :: x_co, off_co
The compiler can't generate aligned loads for allocatable arrays or pointers that are declared in another module with the ALIGN attribute, because it doesn't know the lower bound, which is dynamic.
The 17.0 compiler, which is still in beta, offers a solution: use the ASSUME_ALIGNED directive before the loop that uses the array or pointer to state which array element is aligned:
!dir$ assume_aligned x_co(1):32, off_co(1):32
To enroll in the 17.0 beta, please refer to the first sticky post in this forum.
Thanks,
Xiaoping
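A minimal sketch of where the directive from the post above might sit in the original poster's code, assuming a 17.0 or later compiler that accepts the array-element form. The module name `test` is inferred from the `test_mp_` symbols in the report; the subroutine name `compute_dsr`, its argument list, the `dsr_sum` result and the driver program are hypothetical and only there to make the fragment compilable. Note that ASSUME_ALIGNED is a promise to the compiler rather than a request: if the named element is not really on a 32-byte boundary at run time, the generated code may fault.

```fortran
module test
  implicit none
  integer, parameter :: stnd = kind(1.0)   ! assumption: default REAL is the 4-byte kind
  real(stnd), dimension(:), allocatable :: off_co, x_co
  !DIR$ ATTRIBUTES ALIGN : 32 :: off_co, x_co
contains
  ! Hypothetical wrapper around the array statement from the first post.
  subroutine compute_dsr(ntracesM, x0cur, inv_off2, inv_ap2, dsr_sum)
    integer,    intent(in)  :: ntracesM
    real(stnd), intent(in)  :: x0cur, inv_off2, inv_ap2
    real(stnd), intent(out) :: dsr_sum
    real(stnd), dimension(:), allocatable :: dsr_sel
    !DIR$ ATTRIBUTES ALIGN : 32 :: dsr_sel
    integer :: s, e

    allocate(dsr_sel(ntracesM))
    s = 1
    e = ntracesM

    ! Assert that element 1 of each array lies on a 32-byte boundary.
    ! Other compilers treat this line as an ordinary comment.
    !DIR$ ASSUME_ALIGNED dsr_sel(1):32, off_co(1):32, x_co(1):32
    dsr_sel(s:e) = off_co(s:e)*off_co(s:e)*inv_off2 + &
                   ( x_co(s:e)-x0cur )*( x_co(s:e)-x0cur )*inv_ap2

    dsr_sum = sum(dsr_sel)   ! hypothetical use of the result
  end subroutine compute_dsr
end module test

program demo
  use test
  implicit none
  integer, parameter :: n = 1000   ! hypothetical problem size
  real(stnd) :: dsr_sum

  allocate(off_co(n), x_co(n))
  off_co = 1.0_stnd
  x_co   = 2.0_stnd
  call compute_dsr(n, 0.5_stnd, 1.0_stnd, 1.0_stnd, dsr_sum)
  print *, 'sum of dsr_sel =', dsr_sum
end program demo
```

Whether the remark #15389 lines actually disappear has to be confirmed by regenerating the optimization report with the compiler version in use.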
I read this article:
https://software.intel.com/en-us/articles/data-alignment-to-assist-vectorization
which explains that it is possible to align allocatable arrays in modules. Did I misunderstand it?
That article explains that static arrays may be recognized as aligned by the method you have chosen, but not allocatable.
I must check whether -align array32byte has any useful effect in a similar situation of my own, so it's helpful to me that you brought it up.
xiaoping-duan (Intel) wrote:
The compiler can't generate aligned loads for allocatable arrays or pointers that are declared in another module with the ALIGN attribute, because it doesn't know the lower bound, which is dynamic.
Wouldn't the lower bound, whatever it is, always be aligned?...
Assume the array descriptor is passed to a subroutine that declares the dummy as allocatable without the alignment requirement, and that then allocates or resizes the array. However, such a declaration would also require the use of an interface, and interface checking should catch the discrepancy between the actual and the dummy argument.
I can see an issue with a pointer, but not with an allocatable array.
The compiler (some day) may permit a pointer to be declared with alignment requirement, and then assert each => is aligned. Most of this could be done at compile time, but when not possible a runtime check could be made.
Jim Dempsey
The lower bound isn't always aligned if an array gets reallocated or a pointer gets re-associated.
Thanks,
Xiaoping
Pardon me...
module mod
  real, dimension(:), allocatable :: array
  !DIR$ ATTRIBUTES ALIGN: 32 :: array
end module mod

program foo
  use mod
  real :: bigger(123)
  ...
  allocate(array(-1:8)) ! array(-1) is aligned
  ...
  array = bigger ! array now is (1:123)
  ! array(1) should be aligned if auto-realloc obeys the attribute
end program foo
Are you saying the auto-reallocation will not obey the alignment attributed to the array???
I cannot fathom that the attribute would go away.
Jim Dempsey
jimdempseyatthecove wrote:
......
array = bigger ! array now is (1:123)
! array(1) should be aligned if auto-realloc obeys the attribute
......
Are you saying the auto-reallocation will not obey the alignment attributed to the array???
I cannot fathom that the attribute would go away.
Jim Dempsey
No reallocation happens in the array assignment. It just copies the values, and the memory address does not change. The lower bound of 'array' after the assignment is still -1. An example of array reallocation is a call to the intrinsic subroutine move_alloc, which does not check the alignment attribute of its actual arguments.
Thanks,
Xiaoping
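The move_alloc case mentioned above can be tried out with a small program. This is only a sketch under stated assumptions: the name `tmp` and the helper `offset32` are hypothetical, the arrays are given the TARGET attribute solely so that the standard `c_loc` can be used in place of the `loc` extension, and no particular output is claimed, since the result depends on the compiler and on the allocator. After the call, `array` takes over the storage that was allocated for `tmp`, which was declared without the ALIGN attribute, so the printed offset shows whether that storage happens to be 32-byte aligned.

```fortran
program move_alloc_alignment
  use, intrinsic :: iso_c_binding, only: c_loc, c_intptr_t
  implicit none
  real, dimension(:), allocatable, target :: array   ! carries the ALIGN attribute
  !DIR$ ATTRIBUTES ALIGN: 32 :: array
  real, dimension(:), allocatable, target :: tmp     ! hypothetical; no ALIGN attribute

  allocate(array(-1:8))
  allocate(tmp(123))
  tmp = 0.0

  print *, 'array before move_alloc: address mod 32 =', &
           offset32(array(lbound(array, 1)))
  print *, 'tmp   before move_alloc: address mod 32 =', offset32(tmp(1))

  ! array is deallocated, then takes over tmp's allocation and bounds.
  call move_alloc(from=tmp, to=array)

  print *, 'array after  move_alloc: address mod 32 =', &
           offset32(array(lbound(array, 1)))
  print *, 'lbound =', lbound(array, 1), ' size =', size(array)

contains

  ! Address of x modulo 32; zero means x lies on a 32-byte boundary.
  function offset32(x) result(offset)
    real, intent(in), target :: x
    integer(c_intptr_t) :: offset
    offset = modulo(transfer(c_loc(x), 0_c_intptr_t), 32_c_intptr_t)
  end function offset32

end program move_alloc_alignment
```

A nonzero offset after the call would illustrate the point that the compiler cannot rely on the attribute alone once storage can arrive from a variable that does not carry it.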
Look closer at the code. The rhs is larger than the lhs. The newer Fortran standard performs an automatic reallocation of the lhs (realloc_lhs).
move_alloc predates the automatic reallocation.
With F2003 semantics enabled (automatic reallocation of the lhs on by default):
module mod_foo
  real, dimension(:), allocatable :: array
  !DIR$ ATTRIBUTES ALIGN: 32 :: array
  real, dimension(:), allocatable :: bigger
end module mod_foo

program foo
  use mod_foo
  !...
  allocate(array(-1:8)) ! array(-1) is aligned
  allocate(bigger(123))
  print *, lbound(array), loc(array(lbound(array))), &
           mod(loc(array(lbound(array))),32), size(array)
  array = bigger ! array now is (1:123)
  print *, lbound(array), loc(array(lbound(array))), &
           mod(loc(array(lbound(array))),32), size(array)
end program foo

Output:
-1 8093248 0 10
1 8074560 0 123
In the above program, the assignment array = bigger reallocated array. The lbound of the array changed to the default of 1 (as it should), and the alignment was maintained (I have not seen any case where the alignment was not maintained).
Jim Dempsey
I agree with Jim - reallocation would not ignore alignment specification and the value of the lower bound doesn't matter.