Hi,
I am currently developing a performance-critical project and I am seeing some behavior I don't understand. To explain, please consider the following module file:
!-----------------------------------------------------------------------
module dcs_c7h16_reitz_mod
!-----------------------------------------------------------------------
implicit none
!-----------------------------------------------------------------------
private
!-----------------------------------------------------------------------
!-----------------------------------------------------------------------
integer, parameter :: dp = selected_real_kind(15,307)
!-----------------------------------------------------------------------
real(kind=dp), dimension(:), allocatable :: temp
!DEC$ ATTRIBUTES ALIGN: 32 :: temp
real(kind=dp), dimension(:), allocatable, target :: lt
!DEC$ ATTRIBUTES ALIGN: 32 :: lt
!
contains
!-----------------------------------------------------------------------
!-----------------------------------------------------------------------
subroutine dcs_update_c7h16_reitz(ngridpoints,temperature)
    implicit none
    integer, intent(in) :: ngridpoints
    real(kind=dp), intent(in) :: temperature(ngridpoints)
    integer :: i
    !DIR$ ASSUME_ALIGNED temperature: 32
    temp = temperature
    lt = log(temperature)
end subroutine dcs_update_c7h16_reitz
end module dcs_c7h16_reitz_mod
When I compile this module on my machine (Intel(R) Core(TM) i7-3770, with AVX) using
ifort -c -vec-report6 -xHost -align array32byte test.f90
I get the following vectorization report, which surprises me:
test.f90(28): (col. 2) remark: vectorization support: reference dcs_c7h16_reitz_mod_mp_temp_ has aligned access
test.f90(28): (col. 2) remark: vectorization support: reference temperature has unaligned access
test.f90(28): (col. 2) remark: vectorization support: unaligned access used inside loop body
test.f90(28): (col. 2) remark: LOOP WAS VECTORIZED
test.f90(28): (col. 2) remark: loop was not vectorized: not inner loop
test.f90(30): (col. 10) remark: vectorization support: reference temperature has unaligned access
test.f90(30): (col. 5) remark: vectorization support: reference dcs_c7h16_reitz_mod_mp_lt_ has aligned access
test.f90(30): (col. 5) remark: vectorization support: unaligned access used inside loop body
test.f90(30): (col. 5) remark: LOOP WAS VECTORIZED
What confuses me is that the access to temperature is reported as unaligned in both lines that reference it. After reading the web pages "Data Alignment to Assist Vectorization" and "Fortran Array Data and Arguments and Vectorization", which are very good by the way, I played around with the ASSUME_ALIGNED directive and the -align compiler option. However, the problem persists.
After quite some time I found out that when I do not use the -xHost option, the compiler seems to produce aligned data access:
ifort -c -vec-report6 -align array32byte test.f90
test.f90(28): (col. 2) remark: vectorization support: reference dcs_c7h16_reitz_mod_mp_temp_ has aligned access
test.f90(28): (col. 2) remark: vectorization support: reference temperature has aligned access
test.f90(28): (col. 2) remark: LOOP WAS VECTORIZED
test.f90(28): (col. 2) remark: loop was not vectorized: not inner loop
test.f90(30): (col. 10) remark: vectorization support: reference temperature has aligned access
test.f90(30): (col. 5) remark: vectorization support: reference dcs_c7h16_reitz_mod_mp_lt_ has aligned access
test.f90(30): (col. 5) remark: LOOP WAS VECTORIZED
I should mention that the unaligned access is also reported when I use the -xAVX option. My question is: what in the above code makes the compiler assume unaligned access to temperature when compiling for an Intel AVX platform? I would also be very thankful for any references for further reading on this topic, since the project I am working on is really performance-critical.
Thanks a lot in advance
Felix
I've occasionally run into cases where the compiler didn't take advantage of asserted 32-byte alignment. This may or may not have a significant impact on performance. Without -xHost the compiler generates code for the default SSE target, so only 16-byte alignment affects code generation, even though the 32-byte alignment may still prove beneficial.
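Since ASSUME_ALIGNED is only a promise to the compiler and does not itself align anything, it can be worth checking at run time that the actual argument really starts on a 32-byte boundary. Here is a minimal sketch of such a check. The names alignment_check_mod and is_aligned are hypothetical and not part of the module above, and the address test relies on the common (but not guaranteed) idiom of converting a C pointer to an integer with transfer.

```fortran
! Minimal sketch, assuming the C address of an array can be inspected
! by converting c_loc(...) to an integer of kind c_intptr_t.
module alignment_check_mod
  use, intrinsic :: iso_c_binding, only: c_loc, c_intptr_t
  implicit none
  private
  public :: dp, is_aligned
  integer, parameter :: dp = selected_real_kind(15,307)
contains
  ! Hypothetical helper: .true. if a(1) lies on an nbytes boundary.
  logical function is_aligned(a, n, nbytes)
    integer, intent(in) :: n, nbytes
    real(kind=dp), intent(in), target :: a(n)
    integer(c_intptr_t) :: addr
    addr = transfer(c_loc(a), addr)
    is_aligned = (modulo(addr, int(nbytes, c_intptr_t)) == 0)
  end function is_aligned
end module alignment_check_mod

program check_alignment
  use alignment_check_mod
  implicit none
  integer, parameter :: n = 16
  real(kind=dp), allocatable :: buf(:)
  integer :: k, naligned

  allocate(buf(n + 3))
  buf = 1.0_dp

  ! Whether buf(1) is aligned depends on the allocator and on options
  ! such as -align array32byte, so no particular answer is assumed here.
  print '(a,l1)', 'buf(1) is 32-byte aligned: ', is_aligned(buf, size(buf), 32)

  ! Count how many of four consecutive starting elements are aligned.
  naligned = 0
  do k = 1, 4
    if (is_aligned(buf(k:k+n-1), n, 32)) naligned = naligned + 1
  end do
  print '(a,i0)', 'aligned starts among buf(1..4): ', naligned
end program check_alignment
```

If the allocation is at least 8-byte aligned, one of the four starting elements should land on a 32-byte boundary. Calling such a check on the array passed as temperature, in a debug build, would show whether the caller actually honors the alignment that the directive asserts; it does not by itself explain why the vectorizer reports unaligned access.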
