Intel® Fortran Compiler

ifort 14.0 dot_product vectorizing differently with different array subscripts

Aaron_O_
Beginner

Greetings,

I was looking at the vectorization report for my code and found that dot_product() vectorizes differently depending on the array subscripts. The following statements are identical except for the lower bound of the array sections (in the first it is the variable "n1", in the second it is the constant "1"):

 (75)   disap = disap + dot_product(ym(n1:nmax2), pMergeBulk(n1:nmax2,n))
    
 (77)   disap = disap + dot_product(ym(1:nmax2), pMergeBulk(1:nmax2,n))

After compiling with vec-report6, I got the following results:

(75): (col. 21) remark: loop was not vectorized: existence of vector dependence
(75): (col. 21) remark: vector dependence: assumed ANTI dependence between .T103_ line 75 and .T103_ line 75
(75): (col. 21) remark: vector dependence: assumed FLOW dependence between .T103_ line 75 and .T103_ line 75
(75): (col. 21) remark: vector dependence: assumed ANTI dependence between .T103_ line 75 and .T103_ line 75
(75): (col. 21) remark: vector dependence: assumed FLOW dependence between .T103_ line 75 and .T103_ line 75
(75): (col. 21) remark: vector dependence: assumed ANTI dependence between .T103_ line 75 and .T103_ line 75
(75): (col. 21) remark: vector dependence: assumed FLOW dependence between .T103_ line 75 and .T103_ line 75
(75): (col. 21) remark: vector dependence: assumed ANTI dependence between .T103_ line 75 and .T103_ line 75
(75): (col. 21) remark: vector dependence: assumed ANTI dependence between .T103_ line 75 and .T103_ line 75
(75): (col. 21) remark: vector dependence: assumed ANTI dependence between .T103_ line 75 and .T103_ line 75

(77): (col. 21) remark: vectorization support: reference YM has aligned access
(77): (col. 21) remark: vectorization support: unroll factor set to 4
(77): (col. 21) remark: LOOP WAS VECTORIZED

What is the reason the first statement gives a vector dependence while the second does not?

Steve_Lionel
Black Belt Retired Employee

You may not get much interest in a four-year-old compiler. Try it with version 18 and see what it does. But I might guess that the compiler doesn't know what n1 is so it is pessimistic.

Aaron_O_
Beginner

Steve Lionel (Ret.) wrote:

You may not get much interest in a four-year-old compiler. Try it with version 18 and see what it does. But I might guess that the compiler doesn't know what n1 is so it is pessimistic.

Hi Steve,

Thanks for the quick response. Version 14 is what they have at work, so I think I'm stuck with it until they decide to upgrade. If the behavior is due to the age of the compiler, I can accept that. I just wanted to make sure I wasn't doing something obviously wrong (e.g., that there isn't some line in a standard somewhere that says "loops with both indexes as variables will not be vectorized" or some such).

andrew_4619
Honored Contributor I

Vectorising is an optimisation for speed and has nothing to do with the Fortran standard. How efficient (or not) the generated code is depends only on the compiler vendor's implementation.

Steve_Lionel
Black Belt Retired Employee

Most compilers that support vectorization also have directives you can place in the code to give the compiler more information. I don't recall all of what version 14 had, but you can study the Optimization section of the Intel documentation and see what works. I think version 14 also had a "Guided Auto Parallelism" feature that could give you advice.
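As a rough illustration of the directive approach, here is a minimal sketch that replaces the dot_product() call with an explicit loop and marks it with the Intel-specific SIMD directive. The array sizes, the REAL type, and the values of n1 and n are assumptions made up for the example (the original declarations were not posted), and whether version 14 actually vectorizes this form would need to be confirmed with the vectorization report.

```fortran
program dot_simd_sketch
  implicit none
  ! Sizes and types are assumptions; the original declarations were not shown.
  integer, parameter :: nmax2 = 1000, ncol = 4
  integer :: n1, n, i
  real :: ym(nmax2), pMergeBulk(nmax2, ncol), disap, ref

  call random_number(ym)
  call random_number(pMergeBulk)
  n1 = 3      ! hypothetical lower bound
  n  = 2      ! hypothetical column index
  disap = 0.0

  ! Intel-specific directive: asks the compiler to vectorize the loop and
  ! declares disap as a sum reduction. Other compilers see only a comment.
  !DIR$ SIMD REDUCTION(+:disap)
  do i = n1, nmax2
     disap = disap + ym(i) * pMergeBulk(i, n)
  end do

  ! Reference value from the intrinsic, for comparison.
  ref = dot_product(ym(n1:nmax2), pMergeBulk(n1:nmax2, n))
  print *, 'loop:', disap, ' intrinsic:', ref
end program dot_simd_sketch
```

The two printed values may differ in the last digits, because a vectorized reduction can add the terms in a different order.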

Fundamentally, the compiler has to err on the side of correctness. If it can't prove that an optimization won't give bad results (as opposed to slightly different results), it won't do the optimization. In your case, the compiler can't figure out if there is overlap.
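One way to give the compiler more to work with, without any directives, is to move the sum into a small function whose array arguments are INTENT(IN) and whose accumulator is a local scalar. The sketch below assumes REAL arrays and made-up sizes; dot_from is a hypothetical helper, not something from the original code, and it is untested against version 14.

```fortran
module dot_helpers
  implicit none
contains
  ! Hypothetical helper: sum of a(i)*b(i) for i = lo..hi.
  ! Nothing is written to a or b, and s is a local scalar, so the loop
  ! itself has no array stores for the compiler to worry about.
  pure function dot_from(a, b, lo, hi) result(s)
    integer, intent(in) :: lo, hi
    real, intent(in) :: a(hi), b(hi)
    real :: s
    integer :: i
    s = 0.0
    do i = lo, hi
       s = s + a(i) * b(i)
    end do
  end function dot_from
end module dot_helpers

program dot_overlap_sketch
  use dot_helpers
  implicit none
  ! Sizes and values are assumptions for illustration only.
  integer, parameter :: nmax2 = 1000, ncol = 4
  real :: ym(nmax2), pMergeBulk(nmax2, ncol), disap
  integer :: n1, n

  call random_number(ym)
  call random_number(pMergeBulk)
  n1 = 3
  n  = 2
  disap = 0.0

  ! Same quantity as dot_product(ym(n1:nmax2), pMergeBulk(n1:nmax2,n)).
  disap = disap + dot_from(ym, pMergeBulk(:, n), n1, nmax2)
  print *, 'disap =', disap
end program dot_overlap_sketch
```

Passing a whole column, pMergeBulk(:, n), keeps the actual argument contiguous, so no copy should be needed to match the explicit-shape dummy argument.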
