Aaron_O_ · ‎10-10-2017

Greetings,

I was looking at the vectorization report for a code and I found different vectorization behavior for dot_product() based on the array subscripts. The following statements are identical except for the lower bound of the array indexes (in the first it is the variable "n1", in the second it is the constant "1"):

 (75)   disap = disap + dot_product(ym(n1:nmax2), pMergeBulk(n1:nmax2,n))
    
 (77)   disap = disap + dot_product(ym(1:nmax2), pMergeBulk(1:nmax2,n))

After compiling with -vec-report6, I got the following results:

(75): (col. 21) remark: loop was not vectorized: existence of vector dependence
(75): (col. 21) remark: vector dependence: assumed ANTI dependence between .T103_ line 75 and .T103_ line 75
(75): (col. 21) remark: vector dependence: assumed FLOW dependence between .T103_ line 75 and .T103_ line 75
(75): (col. 21) remark: vector dependence: assumed ANTI dependence between .T103_ line 75 and .T103_ line 75
(75): (col. 21) remark: vector dependence: assumed FLOW dependence between .T103_ line 75 and .T103_ line 75
(75): (col. 21) remark: vector dependence: assumed ANTI dependence between .T103_ line 75 and .T103_ line 75
(75): (col. 21) remark: vector dependence: assumed FLOW dependence between .T103_ line 75 and .T103_ line 75
(75): (col. 21) remark: vector dependence: assumed ANTI dependence between .T103_ line 75 and .T103_ line 75
(75): (col. 21) remark: vector dependence: assumed ANTI dependence between .T103_ line 75 and .T103_ line 75
(75): (col. 21) remark: vector dependence: assumed ANTI dependence between .T103_ line 75 and .T103_ line 75

(77): (col. 21) remark: vectorization support: reference YM has aligned access
(77): (col. 21) remark: vectorization support: unroll factor set to 4
(77): (col. 21) remark: LOOP WAS VECTORIZED

What is the reason the first statement gives a vector dependence while the second does not?

Steve_Lionel · ‎10-10-2017

You may not get much interest in a four-year-old compiler. Try it with version 18 and see what it does. But I might guess that the compiler doesn't know what n1 is so it is pessimistic.

Aaron_O_ · ‎10-10-2017

Hi Steve,

Thanks for the quick response. Version 14 is what they have at work, so I think I'm stuck with it until they decide to upgrade. If the behavior is down to the old compiler, I can accept that; I just wanted to make sure I wasn't doing something obviously wrong (e.g., that there isn't some line in a standard somewhere that says "loops with both indexes as variables will not be vectorized" or some such).

andrew_4619 · ‎10-11-2017

Vectorisation is a speed optimisation and has nothing to do with the Fortran standard. How efficient (or not) the generated code is depends entirely on the compiler vendor's implementation.

Steve_Lionel · ‎10-11-2017

Most compilers that support vectorization also have directives you can place in the code to give the compiler more information. I don't recall all of what version 14 had, but you can study the Optimization section of the Intel documentation and see what works. I think version 14 also had a "Guided Auto Parallelism" feature that could give you advice.

Fundamentally, the compiler has to err on the side of correctness. If it can't prove that an optimization won't give bad results (as opposed to slightly different results), it won't do the optimization. In your case, the compiler can't figure out if there is overlap.
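As an illustration of the directive approach, here is a minimal sketch, not something tested against ifort 14. It assumes the dot product can be rewritten as an explicit loop, and that the Intel-specific !DIR$ SIMD directive with a REDUCTION clause is available in that version (other compilers treat the line as a comment). The function name partial_dot, the array sizes, and the test values are hypothetical; only ym, pMergeBulk, n1, and nmax2 come from the original post.

```fortran
program dot_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: nmax2 = 1000, ncol = 4   ! hypothetical sizes
  real(dp) :: ym(nmax2), pMergeBulk(nmax2, ncol)
  real(dp) :: ref, got
  integer :: i

  ! Arbitrary test data
  do i = 1, nmax2
     ym(i) = real(i, dp)
  end do
  pMergeBulk = 0.5_dp

  ! Reference value from the intrinsic, with a lower bound other than 1
  ref = dot_product(ym(3:nmax2), pMergeBulk(3:nmax2, 2))
  got = partial_dot(ym, pMergeBulk, 3, nmax2, 2)

  if (abs(got - ref) > 1.0e-9_dp * abs(ref)) error stop "mismatch"
  print *, "ok"

contains

  ! Explicit-loop dot product over rows n1..n2 of column col.
  ! n1 is a dummy argument, so its value is unknown at compile time.
  function partial_dot(a, b, n1, n2, col) result(s)
    real(dp), intent(in) :: a(:), b(:,:)
    integer, intent(in) :: n1, n2, col
    real(dp) :: s
    integer :: k

    s = 0.0_dp
    ! Intel-specific hint (assumption: supported by your compiler version).
    ! It asserts that the loop is safe to vectorize as a sum reduction.
    !DIR$ SIMD REDUCTION(+:s)
    do k = n1, n2
       s = s + a(k) * b(k, col)
    end do
  end function partial_dot

end program dot_sketch
```

A directive like this moves the responsibility for correctness from the compiler to you, so check the vectorization report and compare results against the plain dot_product call before relying on it. A vectorized reduction can also sum in a different order, which may change the last few bits of a floating-point result.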

ifort 14.0 dot_product vectorizing differently with different array subscripts