Intel® C++ Compiler

How to avoid "vector dependence"

Andreas_Klaedtke
Beginner
The following code, basically a reduction operation, should be vectorized. But all icpc versions I have tried give me a vector dependence message which I cannot get rid of. I have tried #pragma vector always and #pragma ivdep.
[cpp]void reduce (size_t const N,
             float const * RESTRICT const x, 
             float const * RESTRICT const y,
             float const * RESTRICT const z,
             float const * RESTRICT const v,
             float * RESTRICT const A)
{
   for (size_t i = 0; i < 10; ++i) {
      A[i] = 0;
   }

#pragma vector always
#pragma ivdep
   for (size_t i = 0; i < N; ++i) {
      A[0] += x[i] * y[i];
      A[1] += x[i] * x[i];
      A[2] += y[i] * y[i];
      A[3] += x[i] * x[i] * y[i] * y[i];
      A[4] += x[i] * y[i] * z[i];
   }
}
[/cpp]
The latest compiler version 12.0.0 gives the following vec-report:
redux2.cc(24) (col. 4): remark: loop was not vectorized: vectorization possible but seems inefficient.
redux2.cc(30) (col. 4): remark: loop was not vectorized: existence of vector dependence.
redux2.cc(31) (col. 7): remark: vector dependence: assumed ANTI dependence between (unknown) line 31 and (unknown) line 31.
redux2.cc(31) (col. 7): remark: vector dependence: assumed FLOW dependence between (unknown) line 31 and (unknown) line 31.
redux2.cc(35) (col. 7): remark: vector dependence: assumed ANTI dependence between A line 35 and A line 35.
redux2.cc(35) (col. 7): remark: vector dependence: assumed FLOW dependence between A line 35 and A line 35.
redux2.cc(35) (col. 7): remark: vector dependence: assumed FLOW dependence between A line 35 and A line 35.
redux2.cc(35) (col. 7): remark: vector dependence: assumed ANTI dependence between A line 35 and A line 35.
redux2.cc(34) (col. 7): remark: vector dependence: assumed ANTI dependence between A line 34 and A line 34.
redux2.cc(34) (col. 7): remark: vector dependence: assumed FLOW dependence between A line 34 and A line 34.
redux2.cc(34) (col. 7): remark: vector dependence: assumed FLOW dependence between A line 34 and A line 34.
redux2.cc(34) (col. 7): remark: vector dependence: assumed ANTI dependence between A line 34 and A line 34.
redux2.cc(33) (col. 7): remark: vector dependence: assumed ANTI dependence between A line 33 and A line 33.
redux2.cc(33) (col. 7): remark: vector dependence: assumed FLOW dependence between A line 33 and A line 33.
redux2.cc(33) (col. 7): remark: vector dependence: assumed FLOW dependence between A line 33 and A line 33.
redux2.cc(33) (col. 7): remark: vector dependence: assumed ANTI dependence between A line 33 and A line 33.
redux2.cc(32) (col. 7): remark: vector dependence: assumed ANTI dependence between A line 32 and A line 32.
redux2.cc(32) (col. 7): remark: vector dependence: assumed FLOW dependence between A line 32 and A line 32.
redux2.cc(32) (col. 7): remark: vector dependence: assumed FLOW dependence between A line 32 and A line 32.
redux2.cc(32) (col. 7): remark: vector dependence: assumed ANTI dependence between A line 32 and A line 32.

Any insight would be greatly appreciated.
3 Replies
TimP
Honored Contributor III
Do you #define RESTRICT restrict and compile with icpc -restrict? If you made local scalar variables for each of the sums and copied them to the array afterwards, you wouldn't need anything non-standard, except that you would need -fp-model fast (the default) if you don't adopt the #pragma simd reduction.
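The local-scalar suggestion can be sketched roughly as follows. This is only an illustration, not code posted in the thread: the function name reduce_scalars and the accumulator names s0..s4 are made up, the unused v argument is dropped, and only the five sums that the loop actually computes are written back.

```cpp
#include <cstddef>

// Hypothetical rewrite of reduce(): accumulate into local scalars and
// store the totals once, so the loop body contains no stores to memory.
void reduce_scalars(std::size_t n,
                    float const * x,
                    float const * y,
                    float const * z,
                    float * A)
{
    float s0 = 0.0f, s1 = 0.0f, s2 = 0.0f, s3 = 0.0f, s4 = 0.0f;

    for (std::size_t i = 0; i < n; ++i) {
        float const xi = x[i];
        float const yi = y[i];
        float const zi = z[i];
        s0 += xi * yi;
        s1 += xi * xi;
        s2 += yi * yi;
        s3 += xi * xi * yi * yi;
        s4 += xi * yi * zi;
    }

    // Copy the sums to the output array after the loop.
    A[0] = s0;
    A[1] = s1;
    A[2] = s2;
    A[3] = s3;
    A[4] = s4;
}
```

Because a vectorized sum adds the terms in a different order, the results may differ in the last bits from a strictly sequential sum, which is why the floating-point model matters here.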
Andreas_Klaedtke
Beginner
Indeed, RESTRICT is defined to be restrict if the -restrict option is used with the compiler.
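The actual RESTRICT definition is not shown in the thread. One common way to write such a macro, sketched here as an assumption, is to map it to the __restrict extension (accepted by GCC, Clang, the Intel compiler and MSVC) instead of relying on restrict plus the -restrict option, and to let it expand to nothing elsewhere. The helper multiply is hypothetical and only demonstrates where the qualifier goes.

```cpp
#include <cstddef>

// Assumption: these compilers accept __restrict as an extension in C++.
// On any other compiler RESTRICT expands to nothing and the code still builds.
#if defined(__GNUC__) || defined(__clang__) || defined(__INTEL_COMPILER) || defined(_MSC_VER)
#  define RESTRICT __restrict
#else
#  define RESTRICT
#endif

// Hypothetical helper, not from the thread: out[i] = a[i] * b[i].
// RESTRICT promises the compiler that out does not overlap a or b.
void multiply(std::size_t n,
              float const * RESTRICT const a,
              float const * RESTRICT const b,
              float * RESTRICT const out)
{
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = a[i] * b[i];
    }
}
```

The qualifier is a promise made by the programmer; calling such a function with overlapping arrays is undefined behaviour.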

As you saw in the other thread, I have tried the local scalar variables successfully.

I see vectorisation and parallelisation as compiler optimisations that should not require the code to be altered, except perhaps for "hints" such as pragmas, so I would like to add no more lines of code than absolutely necessary to make it work.
Furthermore, I cannot see any vector dependence, so the compiler should vectorize, especially if I specify #pragma ivdep, shouldn't it?
At least it does not report that the dependence is proven.

Any idea how to get the compiler to honour the ivdep pragma? Or am I missing the dependency?
TimP
Honored Contributor III
It certainly appears that *restrict should take care of the dependencies which otherwise would be assumed. However, summing into scalars often results in better code, and would avoid any question of dependencies, even without ivdep or restrict.
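The #pragma simd reduction mentioned earlier in the thread is Intel-specific. As a sketch of the same idea, the standardized OpenMP 4.0 simd directive with a reduction clause can declare the scalar sums explicitly. This is an assumption about how one might write it today, not something tested with icpc 12.0, which predates that directive. The function name reduce_omp_simd is made up, and the directive only takes effect when OpenMP SIMD support is enabled (for example -fopenmp-simd with GCC or Clang, or -qopenmp-simd with later Intel compilers); otherwise it is ignored and the loop runs as plain scalar code.

```cpp
#include <cstddef>

// Hypothetical variant: scalar accumulators plus an explicit reduction
// clause, so the compiler need not prove anything about dependences.
void reduce_omp_simd(std::size_t n,
                     float const * x,
                     float const * y,
                     float const * z,
                     float * A)
{
    float s0 = 0.0f, s1 = 0.0f, s2 = 0.0f, s3 = 0.0f, s4 = 0.0f;

    // Each listed variable is declared to be a sum reduction over the loop.
    #pragma omp simd reduction(+ : s0, s1, s2, s3, s4)
    for (std::size_t i = 0; i < n; ++i) {
        s0 += x[i] * y[i];
        s1 += x[i] * x[i];
        s2 += y[i] * y[i];
        s3 += x[i] * x[i] * y[i] * y[i];
        s4 += x[i] * y[i] * z[i];
    }

    A[0] = s0;
    A[1] = s1;
    A[2] = s2;
    A[3] = s3;
    A[4] = s4;
}
```

With the reduction stated in the directive, the reordering of the additions is explicitly permitted for this loop rather than depending on the global floating-point model.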