The following code, basically a reduction operation, should vectorize. But every icpc version I have tried reports a vector dependence that I cannot get rid of. I have tried #pragma vector always and #pragma ivdep.
[cpp]
void reduce (size_t const N,
             float const * RESTRICT const x,
             float const * RESTRICT const y,
             float const * RESTRICT const z,
             float const * RESTRICT const v,
             float * RESTRICT const A)
{
   for (size_t i = 0; i < 10; ++i) {
      A[i] = 0;
   }

#pragma vector always
#pragma ivdep
   for (size_t i = 0; i < N; ++i) {
      A[0] += x[i] * y[i];
      A[1] += x[i] * x[i];
      A[2] += y[i] * y[i];
      A[3] += x[i] * x[i] * y[i] * y[i];
      A[4] += x[i] * y[i] * z[i];
   }
}
[/cpp]
The latest compiler version 12.0.0 gives the following vec-report:
redux2.cc(24) (col. 4): remark: loop was not vectorized: vectorization possible but seems inefficient.
redux2.cc(30) (col. 4): remark: loop was not vectorized: existence of vector dependence.
redux2.cc(31) (col. 7): remark: vector dependence: assumed ANTI dependence between (unknown) line 31 and (unknown) line 31.
redux2.cc(31) (col. 7): remark: vector dependence: assumed FLOW dependence between (unknown) line 31 and (unknown) line 31.
redux2.cc(35) (col. 7): remark: vector dependence: assumed ANTI dependence between A line 35 and A line 35.
redux2.cc(35) (col. 7): remark: vector dependence: assumed FLOW dependence between A line 35 and A line 35.
redux2.cc(35) (col. 7): remark: vector dependence: assumed FLOW dependence between A line 35 and A line 35.
redux2.cc(35) (col. 7): remark: vector dependence: assumed ANTI dependence between A line 35 and A line 35.
redux2.cc(34) (col. 7): remark: vector dependence: assumed ANTI dependence between A line 34 and A line 34.
redux2.cc(34) (col. 7): remark: vector dependence: assumed FLOW dependence between A line 34 and A line 34.
redux2.cc(34) (col. 7): remark: vector dependence: assumed FLOW dependence between A line 34 and A line 34.
redux2.cc(34) (col. 7): remark: vector dependence: assumed ANTI dependence between A line 34 and A line 34.
redux2.cc(33) (col. 7): remark: vector dependence: assumed ANTI dependence between A line 33 and A line 33.
redux2.cc(33) (col. 7): remark: vector dependence: assumed FLOW dependence between A line 33 and A line 33.
redux2.cc(33) (col. 7): remark: vector dependence: assumed FLOW dependence between A line 33 and A line 33.
redux2.cc(33) (col. 7): remark: vector dependence: assumed ANTI dependence between A line 33 and A line 33.
redux2.cc(32) (col. 7): remark: vector dependence: assumed ANTI dependence between A line 32 and A line 32.
redux2.cc(32) (col. 7): remark: vector dependence: assumed FLOW dependence between A line 32 and A line 32.
redux2.cc(32) (col. 7): remark: vector dependence: assumed FLOW dependence between A line 32 and A line 32.
redux2.cc(32) (col. 7): remark: vector dependence: assumed ANTI dependence between A line 32 and A line 32.
Any insight would be greatly appreciated.
3 Replies
Do you #define RESTRICT restrict and compile with icpc -restrict? If you make local scalar variables for each of the sums and copy them to the array afterwards, you shouldn't need anything non-standard, except that you would need -fp-model fast (the default) unless you adopt #pragma simd reduction.
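The local-scalar suggestion could look roughly like the sketch below. It is an illustration under assumptions, not the code from the other thread: the name `reduce_scalars` is hypothetical, the unused `v` parameter is dropped, only A[0] to A[4] are written, and RESTRICT falls back to nothing if it is not already defined.

```cpp
#include <cstddef>

// Hypothetical fallback so the sketch compiles without icpc -restrict.
#ifndef RESTRICT
#define RESTRICT
#endif

// Accumulate the five sums in local scalars, then store them once.
void reduce_scalars(std::size_t const N,
                    float const * RESTRICT const x,
                    float const * RESTRICT const y,
                    float const * RESTRICT const z,
                    float * RESTRICT const A)
{
    // Each accumulator is a plain scalar reduction; no memory location
    // is read and written across loop iterations.
    float s0 = 0.0f, s1 = 0.0f, s2 = 0.0f, s3 = 0.0f, s4 = 0.0f;

    for (std::size_t i = 0; i < N; ++i) {
        float const xi = x[i];
        float const yi = y[i];
        float const zi = z[i];
        s0 += xi * yi;
        s1 += xi * xi;
        s2 += yi * yi;
        s3 += xi * xi * yi * yi;
        s4 += xi * yi * zi;
    }

    // Copy the results to the output array after the loop.
    A[0] = s0;
    A[1] = s1;
    A[2] = s2;
    A[3] = s3;
    A[4] = s4;
}
```

Whether a given icpc version actually vectorizes this loop still has to be checked against its vec-report.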
Indeed, RESTRICT is defined to be restrict if the -restrict option is used with the compiler.
As you saw in the other thread, I have tried the local scalar variables successfully.
I see vectorisation and parallelisation as compiler optimisations that should not require altering the code, except perhaps for hints such as pragmas, so I would rather not add more lines of code than absolutely necessary to make it work.
Furthermore, I cannot see any vector dependence, so the compiler should vectorize, especially when I specify #pragma ivdep, shouldn't it?
At least it does not report that the dependence is proven.
Any idea how to get the compiler to honour the ivdep pragma? Or am I missing the dependency?
It certainly appears that *restrict should take care of the dependencies which otherwise would be assumed. However, summing into scalars often results in better code, and would avoid any question of dependencies, even without ivdep or restrict.
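For readers who want an explicit hint on top of the scalar form, here is a reduced sketch with a reduction clause. It swaps in the OpenMP 4.0 `#pragma omp simd reduction` spelling as a stand-in for the Intel-specific `#pragma simd reduction` mentioned earlier in the thread, so it assumes a newer compiler than 12.0 with OpenMP SIMD enabled (for example -qopenmp-simd with Intel compilers, -fopenmp-simd with GCC or Clang). Without that, the pragma is simply ignored. The function name `reduce_simd` is hypothetical and only two of the five sums are shown.

```cpp
#include <cstddef>

// Two of the sums from the original loop, accumulated in scalars with an
// explicit SIMD reduction hint.
void reduce_simd(std::size_t const N,
                 float const * const x,
                 float const * const y,
                 float * const A)
{
    float sxy = 0.0f;
    float sxx = 0.0f;

    // The reduction clause tells the compiler it may keep per-lane partial
    // sums and combine them after the loop, which reorders the additions.
#pragma omp simd reduction(+ : sxy, sxx)
    for (std::size_t i = 0; i < N; ++i) {
        sxy += x[i] * y[i];
        sxx += x[i] * x[i];
    }

    A[0] = sxy;
    A[1] = sxx;
}
```

Because the additions may be reordered, results can differ in the last bits from a strictly sequential sum, which is the same trade-off as relying on -fp-model fast.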
