Hi everyone,
I tried the following simple for-loop, which has a loop-carried data dependency:
#pragma omp simd
for (i = 1; i < 256; ++i) a[i] = 3.125 * a[i-1];
Compiled with icc using the options -xCORE-AVX512 -qopt-zmm-usage=high -qopenmp-simd on a Skylake-SP CPU, the loop appears to be vectorized: the generated assembly uses vmovups for the loads and stores and vmulps for the multiplication.
So vectorization may still be possible for some loops with data dependencies. Am I correct?
Thank you in advance!
1 Reply
I found the problem.
The compiler may generate vector instructions (e.g. vmovups and vmulps) for a loop with a data dependency, but the computed numerical results are completely wrong.
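
For anyone who wants to see why the results go wrong without looking at the assembly, here is a minimal sketch in plain C. It does not use icc or any intrinsics. It only emulates, under a simplifying assumption, what happens when a block of lanes loads a[i-1] for all lanes before any lane has stored its a[i]. The `#pragma omp simd` directive tells the compiler that iterations may be executed concurrently in SIMD lanes, so the dependency is not respected. The function names and `MAX_LANES` are hypothetical, chosen for illustration, and the actual code icc generates may differ in detail.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_LANES 16  /* assumption: 16 floats fit in one 512-bit zmm register */

/* Reference: strictly sequential recurrence a[i] = 3.125f * a[i-1]. */
void scale_scalar(float *a, size_t n) {
    for (size_t i = 1; i < n; ++i)
        a[i] = 3.125f * a[i - 1];
}

/* Emulation of a vector unit with `width` lanes that ignores the dependency:
 * every lane in a block loads its a[i-1] first, and only then are the
 * products stored. Lanes therefore read stale values of a[]. */
void scale_simd_emulated(float *a, size_t n, size_t width) {
    assert(width > 0 && width <= MAX_LANES);
    float lanes[MAX_LANES];
    for (size_t i = 1; i < n; i += width) {
        size_t count = (n - i < width) ? n - i : width;
        for (size_t k = 0; k < count; ++k)   /* "vector load" of a[i-1 .. ] */
            lanes[k] = a[i - 1 + k];
        for (size_t k = 0; k < count; ++k)   /* "vector multiply and store" */
            a[i + k] = 3.125f * lanes[k];
    }
}

/* Number of positions at which the two arrays differ. */
size_t count_mismatches(const float *x, const float *y, size_t n) {
    size_t bad = 0;
    for (size_t i = 0; i < n; ++i)
        if (x[i] != y[i])
            ++bad;
    return bad;
}
```

Starting from a[0] = 1 and zeros elsewhere, the scalar version fills the array with powers of 3.125, while the emulated version with more than one lane only gets a[1] right, because the remaining lanes multiply values that had not been updated yet. With a width of 1 the emulation matches the scalar loop exactly, which is consistent with the recurrence only being safe when executed one iteration at a time.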
