
Vectorizing a for-loop with a data dependency

XinWu

Hi everyone,

I tried the following simple for-loop, which has a loop-carried data dependency:

#pragma omp simd
for (i = 1; i < 256; ++i) a[i] = 3.125 * a[i-1];

I compiled it with icc (-xCORE-AVX512 -qopt-zmm-usage=high -qopenmp-simd) for a Skylake-SP CPU. The loop appears to be vectorized: the generated code uses vmovups for the loads and stores and vmulps for the multiplication.

So vectorization may still be possible for some loops with a data dependency. Am I correct?

Thank you in advance!

XinWu

I found the problem.

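The compiler may generate vector instructions (e.g. vmovups and vmulps) for a loop with a data dependency, but the computed results are completely wrong. #pragma omp simd tells the compiler to vectorize without proving that the iterations are independent, so each vector step reads a[i-1] values that the earlier iterations have not yet updated.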
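
To illustrate why the results go wrong, here is a minimal sketch in plain C that emulates one plausible vectorized execution by hand. It assumes a vector step covers 16 floats (one 512-bit register). The constant VL and the function names are made up for this illustration and are not taken from the icc output; the code the compiler actually generates may differ in detail.

```c
#define N  256
#define VL 16   /* assumed vector length: 16 floats per 512-bit register */

/* Reference: scalar loop, every iteration sees the value just written. */
static void scalar_recurrence(float *a)
{
    for (int i = 1; i < N; ++i)
        a[i] = 3.125f * a[i - 1];
}

/* Hand emulation of a VL-wide vector step: all inputs of a chunk are
 * loaded before any output of that chunk is stored. */
static void emulated_simd(float *a)
{
    int i = 1;
    for (; i + VL <= N; i += VL) {
        float tmp[VL];
        for (int k = 0; k < VL; ++k)      /* "vector load" of a[i-1 .. i+VL-2] */
            tmp[k] = a[i - 1 + k];
        for (int k = 0; k < VL; ++k)      /* "vector multiply and store" */
            a[i + k] = 3.125f * tmp[k];
    }
    for (; i < N; ++i)                    /* scalar remainder loop */
        a[i] = 3.125f * a[i - 1];
}

/* Runs both versions on identical input and counts differing elements. */
static int count_mismatches(void)
{
    float ref[N], vec[N];
    for (int i = 0; i < N; ++i)
        ref[i] = vec[i] = 1.0f;

    scalar_recurrence(ref);
    emulated_simd(vec);

    int bad = 0;
    for (int i = 0; i < N; ++i)
        if (ref[i] != vec[i])
            ++bad;
    return bad;
}
```

Calling count_mismatches() from a small main shows that the two versions disagree: within each chunk only the first element sees an updated a[i-1], while the others are computed from stale values. Note that with this input the scalar reference overflows to infinity in single precision for large i, which does not affect the comparison.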
