Intel® Fortran Compiler

OpenMP directive prevents vectorization

Mike_Rezny
Novice

Hi,
I think I am seeing a problem with 11.1.038 that does not show up with 11.0.074 on my Linux system.

I have a simple matrix-vector multiply routine where the inner loop vectorizes with both versions of the compiler.
However, if I add a worksharing directive to the outer loop, 11.0.074 still vectorizes the inner loop, but 11.1.038 reports many vector dependencies.

Is this a bug or am I doing something wrong?

Here is the source for the subroutine:

[plain]subroutine mxv(rows, cols, a, B, c)
use omp_lib
implicit none
integer (kind=4), intent(in) :: rows, cols
real (kind=8), intent(in)  :: B(cols,rows)
real (kind=8), intent(in)  :: c(cols)
real (kind=8), intent(out) :: a(rows)

integer (kind=4) :: row, col

!$OMP parallel do default(none) private(row,col) shared(rows,cols,a,B,c)
do row = 1, rows
   a(row) = 0.0
        do col = 1, cols
                a(row) = a(row) + B(col,row) * c(col)
        end do
end do

end subroutine
[/plain]
[plain]Here are the compiler options and output for 11.0.074[/plain]
[plain]
worm:~/tests/UsingOpenmp/Chapter3> which ifort
/opt/intel/Compiler/11.0/074/bin/intel64/ifort
worm:~/tests/UsingOpenmp/Chapter3> ifort -xhost -vec-report3 -openmp -openmp-report -c mxv.f90
mxv.f90(11): (col. 7) remark: OpenMP DEFINED LOOP WAS PARALLELIZED.
mxv.f90(12): (col. 1) remark: loop was not vectorized: not inner loop.
mxv.f90(16): (col. 2) remark: LOOP WAS VECTORIZED.
[/plain]
And here are the compiler options and output from 11.1.038:

worm:~/tests/UsingOpenmp/Chapter3> which ifort
/opt/intel/Compiler/11.1/038/bin/intel64/ifort
worm:~/tests/UsingOpenmp/Chapter3> ifort -xhost -vec-report3 -openmp -openmp-report -c mxv.f90
mxv.f90(11): (col. 7) remark: OpenMP DEFINED LOOP WAS PARALLELIZED.
mxv.f90(12): (col. 1) remark: loop was not vectorized: not inner loop.
mxv.f90(16): (col. 2) remark: loop was not vectorized: existence of vector dependence.
mxv.f90(15): (col. 3) remark: vector dependence: assumed ANTI dependence between a line 15 and a line 15.
mxv.f90(15): (col. 3) remark: vector dependence: assumed FLOW dependence between a line 15 and a line 15.
mxv.f90(15): (col. 3) remark: vector dependence: assumed FLOW dependence between a line 15 and a line 15.
mxv.f90(15): (col. 3) remark: vector dependence: assumed ANTI dependence between a line 15 and a line 15.
mxv.f90(15): (col. 3) remark: vector dependence: assumed FLOW dependence between a line 15 and b line 15.
mxv.f90(15): (col. 3) remark: vector dependence: assumed ANTI dependence between b line 15 and a line 15.
mxv.f90(15): (col. 3) remark: vector dependence: assumed ANTI dependence between b line 15 and a line 15.
mxv.f90(15): (col. 3) remark: vector dependence: assumed FLOW dependence between a line 15 and b line 15.
mxv.f90(15): (col. 3) remark: vector dependence: assumed FLOW dependence between a line 15 and b line 15.
mxv.f90(15): (col. 3) remark: vector dependence: assumed ANTI dependence between b line 15 and a line 15.
mxv.f90(15): (col. 3) remark: vector dependence: assumed ANTI dependence between b line 15 and a line 15.
mxv.f90(15): (col. 3) remark: vector dependence: assumed FLOW dependence between a line 15 and b line 15.
mxv.f90(15): (col. 3) remark: vector dependence: assumed ANTI dependence between b line 15 and a line 15.
mxv.f90(15): (col. 3) remark: vector dependence: assumed FLOW dependence between a line 15 and b line 15.
mxv.f90(15): (col. 3) remark: vector dependence: assumed FLOW dependence between a line 15 and b line 15.
mxv.f90(15): (col. 3) remark: vector dependence: assumed ANTI dependence between b line 15 and a line 15.
mxv.f90(15): (col. 3) remark: vector dependence: assumed ANTI dependence between b line 15 and a line 15.
mxv.f90(15): (col. 3) remark: vector dependence: assumed FLOW dependence between a line 15 and b line 15.
mxv.f90(15): (col. 3) remark: vector dependence: assumed FLOW dependence between a line 15 and b line 15.
mxv.f90(15): (col. 3) remark: vector dependence: assumed ANTI dependence between b line 15 and a line 15.
mxv.f90(15): (col. 3) remark: vector dependence: assumed FLOW dependence between a line 15 and c line 15.
mxv.f90(15): (col. 3) remark: vector dependence: assumed ANTI dependence between c line 15 and a line 15.
mxv.f90(15): (col. 3) remark: vector dependence: assumed ANTI dependence between c line 15 and a line 15.
mxv.f90(15): (col. 3) remark: vector dependence: assumed FLOW dependence between a line 15 and c line 15.
mxv.f90(15): (col. 3) remark: vector dependence: assumed FLOW dependence between a line 15 and c line 15.
mxv.f90(15): (col. 3) remark: vector dependence: assumed ANTI dependence between c line 15 and a line 15.
mxv.f90(15): (col. 3) remark: vector dependence: assumed ANTI dependence between c line 15 and a line 15.
mxv.f90(15): (col. 3) remark: vector dependence: assumed FLOW dependence between a line 15 and c line 15.
mxv.f90(15): (col. 3) remark: vector dependence: assumed ANTI dependence between c line 15 and a line 15.
mxv.f90(15): (col. 3) remark: vector dependence: assumed FLOW dependence between a line 15 and c line 15.
mxv.f90(15): (col. 3) remark: vector dependence: assumed FLOW dependence between a line 15 and c line 15.
mxv.f90(15): (col. 3) remark: vector dependence: assumed ANTI dependence between c line 15 and a line 15.
mxv.f90(15): (col. 3) remark: vector dependence: assumed ANTI dependence between c line 15 and a line 15.
mxv.f90(15): (col. 3) remark: vector dependence: assumed FLOW dependence between a line 15 and c line 15.
mxv.f90(15): (col. 3) remark: vector dependence: assumed FLOW dependence between a line 15 and c line 15.
mxv.f90(15): (col. 3) remark: vector dependence: assumed ANTI dependence between c line 15 and a line 15.

regards
Mike
6 Replies
jimdempseyatthecove
Honored Contributor III

The report says the inner loop was vectorized.
TimP
Honored Contributor III

Quoting - jimdempseyatthecove
The report says the inner loop was vectorized.
It does look like a regression. I've reported some myself where recent compiler versions didn't vectorize inside a parallel do. It would be clearer if you would write the inner loop as dot_product, so as not to depend on the details of anti-aliasing analysis.
Mike_Rezny
Novice

Quoting - jimdempseyatthecove
The report says the inner loop was vectorized.

Hi Jim,

Yes, I agree, the first report, from 11.0.074, says the inner loop was vectorized.
But the second report, from 11.1.038, says that neither loop was vectorized:

mxv.f90(12): (col. 1) remark: loop was not vectorized: not inner loop.
mxv.f90(16): (col. 2) remark: loop was not vectorized: existence of vector dependence.

Or, am I missing something?

regards
Mike
Mike_Rezny
Novice
Quoting - tim18
It does look like a regression. I've reported some myself where recent compiler versions didn't vectorize inside a parallel do. It would be clearer if you would write the inner loop as dot_product, so as not to depend on the details of anti-aliasing analysis.

Hi Tim,
I am unclear on what you are suggesting. I thought that this was a dot product:

do col = 1, cols
   a(row) = a(row) + B(col,row) * c(col)
end do

Could you please explain what anti-aliasing analysis is, or point me to some documentation? I would like to know more about this.

regards
Mike
TimP
Honored Contributor III
Mike,
I'm just pointing out that if you intend
a(row) = dot_product(B(:cols,row),c(:cols))
in this form the compiler need not be concerned about aliasing between a and B or c. So, if it still fails to vectorize, the comment from vec_report would have to be something other than aliasing problems (where modifying an element of a might change B or c). It's true, if you don't set -assume dummy_aliases, or some equivalent option which tells the compiler not to assume your code obeys the standard, the compiler shouldn't be showing these concerns about dependencies, so you probably have enough evidence of a regression already.
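
To make the suggestion above concrete, here is a minimal sketch showing that the explicit inner loop and the dot_product form compute the same value for each row. The program name, the tiny test data, and the scalar names a_loop and a_dot are made up for illustration; it says nothing about what any particular compiler version will vectorize.

```fortran
program dot_demo
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: rows = 2, cols = 3
  real(dp) :: B(cols, rows), c(cols), a_loop, a_dot
  integer :: row, col

  ! Small hand-checkable data (hypothetical): column `row` of B
  ! holds the coefficients that are multiplied with c.
  B = reshape([1.0_dp, 2.0_dp, 3.0_dp, 4.0_dp, 5.0_dp, 6.0_dp], [cols, rows])
  c = [1.0_dp, 1.0_dp, 2.0_dp]

  do row = 1, rows
     ! Explicit accumulation, as in the original inner loop.
     a_loop = 0.0_dp
     do col = 1, cols
        a_loop = a_loop + B(col, row) * c(col)
     end do

     ! The same reduction written with the intrinsic on array sections.
     a_dot = dot_product(B(:cols, row), c(:cols))

     ! Stop with a message if the two forms ever disagree.
     if (abs(a_loop - a_dot) > 1.0e-12_dp) stop 'loop and dot_product disagree'
     print *, row, a_loop, a_dot
  end do
end program dot_demo
```

With this data both forms give 9 for the first row and 21 for the second. The only difference is that in the intrinsic form the result is assigned to the target once, after the reduction, instead of being updated on every iteration.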
Mike_Rezny
Novice
Quoting - tim18
Mike,
I'm just pointing out that if you intend
a(row) = dot_product(B(:cols,row),c(:cols))
in this form the compiler need not be concerned about aliasing between a and B or c. So, if it still fails to vectorize, the comment from vec_report would have to be something other than aliasing problems (where modifying an element of a might change B or c). It's true, if you don't set -assume dummy_aliases, or some equivalent option which tells the compiler not to assume your code obeys the standard, the compiler shouldn't be showing these concerns about dependencies, so you probably have enough evidence of a regression already.
Hi Tim,
thanks for the information. Much appreciated.

OK, so I replaced the inner loop with the intrinsic function call dot_product(B(:cols,row),c(:cols)).
11.1.038 no longer reports vector dependencies.

Also, as expected, there are no reported vector dependencies when the subroutine is compiled with -fno-alias.
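
For reference, the modified routine might look roughly like the sketch below. It assumes the same interface as the original mxv; the name mxv_dot and the small driver program are hypothetical and only there to make the example self-contained. The `use omp_lib` line is dropped because no OpenMP library routines are called.

```fortran
subroutine mxv_dot(rows, cols, a, B, c)
  implicit none
  integer (kind=4), intent(in) :: rows, cols
  real (kind=8), intent(in)  :: B(cols, rows)
  real (kind=8), intent(in)  :: c(cols)
  real (kind=8), intent(out) :: a(rows)

  integer (kind=4) :: row

  ! Worksharing on the outer loop, as in the original routine;
  ! the inner loop is replaced by the dot_product intrinsic.
  !$OMP parallel do default(none) private(row) shared(rows, cols, a, B, c)
  do row = 1, rows
     a(row) = dot_product(B(:cols, row), c(:cols))
  end do
  !$OMP end parallel do
end subroutine mxv_dot

! Hypothetical driver with hand-checkable data.
program test_mxv_dot
  implicit none
  real (kind=8) :: B(3, 2), c(3), a(2)

  B = reshape([1d0, 2d0, 3d0, 4d0, 5d0, 6d0], [3, 2])
  c = [1d0, 1d0, 2d0]

  call mxv_dot(2, 3, a, B, c)

  ! Expected: a(1) = 1 + 2 + 6 = 9, a(2) = 4 + 5 + 12 = 21
  if (any(abs(a - [9d0, 21d0]) > 1d-12)) stop 'unexpected result'
  print *, a
end program test_mxv_dot
```

The directive is an ordinary comment when OpenMP is not enabled, so the same source builds and gives the same result either way.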

This has been really helpful.

regards
Mike