Intel® C++ Compiler
Community support and assistance for creating C++ code that runs on platforms based on Intel® processors.

Loop Vectorization

r_guidi
Beginner
576 Views
I have a question about vectorization. I compile the following loop on a win32 platform:
for(k=K_start;k{
Hz=VAP1*Hz+VAP2*VAP3/DY_j-VAP3*VAP4/DX_i+VAP5;
}

and in the report file I find:

kernel.c(1722:5-1722:5):VEC:_H_calc: LOOP WAS VECTORIZED

However, for the same code on the x64 platform with the same settings, the report file says:

kernel.c(1721:5-1721:5):VEC:H_calc: loop was not vectorized: existence of vector dependence

How is this possible?
0 Kudos
1 Solution
TimP
Honored Contributor III
576 Views
Quoting - r_guidi

I am compiling under Windows, so the -ansi-alias setting is not available, and the flag
Language > Recognize Restrict Keyword is the same for both platforms (x64 and win32).

During some tests I noticed that the presence of the i and j indices in addition to k seems to cause the problem.
Is that possible?

The same option is spelled /Qansi-alias for Windows, or -ansi-alias for Linux.
It's equivalent to gcc's -fstrict-aliasing, which gcc enables by default at -O2 and above (an assertion that your program complies with the C or C++ standard rules disallowing aliasing of incompatible types).
Not setting this option is likely to be an impediment to optimization of multiple subscripts.
Microsoft C++ option /Oa includes the same effect, but also asserts that your code complies with Fortran-like rules about aliasing of function parameters.
I agree, I'm surprised to find it so difficult to navigate the Visual Studio properties settings.
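
To illustrate what the type-based aliasing assertion buys the optimizer, here is a minimal sketch. The function `scale_by_count` and its parameters are hypothetical and not taken from kernel.c. With /Qansi-alias (or -fstrict-aliasing) the compiler may assume that a store through a `float *` cannot change an `int` object, so it is free to load `*count` once before the loop. Without that assertion it has to allow for `out` and `count` overlapping.

```c
#include <stddef.h>

/* Hypothetical example, not the original kernel.
   out and count point to incompatible types (float vs. int). Under the
   strict-aliasing assertion the compiler may treat *count as loop-invariant,
   because a store to out[k] is assumed not to modify an int object. */
void scale_by_count(float *out, const int *count, size_t n)
{
    for (size_t k = 0; k < n; ++k) {
        out[k] = out[k] * (float)(*count);
    }
}
```

Whether this is what separates the win32 and x64 builds in the question is not established in this thread; the sketch only shows the kind of assumption the option enables.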

0 Kudos
7 Replies
Thomas_W_Intel
Employee
576 Views
The compiler has to be very conservative when determining whether there is a dependency. In fact, it cannot vectorize the code unless it can prove that there is none. Unfortunately, many things may lead to a dependency that is difficult or impossible for the compiler to rule out. Two arrays passed as function arguments that might overlap are one such example.
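
A small sketch of the overlapping-arguments case, using a hypothetical helper `shift_add` that is not part of the original code. When `a` and `b` are distinct arrays, every iteration is independent. When the caller passes the same array for both, iteration k reads the value that iteration k-1 just wrote, which is a loop-carried dependence. The compiler sees only the function body, so it has to assume the second case is possible.

```c
#include <stddef.h>

/* Hypothetical helper: adds b[k] into a[k + 1] for k in [0, n).
   a must have at least n + 1 elements. */
void shift_add(float *a, const float *b, size_t n)
{
    for (size_t k = 0; k < n; ++k) {
        /* If b aliases a, b[k] is the element written in the previous
           iteration, so iterations cannot run in parallel. */
        a[k + 1] = a[k + 1] + b[k];
    }
}
```

Calling `shift_add(a, a, 4)` on an array of five ones produces a running sum (1, 2, 3, 4, 5), while passing a separate array of ones for `b` produces (1, 2, 2, 2, 2). A vectorized loop would compute the second result in both cases, which is why the compiler refuses unless it can exclude the overlap.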

If anything changes in your build process when switching from 32 to 64 bit, it might prevent the compiler from reaching that conclusion. This is not limited to differences in the source code, e.g. through #ifdefs. Changes in data or code size can also affect which parts of the code are inlined, unrolled, and so on.

If you are sure that there are no dependencies, you can take over the responsibility from the compiler and add
#pragma ivdep
to your code. The compiler should then vectorize the loop in both cases.
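
As a sketch of where the pragma goes, assuming a simplified stand-in for the loop in the question (the function `scale_add` and its parameters are hypothetical): the pragma is placed directly before the loop it applies to. It tells the Intel compiler to ignore dependencies it only assumes; it does not override a dependency the compiler has actually proven. Other compilers may ignore the pragma, possibly with a warning.

```c
#include <stddef.h>

/* Hypothetical stand-in for the loop in the question. */
void scale_add(float *hz, const float *src, float c1, float c2, size_t n)
{
    /* Intel-specific hint: the programmer asserts that hz and src do not
       overlap in a way that creates a loop-carried dependence. */
#pragma ivdep
    for (size_t k = 0; k < n; ++k) {
        hz[k] = c1 * hz[k] + c2 * src[k];
    }
}
```

If the assertion is wrong and the arrays do overlap, the vectorized loop can silently produce different results, so the pragma is only appropriate when the caller's data layout is known.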

Kind regards
Thomas
0 Kudos
TimP
Honored Contributor III
576 Views
In the case presented, it seems that restrict qualifiers e.g.
float * restrict ..... ought to be sufficient, along with setting -ansi-alias.
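
A minimal sketch of the restrict approach, assuming C99 and a hypothetical function `update_field` whose parameter names are made up for illustration. Each `restrict` qualifier is a promise that, within the function, the pointed-to data is accessed only through that pointer, which removes the assumed overlap between the arrays. With the Intel compiler on Windows this relies on restrict being recognized (the Recognize Restrict Keyword setting mentioned in this thread).

```c
#include <stddef.h>

/* Hypothetical example, not the original kernel.
   restrict promises the compiler that hz, dx and dy do not overlap. */
void update_field(float *restrict hz,
                  const float *restrict dx,
                  const float *restrict dy,
                  float c, size_t n)
{
    for (size_t k = 0; k < n; ++k) {
        hz[k] = c * hz[k] + dx[k] - dy[k];
    }
}
```

Unlike a pragma on a single loop, the qualifier applies to every use of the pointer in the function, so it is the caller's responsibility never to pass overlapping arrays.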
0 Kudos
TimP
Honored Contributor III
576 Views
Quoting - tim18
This could happen if -ansi-alias were set in one case, but not the other.

0 Kudos
r_guidi
Beginner
576 Views
Thanks for the answers.
I tried using
#pragma ivdep
but the result is
VEC:H_calc: loop was not vectorized: dereference too complex.

I am compiling under Windows, so the -ansi-alias setting is not available, and the flag
Language > Recognize Restrict Keyword is the same for both platforms (x64 and win32).

During some tests I noticed that the presence of the i and j indices in addition to k seems to cause the problem.
Is that possible?
However, I still do not understand the difference in behaviour between the win32 and x64 platforms.
0 Kudos
jimdempseyatthecove
Honored Contributor III
576 Views
Quoting - r_guidi
Thanks for the answers.
I tried using
#pragma ivdep
but the result is
VEC:H_calc: loop was not vectorized: dereference too complex.

I am compiling under Windows, so the -ansi-alias setting is not available, and the flag
Language > Recognize Restrict Keyword is the same for both platforms (x64 and win32).

During some tests I noticed that the presence of the i and j indices in addition to k seems to cause the problem.
Is that possible?
However, I still do not understand the difference in behaviour between the win32 and x64 platforms.

What are the types of i, j, and k?
Can you experiment by changing them all to the same type, first "int" and then "intptr_t"?
The sizes of these variables differ between the two platforms, especially relative to what size_t is expected to be.
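
A sketch of that experiment, assuming the field is a flat array indexed as [i][j][k]; the function `scale_row` and its layout are hypothetical, since the original declarations are not shown in this thread. On win32 both int and pointers are 32 bits wide, while on x64 Windows int stays at 32 bits and pointers and size_t are 64 bits, so int subscripts have to be widened in each address computation. Using intptr_t for every index makes the index width match the pointer width on both platforms.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical example: scales one k-row of a flat [ni][nj][nk] array.
   All indices use intptr_t so they have pointer width on win32 and x64. */
void scale_row(float *field, intptr_t nj, intptr_t nk,
               intptr_t i, intptr_t j, float c)
{
    /* Compute the row base once, outside the loop, so the loop body
       has a single simple subscript. */
    float *row = field + (i * nj + j) * nk;
    for (intptr_t k = 0; k < nk; ++k) {
        row[k] *= c;
    }
}
```

Hoisting the i and j part of the subscript out of the loop is a separate design choice in this sketch; it may also help with a "dereference too complex" report, but that has not been verified here.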

Jim Dempsey
0 Kudos