I have a question about vectorization. I compile the following loop on the Win32 platform:
for(k=K_start;k {
Hz=VAP1*Hz+VAP2*VAP3/DY_j-VAP3*VAP4/DX_i+VAP5;
}
and the report file contains:
kernel.c(1722:5-1722:5):VEC:_H_calc: LOOP WAS VECTORIZED
However, for the same code on the x64 platform with the same settings, the report file says:
kernel.c(1721:5-1721:5):VEC:H_calc: loop was not vectorized: existence of vector dependence
How is this possible?
1 Solution
Quoting - r_guidi
I am compiling under Windows, so the -ansi-alias setting is not available, and the flag
Language > Recognize Restrict Keyword is the same for both platforms (x64 and Win32).
During some tests I noted that it seems to be the presence of the i and j indices in addition to k that causes the problem.
Is that possible?
The -ansi-alias option is equivalent to gcc's -fstrict-aliasing, which is on by default at -O2 and above (an assertion that your program complies with the C or C++ standard in not aliasing incompatible types).
Not setting this option is likely to be an impediment to optimization of loops with multiple subscripts.
The Microsoft C++ option /Oa includes the same effect, but also asserts that your code complies with Fortran-like rules about aliasing of function parameters.
I agree, I'm surprised to find it so difficult to navigate the Visual Studio properties settings.
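To make the type-based aliasing assertion concrete, here is a minimal sketch. The function name `accumulate` and its parameters are hypothetical and not taken from kernel.c. When the compiler is allowed to assume the standard's aliasing rules (-fstrict-aliasing in gcc, -ansi-alias in the Intel compiler; I believe the Windows spelling of the latter is /Qansi-alias, but please check your compiler documentation), it may treat the int store below as unable to modify any float element.

```c
#include <assert.h>

/* Hypothetical example: field is float, count is int. Under the C
   aliasing rules these incompatible types cannot legitimately refer
   to the same object, so the store to *count need not be treated as
   a possible write to field[k]. */
void accumulate(float *field, int *count, int n)
{
    assert(n >= 0);
    for (int k = 0; k < n; k++) {
        field[k] = 2.0f * field[k];
        *count += 1;   /* int store: assumed not to touch any float */
    }
}
```

Without that assertion, the compiler has to consider that `count` might point into `field`, which is one way an assumed dependence can block vectorization.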
7 Replies
The compiler needs to be very conservative in determining whether there is a dependency. In fact, the compiler cannot vectorize the code unless it can prove that there is no dependency. Unfortunately, many things may look like a dependency and are difficult or impossible for the compiler to rule out. Two possibly overlapping arrays passed as function arguments are one such example.
If anything changes in your build process when switching from 32 to 64 bit, it might prevent the compiler from reaching that conclusion. This includes not only differences in source code, e.g. through ifdefs. Changes in data or code size can also affect which parts of the code are inlined, unrolled, and so on.
If you are sure that there are no dependencies, you can take over the responsibility from the compiler and add
#pragma ivdep
to your code. The compiler should then vectorize the loop in both cases.
Kind regards
Thomas
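As a rough illustration of the overlapping-arrays case and where the pragma would go, here is a minimal sketch. The function `update_field`, its parameters, and the loop body are hypothetical stand-ins, not the code from kernel.c. The pragma is guarded so that compilers other than Intel's simply skip it.

```c
#include <assert.h>

/* Hypothetical stand-in for the loop in the question: dst and src are
   plain pointers, so the compiler cannot rule out that they overlap
   and has to assume a possible dependence between iterations. */
void update_field(float *dst, const float *src, float a, float b,
                  int k_start, int k_end)
{
    int k;
    assert(k_start <= k_end);
#if defined(__INTEL_COMPILER)
    /* Ask the Intel compiler to ignore assumed (unproven) dependences.
       If dst and src really do overlap, the result may be wrong. */
#pragma ivdep
#endif
    for (k = k_start; k < k_end; k++) {
        dst[k] = a * dst[k] + b * src[k];
    }
}
```

The pragma only shifts responsibility: the loop is correct when vectorized only if the caller never passes overlapping ranges.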
In the case presented, it seems that restrict qualifiers, e.g.
float * restrict .....
ought to be sufficient, along with setting -ansi-alias.
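A minimal sketch of what restrict-qualified parameters could look like follows. The function `h_update`, the array names, and the arithmetic are hypothetical, chosen only to resemble a field update. It assumes a C99 compiler, or the Recognize Restrict Keyword setting mentioned in this thread when using the Intel compiler under Visual Studio.

```c
#include <assert.h>

/* Hypothetical example: restrict is the caller's promise that, for the
   duration of the call, the data reached through hz is not also
   accessed through ex or ey. With that promise the compiler does not
   need to assume a dependence between the loads and the store. */
void h_update(float *restrict hz, const float *restrict ex,
              const float *restrict ey, float c1, float c2, int n)
{
    assert(n >= 0);
    for (int k = 0; k < n; k++) {
        hz[k] = c1 * hz[k] + c2 * (ex[k] - ey[k]);
    }
}
```

Unlike a pragma on one loop, the qualifier documents the no-overlap contract in the function signature, so it applies to every loop in the function body.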
Thanks for the answers.
I tried using
#pragma ivdep
but the result is
VEC:H_calc: loop was not vectorized: dereference too complex.
I am compiling under Windows, so the -ansi-alias setting is not available, and the flag
Language > Recognize Restrict Keyword is the same for both platforms (x64 and Win32).
During some tests I noted that it seems to be the presence of the i and j indices in addition to k that causes the problem.
Is that possible?
However, I still do not understand the difference in behaviour between the Win32 and x64 platforms.
Quoting - r_guidi
During some tests I noted that it seems to be the presence of the i and j indices in addition to k that causes the problem.
Is that possible?
What are the types of i, j, and k?
Can you experiment by making them all the same type, first int and then intptr_t?
Variable sizes differ between the two platforms, especially relative to size_t, which is 32 bits on Win32 and 64 bits on x64.
Jim Dempsey
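One way to run that experiment is sketched below. The flattened layout, the helper `idx3`, and the function `scale_line` are assumptions for illustration, since the original array declarations are not shown in this thread. All indices use intptr_t, and the part of the address that depends only on i and j is computed once outside the k loop. That hoisting is my own suggestion for simplifying the subscript, not something confirmed here to cure the "dereference too complex" message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flattened 3-D index for an array of nx * ny * nz floats. */
static size_t idx3(intptr_t i, intptr_t j, intptr_t k,
                   intptr_t ny, intptr_t nz)
{
    return (size_t)((i * ny + j) * nz + k);
}

/* Scale one line of the field along k. Every index has the same
   pointer-sized type, so no widening from int is needed on x64. */
void scale_line(float *hz, intptr_t i, intptr_t j,
                intptr_t ny, intptr_t nz,
                intptr_t k_start, intptr_t k_end, float a)
{
    assert(k_start <= k_end);
    /* i and j are loop-invariant: compute the row base once so the
       inner loop sees a plain unit-stride access in k. */
    float *row = hz + idx3(i, j, 0, ny, nz);
    for (intptr_t k = k_start; k < k_end; k++) {
        row[k] = a * row[k];
    }
}
```

Swapping intptr_t for int in this sketch and comparing the two vectorization reports on Win32 and x64 would show whether the index type is what changes the compiler's decision.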
