Just in case someone else is having the same problem: I didn't spot this specific workaround in the documentation (which is not to claim that it's not there, I just didn't spot it when I was looking for one).
Pentium III cannot vectorize (int * float) operations. It CAN, however, vectorize (float * float). Thus:
================================
// This cannot be vectorized:
int x;
int DataC = 10;
float DataV[DataC];
float Multiplier;
for (x = 0; x < DataC; x++)
{
DataV[x] = x * Multiplier;
}
================================
// This CAN be vectorized:
int x;
int DataC = 10;
float DataV[DataC];
float Multiplier;
float xx; // Pentium III Vectorization Hack
xx=0;
for (x = 0; x < DataC; x++)
{
DataV[x] = xx * Multiplier;
xx++;
}
I hope this is useful to someone. It certainly makes a big difference in my test program.
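For anyone who wants to try this, here is a minimal sketch that puts both loop forms into functions and checks that they produce the same values. The function names (`fill_int_index`, `fill_float_counter`, `count_mismatches`) and the `DATA_COUNT` macro are my own placeholders, not anything from the compiler or its documentation, and the sketch says nothing about whether a given compiler version will actually vectorize either loop.

```c
#define DATA_COUNT 10

/* Original form: the int index is converted to float on every pass. */
static void fill_int_index(float *out, int n, float multiplier)
{
    int x;
    for (x = 0; x < n; x++)
        out[x] = x * multiplier;
}

/* Workaround form: a float counter shadows the int index. */
static void fill_float_counter(float *out, int n, float multiplier)
{
    int x;
    float xx = 0.0f;
    for (x = 0; x < n; x++) {
        out[x] = xx * multiplier;
        xx += 1.0f;
    }
}

/* Returns how many elements differ between the two forms. */
static int count_mismatches(float multiplier)
{
    float a[DATA_COUNT], b[DATA_COUNT];
    int x, bad = 0;

    fill_int_index(a, DATA_COUNT, multiplier);
    fill_float_counter(b, DATA_COUNT, multiplier);
    for (x = 0; x < DATA_COUNT; x++)
        if (a[x] != b[x])
            bad++;
    return bad;
}
```

One caveat with the float counter: a 32-bit float can only count exactly up to 16777216 (2^24). Past that, adding 1.0f no longer changes the value, so the workaround is only safe for loops shorter than that.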
3 Replies
Oh dear, replying to my own post. :-/
This "hack" only seems to work on ICC v9. ICC v10 still fails to vectorize, this time not because it's an unsupported operation (i.e. int * float on a P3), but because it thinks that xx is vector dependent on (unknown)...
You can add #pragma ivdep in front of the loop.
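As a rough sketch of where the pragma would go: `#pragma ivdep` is an Intel compiler hint that tells the vectorizer to ignore assumed (unproven) dependences in the loop that follows. The function name `scale_by_index` is a made-up placeholder, and the `__INTEL_COMPILER` guard is only there so other compilers don't complain about an unknown pragma. Whether this actually clears the dependence reported on xx is not something this sketch can confirm.

```c
/* Hypothetical helper: fills data[x] with x * multiplier using a float counter. */
void scale_by_index(float *data, int count, float multiplier)
{
    int x;
    float xx = 0.0f;

    /* Intel-specific hint: ignore assumed vector dependences in the next loop. */
#ifdef __INTEL_COMPILER
#pragma ivdep
#endif
    for (x = 0; x < count; x++) {
        data[x] = xx * multiplier;
        xx += 1.0f;
    }
}
```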
That wouldn't help in this case. My understanding of why the loop doesn't get vectorized is that SSE only supports float/float operations. Thus, int*float cannot be vectorized. Casting an int to a float on every pass is inefficient, but just keeping a float version of the iterator seems to work quite well.
Of course, on P4 or later x86 processors it's irrelevant, because SSE2 and later support int * float operations (SSE2 adds CVTDQ2PS, a packed int-to-float conversion on XMM registers).
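To illustrate why the float counter fits SSE, here is a hand-written sketch using only SSE (not SSE2) intrinsics from `xmmintrin.h`: four float indices sit in one register, get multiplied, and are then bumped by 4.0 each pass, so no int-to-float conversion happens inside the vector loop. This is my own illustration of the idea, not what ICC actually generates; the function name `fill_scaled_sse` and the `HAVE_SSE1` macro are placeholders.

```c
#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#include <xmmintrin.h>  /* SSE (Pentium III level) intrinsics */
#define HAVE_SSE1 1
#endif

/* Hypothetical helper: out[x] = x * multiplier for x in [0, n). */
void fill_scaled_sse(float *out, int n, float multiplier)
{
    int x = 0;

#ifdef HAVE_SSE1
    /* Four consecutive float indices, kept as floats throughout. */
    __m128 idx  = _mm_setr_ps(0.0f, 1.0f, 2.0f, 3.0f);
    __m128 step = _mm_set1_ps(4.0f);
    __m128 mul  = _mm_set1_ps(multiplier);

    for (; x + 4 <= n; x += 4) {
        _mm_storeu_ps(out + x, _mm_mul_ps(idx, mul)); /* 4 results per pass */
        idx = _mm_add_ps(idx, step);                  /* advance the float counters */
    }
#endif

    /* Scalar tail (or the whole loop when SSE is unavailable). */
    for (; x < n; x++)
        out[x] = (float)x * multiplier;
}
```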