Intel® C++ Compiler
Community support and assistance for creating C++ code that runs on platforms based on Intel® processors.

Niggling optimization problem

tjmor
Beginner
647 Views

Hello all,

I have some string and vector classes with an overloaded operator= that accepts negative indices which mean "index from the end of the string", ala Python:

inline

char String::operator = (int pos) const

{

if (pos < 0)

pos += len;

return data[pos];

}

I was hoping that when it was used in loops like this:

for (int i = 0; i < 1000; ++i)

if (s == 'X')

s = 'x';

the compiler would work out that 'i' could never be negative and eliminate the negative check in the inlined version of the operator function. Unfortunately, no matter what I do the check is always generated. I know this is a small problem but it's niggling at me smiley [:-)] Can anyone give any insight into why this happens and if it's possible to get rid of the check?

0 Kudos
13 Replies
JenniferJ
Moderator
648 Views
did you use -fast option or -O2 -QxT -Qipo?
0 Kudos
tjmor
Beginner
648 Views
Yes, compiling with /O3 and everything cranked up to the max. I suppose you can reduce the problem to something like:
for (int i = 0; i < 1000; ++i) {
if (i < 0)
printf("impossible");
}

On my setup code is generated for this loop and 'i' is compared with 0 in each iteration. Interestingly if '1000' is changed so that the loop has at most 12 iterations it's eliminated entirely. I think think this corresponds to it being unrolled as it seems to be possible to make the same happen for larger loops using #pragma unroll(255).


0 Kudos
JenniferJ
Moderator
648 Views

"printf()" is not a good example. it turns off the optimization.

I don't see the problem with following code with "icl -O3 -FA -c t.cpp". Maybe you're using an old compiler. The latest is 10.1.013.


class mystr
{
public:
mystr()
{
len = 10;
}
virtual ~mystr() {}
inline char operator = (int pos) const
{
if (pos < 0)
pos += len;
return data[pos];
}
inline char operator [] (int pos) const
{
return data[pos];
}
inline char & operator [] (int pos)
{
return data[pos];
}
private:
char data[2000];
int len;
};

void bar()
{
mystr s;

for (int i = 0; i < 1000; ++i)
if (s == 'X')
s = 'x';
}

int foo()
{
int ret = 0;
for (int i = 0; i < 1000; ++i)
{
if (i < 0)
ret = 0;
else
ret += i;
}
return i;
}

0 Kudos
tjmor
Beginner
648 Views
Thanks for taking the time to test this. After rereading my initial post I realize I wrote operator= when I meant operator[]. Sorry smiley [:-)] Nevertheless the problem is still there.

Below is some test code in terms of "mystr". With the check inside operator [], the compiler emits the tests inside the bar() loop on my system. They show up as "test reg, reg" followed by a conditional jump - easier to see with unrolling turned off.

class mystr
{
public:
mystr()
{
len = 10;
}

inline char operator [] (int pos) const
{
if (pos < 0)
pos += len;
return data[pos];
}

inline char &operator [] (int pos)
{
if (pos < 0)
pos += len;
return data[pos];
}

private:

char data[2000];
int len;
};

void bar()
{
mystr s;
for (int i = 0; i < 1000; ++i)
if (s == 'X')
s = 'x';
}

0 Kudos
JenniferJ
Moderator
648 Views

Yes, with this testcase, it still checks for "pos < 0".

I'll file a tracker on this. When there's a fix, I'll post the news here.

0 Kudos
tjmor
Beginner
648 Views
Thank you. I suppose it's worth noting that both gcc4 and vc9 do this too - that's why I wondered whether there might be some subtle reason the compiler can't assume 'i' is always positive.

0 Kudos
jimdempseyatthecove
Honored Contributor III
648 Views

TJ

Have you considered using a computational method to produce the index and thus eliminate the if test?

Assume int is 32 bits and len is never more than 16 bits

return data[pos+((pos>>16)&len)];

Use a slightly different statemtent for 64-bit int.

This eliminates the branch around the negative indexing. Note, if the compiler optimization is smart enough to know that (pos>>16) == 0 then it should remove the +((pos>>16)&len) from the expression.

If len is permitted to exceed 16 bits then there are other ways to quicklyflood the & mask with 1's when pos is negative (e.g. >>31).

Jim Dempsey

0 Kudos
tjmor
Beginner
648 Views
Thanks for your suggestion. I tried removing the branch with
pos += len & (pos >> 31);
and found that it was slower than the version with the branch, perhaps because the branch is correctly predicted 100% of the time when the sign of the index doesn't change. In fact, the overhead of the branch being there is probably about zero, but it does prevent any loop using this overloaded operator to access characters from being vectorized.

0 Kudos
jimdempseyatthecove
Honored Contributor III
648 Views

TJ,

You do not want the pos+=. Just the simple expression inside the []

inline char operator = (int pos) const
{
return data[pos+((pos>>16)&len)];
}
or something like:
inline char operator = (int pos) const
{
return data[pos+((((short*)(&pos))[1])&len)];
}
But that won't work if pos is an expression
The above will require than len < 16 bits
else use the following
inline char operator = (int pos) const
{
return data[pos+((pos>>31)&len)];
}
Jim
0 Kudos
jimdempseyatthecove
Honored Contributor III
648 Views

I forgot to mention. Sometimes the triglyph produces good code

return data[pos<0 ? len-pos : pos];

The higher level processors have the CMOVcc instruction and often the triglyph will use that now.

Jim

0 Kudos
JenniferJ
Moderator
648 Views

the problem related to "if (post <0)" extra-checking is fixed. It will be in the next major release.

Let me check all the other similar issues posted later to see if it's fixed as well.

0 Kudos
tjmor
Beginner
648 Views
Just a note to say I re-tested the above code on 10.1.013 Linux and the test is still generated -- though I don't know if by major release you mean 11.0.

0 Kudos
JenniferJ
Moderator
648 Views
Yes, next major release is 11.0. I should have been more clear on this.
0 Kudos
Reply