We have a function that runs within a large code base. When we compile with gcc and link with g++, all the code works great. When we compile and link with icpc at -O0 or -O1, it also works great, but with -O2 it fails. I have isolated the issue to this specific function: if I put
#pragma optimize("", off)
on it, then the code runs fine with -O2. We aren't sure which optimization the compiler is performing that causes this issue. Here is the function for reference.
int Deinterleaver (
    const double_complex *inBuf,   // I: input vector representing interleaved data
    int inLen,                     // I: input length (should be multiple of K)
    const int K,                   // I: associated interleaver block size
    double_complex *outBuf         // O: output vector representing deinterleaved data
)
{
    const int *interleaver = getInterleaver(K);
    if (! interleaver) return 0;   // check for no interleaver
    const int Nblocks = inLen / K;
    for (int block = 0; block < Nblocks; ++block)
    {
        for (int i = 0; i < K; ++i)
        {
            // element i of the input block goes to position interleaver[i]
            const int interleaverIdx = interleaver[i];
            outBuf[interleaverIdx] = inBuf[i];
        }
        // prepare for next block
        inBuf += K;
        outBuf += K;
    }
    return Nblocks * K;
}
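To make the intended indexing concrete, here is a minimal sketch of what one block of the inner loop is meant to do. It is not the production code: deinterleaveBlock is a hypothetical helper, the table is passed in directly instead of coming from getInterleaver, and double_complex is assumed to be std::complex<double> since its definition is not shown above.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

// Assumption: the real double_complex typedef is not shown in the post.
typedef std::complex<double> double_complex;

// Hypothetical stand-in for one block of Deinterleaver:
// input element i is written to output position table[i].
std::vector<double_complex> deinterleaveBlock(const std::vector<double_complex> &in,
                                              const std::vector<int> &table)
{
    assert(in.size() == table.size());
    std::vector<double_complex> out(in.size());
    for (std::size_t i = 0; i < in.size(); ++i)
        out[table[i]] = in[i];
    return out;
}
```

For example, with a made-up table {2, 0, 3, 1} and input {10, 11, 12, 13}, the loop writes out[2] = 10, out[0] = 11, out[3] = 12 and out[1] = 13, so the output block is {11, 13, 10, 12}. An output identical to the input would mean the scatter was not applied.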
I've also attached the optimization reports for the function in question, comparing a build at -O2 where Deinterleaver is optimized against a build at -O2 with #pragma optimize("", off) applied to Deinterleaver. As you can see in the attached snippets, there seems to be some sort of reversal of order...
Any help is greatly appreciated.
Thanks for responding so quickly. I don't think I did a good job of explaining what happens when optimization for that function is turned on or off. When optimization is turned on by commenting out the #pragma line, the program does not crash; it runs fine, it just doesn't produce the expected results. When optimization is turned off, the program produces the expected results.
I put in some debug print statements and attached the input and output of the function. The input is at the top, titled trafficChannelFrame[1:384], and the output is about 384 lines below, titled deinterleavedData[1:384]. You can see that without optimization, deinterleaving has an effect on the input data, but with optimization no deinterleaving is performed.
I should add that we use two different instances of the deinterleaver, with block sizes of 128 and 384. The 128 case works all the time (with and without the pragma).
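A standalone check along the following lines might help narrow this down. It is only a sketch under several assumptions: makeTestTable is a hypothetical permutation standing in for getInterleaver (the real tables are not shown here), double_complex is assumed to be std::complex<double>, and deinterleaveUnderTest copies the loop structure of the posted function with the table passed in explicitly. There is no guarantee it reproduces the failure, since the real table and calling context differ.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

// Assumption: the real double_complex typedef is not shown in the post.
typedef std::complex<double> double_complex;

// Hypothetical permutation standing in for getInterleaver(K).
// i -> (5 * i) % K is a bijection when K is not a multiple of 5 (true for 128 and 384).
std::vector<int> makeTestTable(int K)
{
    assert(K > 0 && K % 5 != 0);
    std::vector<int> table(K);
    for (int i = 0; i < K; ++i)
        table[i] = (5 * i) % K;
    return table;
}

// Same loop structure as the posted function, with the table passed in explicitly.
int deinterleaveUnderTest(const double_complex *inBuf, int inLen, int K,
                          const int *table, double_complex *outBuf)
{
    const int Nblocks = inLen / K;
    for (int block = 0; block < Nblocks; ++block)
    {
        for (int i = 0; i < K; ++i)
            outBuf[table[i]] = inBuf[i];
        inBuf += K;   // advance to the next block
        outBuf += K;
    }
    return Nblocks * K;
}

// Runs one block size and counts output elements that are not where the table
// says they should be (plus one if the returned length is wrong).
int countMismatches(int K, int Nblocks)
{
    const std::vector<int> table = makeTestTable(K);
    const int len = K * Nblocks;
    std::vector<double_complex> in(len), out(len);
    for (int n = 0; n < len; ++n)
        in[n] = double_complex(n, -n);   // distinct, easy-to-recognise values

    const int done = deinterleaveUnderTest(in.data(), len, K, table.data(), out.data());
    int bad = (done == len) ? 0 : 1;
    for (int b = 0; b < Nblocks; ++b)
        for (int i = 0; i < K; ++i)
            if (out[b * K + table[i]] != in[b * K + i])
                ++bad;
    return bad;
}
```

Calling countMismatches(128, 3) and countMismatches(384, 3) from a small main, built once at -O1 and once at -O2, would show whether the mismatch count is nonzero only for the 384 case at -O2. Swapping in the real Deinterleaver and the real table is the obvious next step if the stand-in does not fail.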
I understand. I can try to create a wrapper that isolates this problem at some point, and then I will post back on this forum. But it may be a while.
Thanks for your time,