Intel® C++ Compiler

Loop Reversal : Optimization problem with -O default option

djambhale
Beginner
Hi,

We are seeing a strange problem caused by compiler optimization at the default optimization level (-O, which is the same as -O2) on the configuration below.

OS : Linux 2.6.9-5.ELsmp #1 SMP Wed Jan 5 19:29:47 EST 2005 x86_64 x86_64 x86_64 GNU/Linux
Compiler : Intel C++ Compiler for applications running on Intel 64, Version 10.0 Build 20070426 Package ID: l_cc_p_10.0.023
Library version :
Compiler flags used :
-O -c -D_REENTRANT -DVS_USE_64_CALLS -D_LARGEFILE_SOURCE -D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64 -fpic -Wp64 -Wall -nolib-inline -fno-exceptions


We have a while loop of the following type :
===========
while (ct != 0)
{
    --ct;
    ASSERT((shiftPtr - 1)->gap_start < (shiftPtr - 1)->gap_end,
           "sl_info_addGap: forward shifted gap is bad");
    *shiftPtr = *(shiftPtr - 1);
    --shiftPtr;
    //syslog (LOG_ERR," ct = %d ", ct );
}
===========

shiftPtr is a pointer into an array of structures. The loop is intended to shift the array contents one place toward the end of the array, starting from the last location (i.e. last = prev), in order to make room to insert something.

The compiler is performing a loop reversal optimization, as seen in the optimization report. Because of the reversal, the wrong values are assigned, as demonstrated below:

We intend the following to happen :

array[4] = array[3];
array[3] = array[2];
array[2] = array[1];

But with the loop reversal, the assignments happen in this order:
array[2] = array[1];
array[3] = array[2];
array[4] = array[3];

As a result, array[4], array[3] and array[2] all end up with the value of array[1], which leads to serious SEGVs later in our code.

We have found two workarounds. With either one, the optimization report shows that the compiler no longer performs the loop reversal, and the problem goes away.
1. Add a syslog/printf call inside the loop, as in the commented-out line in the code above.
2. Put the following pragma ahead of our while loop:
#pragma novector


However, we would like to know whether there is a more robust way (a compiler flag or switch, or a change to the loop code) to make the result predictable.
Any suggestions are welcome.

Thanks
Arati
2 Replies
TimP
Honored Contributor III
As you already showed, the problem appears to be in the vectorization. You should file a problem report, preferably after checking that the problem still exists in a recent compiler, and use #pragma novector in the meantime. The compiler shouldn't reverse such a loop unless you used a directive such as #pragma ivdep or #pragma vector always, which would allow it to proceed without proving that there are no dependencies. I don't believe the compiler is currently capable of vectorizing this loop correctly, except by using a temporary copy.

Dale_S_Intel
Employee
I can't reproduce the problem with the snippet you've provided (i.e. when I construct a runnable test case with this loop it works fine using the same compiler you're using). I notice you're using an older compiler. I'd recommend you try the latest 10.1 compiler, and if you still see the problem either post a runnable test case that illustrates the problem or submit the issue to http://premier.intel.com.

Good luck!

Dale
