Intel® C++ Compiler
Community support and assistance for creating C++ code that runs on platforms based on Intel® processors.

compiler option O0 and O2 generate different results

Robin
Beginner

Hi all,

I am optimizing my code. When I build with the -O2 compiler option and run the code twice, the results differ between runs. When I switch to -O0, the results are the same on every run. What might be the reason?

BTW, I am using CentOS 6.5 and the command line is as follows:

icpc -O2 -DALIGN_OPT=16 -DSSE -ipo main_turbo_decoding_cfunc.cpp turbo_decoding_cfunc.cpp -o turbo_decoding -lrt

Thank you!

TimP
Honored Contributor III

If you are building in 32-bit mode, run-to-run variations are likely when data alignment changes between runs and the compiler vectorizes reductions. Your defines seem to indicate some awareness of this. Compiler options such as -fp-model source disable optimizations of this kind.
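To illustrate why the grouping of a reduction matters, here is a minimal sketch in plain C++. It does not use real SSE intrinsics; `sum_four_lanes` is a hypothetical stand-in that only mimics how a 4-wide vectorized loop keeps four partial sums, and both function names are made up for this example. The assumption is that a vectorizing compiler may peel a different number of elements before the aligned loop depending on the buffer address, which changes which elements land in which partial sum.

```cpp
// Plain left-to-right sum, the order unvectorized code would typically use.
float sum_sequential(const float* x, int n) {
    float s = 0.0f;
    for (int i = 0; i < n; ++i) {
        s += x[i];
    }
    return s;
}

// Hypothetical model of a 4-wide vector reduction: four interleaved
// partial sums that are combined only at the end.
float sum_four_lanes(const float* x, int n) {
    float lane[4] = {0.0f, 0.0f, 0.0f, 0.0f};
    int i = 0;
    for (; i + 4 <= n; i += 4) {
        for (int k = 0; k < 4; ++k) {
            lane[k] += x[i + k];
        }
    }
    // Combine the partial sums pairwise, then add any leftover elements.
    float s = (lane[0] + lane[1]) + (lane[2] + lane[3]);
    for (; i < n; ++i) {
        s += x[i];
    }
    return s;
}
```

For the input `{1e30f, 1.0f, -1e30f, 1.0f}`, `sum_sequential` returns 1 and `sum_four_lanes` returns 0, because adding 1 to 1e30f is lost to rounding. The same data gives different answers purely because the additions are grouped differently. Integer arithmetic does not round this way, so this effect would not apply to pure fixed-point code unless it saturates or overflows.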

Uninitialized data is another possibility.

Robin
Beginner

Hi Prince

Thank you for the comments.

The -fp-model option is for floating-point calculation, right? But I am using fixed-point calculation.

I compare the results by printing internal variables with printf on consecutive runs. I find that the -O0 results are the same on every run, but the -O2 results differ. Does -O2 apply any optimization to printf?


jimdempseyatthecove
Honored Contributor III

I share Tim's suspicion that you are using uninitialized (or unintended) data as input, and that its value changes unexpectedly between runs. Note that this can also include unintended output, where in the -O0 case the affected location does not cause the error to be observed.
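One way to hunt for this kind of bug is to fill work buffers with a known sentinel instead of leaving them indeterminate, then check which elements were never written. The following is only a sketch: `kPoison`, `make_poisoned_buffer`, and `count_unwritten` are hypothetical names, and it assumes 16-bit fixed-point samples and a sentinel value that the decoder never produces legitimately.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical sentinel: assumed to be a value the decoder never produces.
const std::int16_t kPoison = INT16_MIN;

// Allocate a work buffer with every element set to the sentinel, so that
// no element starts out holding an indeterminate value.
std::vector<std::int16_t> make_poisoned_buffer(std::size_t n) {
    return std::vector<std::int16_t>(n, kPoison);
}

// Count the elements that still hold the sentinel, i.e. were never written.
std::size_t count_unwritten(const std::vector<std::int16_t>& buf) {
    std::size_t count = 0;
    for (std::size_t i = 0; i < buf.size(); ++i) {
        if (buf[i] == kPoison) {
            ++count;
        }
    }
    return count;
}
```

If a buffer that should be completely filled before it is read reports a nonzero count, the code that reads it is consuming data nothing ever wrote. With a fixed sentinel the -O0 and -O2 builds should then misbehave in the same, repeatable way, which makes the problem much easier to track down.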

Also, if your code is parallel code, you may have a non-deterministic algorithm that is "sticky" in O0 but not in O2.

Jim Dempsey
