Hi all,
I am optimizing my code. When I build with the -O2 compiler option and run the code twice in a row, the results differ between the two repeats. When I switch to -O0, the results are the same. What might be the reason?
BTW, I am using CentOS 6.5 and the command line is as follows:
icpc -O2 -DALIGN_OPT=16 -DSSE -ipo main_turbo_decoding_cfunc.cpp turbo_decoding_cfunc.cpp -o turbo_decoding -lrt
Thank you!
If you build in 32-bit mode, run-to-run variations caused by changing data alignment in vector reductions are likely. Your defines seem to indicate some awareness of this. Compiler options such as -fp-model source disable optimizations of this nature.
Uninitialized data is another possibility.
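To illustrate the alignment point, here is a minimal sketch of why a vectorized floating-point reduction can return different sums depending on how many leading elements are handled before the aligned body starts. It only applies to floating-point code. The function names, the 4-lane width, and the `peel` parameter are assumptions made for the illustration; they are not what the compiler actually generates.

```cpp
#include <cstddef>
#include <vector>

// Plain left-to-right sum, as an unvectorized loop would compute it.
float sum_sequential(const std::vector<float>& v) {
    float s = 0.0f;
    for (float x : v) s += x;
    return s;
}

// Hypothetical model of a vectorized reduction: `peel` leading elements are
// summed one by one (standing in for the elements before an aligned address),
// the body uses 4 independent partial sums, and leftovers go to a scalar tail.
float sum_with_peel(const std::vector<float>& v, std::size_t peel) {
    const std::size_t lanes = 4;
    std::size_t i = 0;

    float head = 0.0f;
    for (; i < peel && i < v.size(); ++i) head += v[i];

    float acc[lanes] = {0.0f, 0.0f, 0.0f, 0.0f};
    for (; i + lanes <= v.size(); i += lanes)
        for (std::size_t l = 0; l < lanes; ++l) acc[l] += v[i + l];

    float tail = 0.0f;
    for (; i < v.size(); ++i) tail += v[i];

    // Combine the partial sums; the grouping differs from the sequential order.
    return head + ((acc[0] + acc[1]) + (acc[2] + acc[3])) + tail;
}
```

With an input such as `{1e8f, 1.0f, -1e8f, 1.0f, 1.0f, 1.0f, 1.0f, 1.0f}`, the sequential sum and the sums for `peel = 0` and `peel = 1` all differ under strict single-precision evaluation, because float addition is not associative. If the buffer's alignment changes from run to run, the effective peel count can change with it, which is the kind of variation a stricter floating-point model is meant to suppress.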
Hi Prince
Thank you for the comments.
The -fp-model option is for floating-point calculation, right? But I am using fixed-point calculation.
I compare the results by printing an internal variable with printf on consecutive repeats. With -O0 the results are the same in each repeat, but with -O2 they differ. Does -O2 apply any optimization to printf?
Tim Prince wrote:
If you use 32 bit mode, run to run variations due to changing data alignment with vector reductions are likely. Your defines seem to indicate some awareness of this. Compiler options like -fp-model source remove optimization of this nature.
uninitialized data are another possibility.
I share Tim's suspicion that you are using uninitialized (or unintended) data as input, and that its value unexpectedly changes between runs. Note that this can also include unintended output, where in the -O0 case the affected location happens not to expose the error.
Also, if your code is parallel, you may have a non-deterministic algorithm that is "sticky" at -O0 but not at -O2.
Jim Dempsey
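One way to test the uninitialized-data suspicion without relying on undefined behavior is to pre-fill the scratch memory with two different known patterns and check whether the result changes. The sketch below is only an illustration under assumptions: `buggy_stage`, `fixed_stage`, and `depends_on_stale_memory` are hypothetical names, and the stage is a stand-in for a fixed-point routine, not the poster's turbo decoder.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical fixed-point stage that is supposed to overwrite all of `state`
// before reading it, but deliberately skips the last slot.
int32_t buggy_stage(std::vector<int16_t>& state, const std::vector<int16_t>& input) {
    for (std::size_t i = 0; i + 1 < state.size(); ++i)  // bug: last slot never written
        state[i] = input[i];
    int32_t acc = 0;
    for (int16_t s : state) acc += s;  // reads the stale last slot
    return acc;
}

// Same stage with every slot written before it is read.
int32_t fixed_stage(std::vector<int16_t>& state, const std::vector<int16_t>& input) {
    for (std::size_t i = 0; i < state.size(); ++i)
        state[i] = input[i];
    int32_t acc = 0;
    for (int16_t s : state) acc += s;
    return acc;
}

// Runs `stage` on two scratch buffers pre-filled with different patterns.
// A differing result means the stage reads memory it never wrote.
template <typename Stage>
bool depends_on_stale_memory(Stage stage, const std::vector<int16_t>& input) {
    std::vector<int16_t> a(input.size(), 0x5A5A);
    std::vector<int16_t> b(input.size(), 0x2525);
    return stage(a, input) != stage(b, input);
}
```

Calling `depends_on_stale_memory(buggy_stage, input)` with a non-empty input reports a dependency, while `fixed_stage` does not. In a real program the stale value comes from whatever was left on the stack or heap, and that can differ between -O0 and -O2 because the optimizer lays out and reuses storage differently.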
