Maybe by using the SIMD intrinsics (https://software.intel.com/en-us/node/513376), if your compiler is recent enough that it uses SSE instructions at -O0. Evidently, there is no such facility under -mia32.
Changes in the default setting shouldn't occur without notification, but Sandy Bridge and newer CPUs were changed so that underflow in add/subtract is no longer expensive. That was supposed to eliminate the original performance reason for setting ftz.
If your main() is compiled with gcc or msvc++, normal compile options would not give you ftz; you would get it only by using the intrinsic. So the expected icc behavior is non-portable. As those compilers improved their support for vectorization, the hardware needed to change to fix the performance issue.
The article https://software.intel.com/en-us/node/513376 apparently forgot to mention the effect of -fp-model settings on default ftz setting.
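For readers who want the intrinsic route regardless of compiler or optimization level, here is a minimal sketch. It assumes an x86 target where floating-point arithmetic goes through SSE and where `<xmmintrin.h>` is available; the helper names `set_ftz`, `ftz_enabled` and `tiny_product` are made up for illustration and are not part of any library.

```c
#include <float.h>
#include <xmmintrin.h>  /* _MM_SET_FLUSH_ZERO_MODE, _MM_GET_FLUSH_ZERO_MODE */

/* Turn flush-to-zero on (nonzero) or off (0) in MXCSR for the calling thread. */
static void set_ftz(int enable)
{
    _MM_SET_FLUSH_ZERO_MODE(enable ? _MM_FLUSH_ZERO_ON : _MM_FLUSH_ZERO_OFF);
}

/* Report whether the FTZ bit in MXCSR is currently set. */
static int ftz_enabled(void)
{
    return _MM_GET_FLUSH_ZERO_MODE() == _MM_FLUSH_ZERO_ON;
}

/* Multiply the smallest normal float by 0.3 with an SSE instruction.
   The true result is below FLT_MIN, so it underflows: it comes back as a
   subnormal with FTZ off and as zero with FTZ on. The volatile operands
   keep the compiler from folding the product at compile time. */
static float tiny_product(void)
{
    volatile float a = FLT_MIN;
    volatile float b = 0.3f;
    return _mm_cvtss_f32(_mm_mul_ss(_mm_set_ss(a), _mm_set_ss(b)));
}
```

Calling `set_ftz(1)` early in main() and then checking `tiny_product()` is one way to confirm which mode a given build actually runs in. The setting is per thread, so threads that need a particular mode may have to set it themselves.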
Tim P. wrote: "The article https://software.intel.com/en-us/node/513376 apparently forgot to mention the effect of -fp-model settings on default ftz setting."
So is the compiler also changing -fp-model when I use -O0, or what are you implying here? My post was mostly meant as a notice that the documentation of -ftz is insufficient, because it fails to mention that -ftz depends on optimization being active.
I don't know whether that is a good way of putting it, but -fp-model doesn't have its normal effects at -O0. What I meant is that certain settings, such as -fp-model precise, imply -no-ftz, although that hasn't been consistent. That's another view of your assertion that -ftz isn't fully documented in any one place, and of my assertion that modern CPUs would allow it to be set by default in fewer contexts (provided, of course, that we are given sufficient warning of changes).
I've also had to test whether -ftz interacts with -[no]prec-div and -[no]prec-sqrt, having seen unexpected settings and results there, as it is a possible cause of NaN results.
Try compiling your main with full optimizations and debug options (main will init the floating-point mode), and compile everything else with debug (-O0). If necessary, make a new main stub that calls your (renamed) main.
Jim Dempsey
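A rough sketch of the stub idea follows, with one variation that is not part of the suggestion above: rather than relying on the compiler to initialize the mode in an optimized main, the stub sets FTZ explicitly with the SSE intrinsic, so it behaves the same at -O0 and with other compilers. The names `app_main` and `stub_main` are hypothetical, and `app_main` is only a stand-in for the renamed original main.

```c
#include <xmmintrin.h>  /* _MM_SET_FLUSH_ZERO_MODE, _MM_GET_FLUSH_ZERO_MODE */

/* Hypothetical stand-in for the application's original main(), renamed.
   It returns 0 if it sees FTZ enabled and 1 otherwise. */
static int app_main(int argc, char **argv)
{
    (void)argc;
    (void)argv;
    return _MM_GET_FLUSH_ZERO_MODE() == _MM_FLUSH_ZERO_ON ? 0 : 1;
}

/* Body of the new stub: fix the floating-point mode first, then hand
   control to the renamed application entry point. */
static int stub_main(int argc, char **argv)
{
    _MM_SET_FLUSH_ZERO_MODE(_MM_FLUSH_ZERO_ON);
    return app_main(argc, argv);
}
```

In a real program `stub_main` would simply be the new `main` in its own source file, and `app_main` would live in the existing code compiled at -O0.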
