Could floating point optimizations degrade performance?

bigman · ‎08-18-2005

I'm comparing a Fortran program to a C++ program that is supposed to do the same thing.

In the meantime I found that Microsoft's C++ compiler may emit instructions that write the coprocessor's 80-bit registers to memory and read them back. This is done for strict ANSI C++ conformance, and a program compiled this way may be slower (and even larger).

Does the Intel Fortran compiler version 8.0 perform any floating point optimizations that may degrade performance?

Intel_C_Intel · ‎08-18-2005

Hello,

Regarding floating point precision and performance, there are a number of options that may be important, like:

1. /fpconstant ensures that all constants and intermediate results are double precision (SSE2 64 bit). In my experience this option does not slow down a program.

2. /fltconsistency ensures that a floating point number held in a register has the same precision as a number stored in memory. If you specify this option, the program may run a little slower, as floating point numbers may be written back to memory quite often.

If you are generating code for the SSE2 instruction set (/QxN, /QxP option), the maximum precision is 64 bit, and (in my experience) the second option (/fltconsistency) does not help much, as both memory and registers are 64 bit. However, if you are using the old x87 math coprocessor (/QxK option), the option may be required, as this processor may hold numbers in 80-bit registers to gain accuracy, so there may be a discrepancy between the precision in the CPU and in memory.
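The register-versus-memory discrepancy described above can be probed with a small test. This is only a sketch: the program and variable names are made up for illustration, and whether the two values compare equal depends on the compiler, the options, and the target instruction set, so no particular output is implied.

```fortran
program register_vs_memory
  implicit none
  integer, parameter :: dp = kind(1.0d0)   ! double precision kind
  real(dp), volatile :: x, y               ! volatile: discourage constant folding
  real(dp), volatile :: stored             ! volatile: force a 64-bit store to memory

  x = 1.0_dp
  y = 3.0_dp
  stored = x / y                           ! quotient rounded to 64 bits in memory

  ! In x87 code the right-hand side may be held in an 80-bit register,
  ! so it can differ from the 64-bit value that was stored above.
  if (stored == x / y) then
     print *, 'register and memory values agree'
  else
     print *, 'register and memory values differ'
  end if
end program register_vs_memory
```

Building it for each target of interest, once with and once without /fltconsistency, gives a rough indication of whether the option matters there.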

My recommendation is to generate code for the SSE2 instruction set and use the first option (/fpconstant) to ensure that all calculations are 64-bit double precision, as this gives very good overall precision if accuracy is required. A nice feature of this approach is that numbers do not lose precision when they are stored in memory. Also, the SSE2 instruction set may be significantly faster than the old x87 instruction set.

Lars Petter

Intel_C_Intel · ‎08-20-2005

Hello

Actually, the /fpconstant option affects only constants, not intermediate calculations (if the User's Guide is to be believed). For the latter I would recommend the new /fp:double, /fp:extended or /fp:source options.
/fltconsistency disables inlining of math library functions. This option causes performance degradation relative to the default floating-point optimization flags. On Windows systems, an alternative is the /Qprec option, which should provide better than default floating-point precision while still delivering good floating-point performance.
As for constants, I always use a kind constant to indicate which precision is used. For example,

x = 0.1_R_ ! R_ is a kind constant

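You do not have to suffix every constant in an expression:

x = 0.2_R_ + 0.01_R_ / 0.1_R_

is evaluated in the same precision as

x = 0.2 + 0.01 / 0.1_R_

because the one R_ operand promotes the whole expression. Note, however, that the unsuffixed constants 0.2 and 0.01 are still default real, so they are rounded to single precision before being promoted, unless an option such as /fpconstant changes that.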
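A quick way to see what a given compiler and option set does with suffixed and unsuffixed constants is to print both forms side by side. A minimal sketch, assuming R_ is defined with selected_real_kind(15, 307); the post does not say how R_ is actually defined, and the variable names are illustrative only.

```fortran
program kind_constants
  implicit none
  ! R_ follows the naming used above; this definition is an assumption.
  integer, parameter :: R_ = selected_real_kind(15, 307)
  real(R_) :: a, b, c, d

  a = 0.1_R_                       ! constant carries kind R_
  b = 0.1                          ! default real constant, converted on assignment
  c = 0.2_R_ + 0.01_R_ / 0.1_R_    ! every constant suffixed
  d = 0.2 + 0.01 / 0.1_R_          ! only one constant suffixed

  print *, 'a     = ', a
  print *, 'b     = ', b
  print *, 'a - b = ', a - b
  print *, 'c     = ', c
  print *, 'd     = ', d
  print *, 'c - d = ', c - d
end program kind_constants
```

Comparing the printed differences with and without /fpconstant shows what that option changes for unsuffixed constants on a particular compiler.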

There are a few other options: /Op, which conforms to the IEEE and ANSI floating-point standards and restricts optimizations, and /Qprec, which improves precision with less performance impact. In any case, /fp:fast is the default, and it allows aggressive optimizations at the expense of accuracy.

In my experience, improving precision often decreases performance, so you have to decide: /Qprec (or /fp:double, /fp:extended) versus /fp:fast. By default, performance wins. /Op is probably used for interoperability. Set /fp:precise to allow only value-safe optimizations.
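One commonly cited example of an optimization that is not value-safe is regrouping a sum, since floating-point addition is not associative. The sketch below writes both groupings out by hand (all names are illustrative); it does not show whether a particular compiler actually regroups under /fp:fast, only why such a regrouping could change a result.

```fortran
program reassociation
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  real(dp), volatile :: a, b, c      ! volatile: keep the values out of constant folding
  real(dp) :: left, right

  a =  1.0e16_dp
  b = -1.0e16_dp
  c =  1.0_dp

  left  = (a + b) + c    ! as written: cancel first, then add 1
  right = a + (b + c)    ! regrouped: the 1 can be lost when added to -1e16

  print *, '(a + b) + c = ', left
  print *, 'a + (b + c) = ', right
end program reassociation
```

Under strict 64-bit IEEE arithmetic with round-to-nearest, the two values are 1 and 0; with 80-bit intermediates or different option settings they may agree.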

I recommend reading the brochure Optimizing Applications with the Intel C++ and Fortran Compilers to find out more about this; I used it here too. Unfortunately, something is wrong with www.intel.com at the moment, so I can't give you a link. You can find it in the Technical Information section for IFC. There is also a Quick-Reference Guide to Optimization with Intel Compilers near there.