Compiler switch -- significant difference in execution time?

lklawrie1 · ‎06-09-2005

We noticed that one of our files/features was taking significantly longer in the most recent release. I decided to try to track this down today.

It appears that switching between /fpe:0 (underflow gives 0.0; abort on other IEEE exceptions) and /fpe:3 (default -- floating-point exceptions are not trapped, produce NaN) may be the culprit.

For the test file, the run time went from 3+ minutes to under 1 minute (timed on the same system).

Have extensive illustrations been done of the impact of various compiler switches on computation time? Are they published somewhere?

Linda

Steven_L_Intel1 · ‎06-09-2005

The documentation for the various switches does say, in most cases, where choosing certain options affects performance. Yes, /fpe:0 will hurt performance.

lklawrie1 · ‎06-09-2005

Ah yes, one of my least favorite features of the Intel compiler... you can't go to debug mode and expect the same "problem" to arise (in this case, a hard-to-track-down NaN). Went to /fpe:0 -- all okay, one minor change (out of some 200 test files). Took off "fltconsistency" and had this pop up. Mostly I wanted to see the speedier results over my full test suite. I have now spent most of the afternoon trying to track down the problem in release mode by writing out successive debug statements (which sometimes, with no other change to the code, allow the code to run successfully -- no NaN).

Any pointers on where to look?

Linda

TimP · ‎06-09-2005

Intel compilers are unexpectedly free about re-association when you remove options like /fltconsistency. First, however, if you had any reason for trying /fltconsistency, you should try /Qprec, which takes some of the precautions implied by /fltconsistency, without much effect on performance.

I was just bitten by the compilation of d/(a + (b-c)) as d/((a+b)-c), which is fixed by /fltconsistency. This may easily be a problem when using SSE code, in the case where b and c are nearly equal, and much larger than a. I have started a personal campaign (with problem reports) for /Qprec and the like to be as careful about parentheses as /fltconsistency is.
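The effect of that regrouping can be reproduced in any language that uses IEEE 754 doubles. Below is a minimal sketch in Python rather than Fortran, with made-up values for a, b, c and d chosen so that b and c are nearly equal and much larger than a. It only illustrates the arithmetic; it says nothing about which code ifort actually generates.

```python
# Illustration only: Python floats are IEEE 754 doubles, so the two
# groupings can be compared directly. All values are assumed for the demo.
a = 1.0
b = 2.0 ** 60        # much larger than a
c = b - 256.0        # nearly equal to b, and exactly representable
d = 1.0

# As written: b - c is computed exactly (256.0), then a is added.
as_written = d / (a + (b - c))

# Re-associated: a is lost when added to b, so the denominator is 256.0.
reassociated = d / ((a + b) - c)

print("d/(a + (b-c)) =", as_written)
print("d/((a+b) - c) =", reassociated)
```

The two denominators differ (257.0 versus 256.0) because a is smaller than half the spacing between adjacent doubles near b, so a + b rounds back to b.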

lklawrie1 · ‎06-11-2005

I was done in by a statement:

(numerator)/(x-y+smallnumber). The programmer had set smallnumber to 1.E-30, but x and y were equal. So, in that mode (/fpe:3), it kept dividing by zero. Unfortunately, it didn't crash in debug mode, so it was difficult to track down.

Would the compiler switch /Qfpstkchk help these situations by crashing closer to the site of the bad number occurrence? Does it significantly impact runtime speed?

Linda

TimP · ‎06-11-2005

If you turned on optimization with /O in your debug build, you should have seen the same behavior as without debug.
In your case, the programmer is at fault for not using parentheses. Unfortunately, ifort doesn't have any option to require their observance, other than /fltconsistency and its synonyms. Left to right evaluation is not required by Fortran.

lklawrie1 · ‎06-11-2005

I don't think parentheses are going to make a difference here, unless the compiler/run time is going to evaluate (x-y) as zero and then add smallnumber? So, you would parenthesize the denominator as ((x-y)+smallnumber)?

Does /Qfpstkchk help "stop" errors near where they occur without undue extra time added to run time? /fpe:3 more or less did that in CVF, without the extra run time addition. Obviously not in Intel compiler(s).

TimP · ‎06-12-2005

Yes, that's the programmer's intent. Take the difference, then add the small number, in case the difference is zero or extremely small. Parentheses should be used to express that, and you need a compiler option to perform the operations as specified. You found out that you got the required order with /fltconsistency.
As far as I know, checking for stack errors protects only against errors in function call argument declarations.
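To make the ordering issue concrete, here is a minimal sketch in Python (IEEE 754 doubles), not Fortran. The function names are hypothetical, and x = y = 1.0 is an assumed test value. Python itself always evaluates left to right, so the reordered form is written out by hand to mimic one grouping an optimizer could choose.

```python
def denom_as_intended(x, y, small):
    # Take the difference first, then add the guard term.
    return (x - y) + small


def denom_reassociated(x, y, small):
    # One possible regrouping: small is absorbed into x before y is subtracted.
    return (x + small) - y


x = y = 1.0          # assumed values: the two operands are equal
small = 1.0e-30      # guard value mentioned in the thread

intended = denom_as_intended(x, y, small)      # 1e-30, safe to divide by
reordered = denom_reassociated(x, y, small)    # 0.0, division would fail

print("(x - y) + small =", intended)
print("(x + small) - y =", reordered)
```

With the difference taken first, the guard survives and the denominator is 1e-30. With the regrouped form, 1.0 + 1e-30 rounds to 1.0 and the denominator collapses to zero, which matches the divide-by-zero described earlier in the thread.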