Different results on Intel and AMD architectures

Arnoux__Felix · 02-02-2011

Hello,

I'm currently working on a project that requires very stable results on different types of platforms (AMD or Intel).

I'm using IFC 11.0 (Intel Fortran Compiler Professional for applications running on IA-32, Version 11.0 Build 20090131 Package ID: l_cprof_p_11.0.081).

I was wondering which compiler options I should use, considering that the program carries out a lot of floating-point operations?

So far, I was using the following set of options:

-O2 -zero -save -real-size 64 -mieee-fp -fpconstant -pad -convert big_endian -fpe0 -static-intel -stand f95

Thanks!

Ron_Green · 02-02-2011

With your older 11.0 compiler, I would use -O0 -nolib-inline. It's unrealistic to expect -O2 to produce the same results on different processor architectures. There are numerous threads in this forum discussing this topic.

12.0 has a new set of libraries that would help:

-fimf-precision=
  Default is off (compiler chooses)
  Typically high for scalar code, medium for vector code
  low typically halves the number of mantissa bits
  high ~0.55 ulp; medium < 4 ulp (typically 2)

-fimf-arch-consistency=
  Will produce consistent results on all microarchitectures or processors within the same architecture
  Run-time performance may decrease
  Default is false (even with -fp-model precise!)

Windows form:
  /Qimf-precision:medium
  /Qimf-arch-consistency:true etc
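
For readers unfamiliar with the "ulp" figures quoted above: an ulp (unit in the last place) is the spacing between adjacent floating-point numbers near a given value. The sketch below is plain Python (3.9 or later), not ifort output; the helper name `ulp_error` is made up for illustration. It only shows how an error of "1 ulp" or "4 ulp" is measured for a double-precision value.

```python
import math

def ulp_error(approx: float, exact: float) -> float:
    """Return |approx - exact| expressed in ulps of the exact value."""
    return abs(approx - exact) / math.ulp(exact)

x = 1.0
next_double = math.nextafter(x, math.inf)   # the closest double above 1.0
four_ulps_off = x + 4 * math.ulp(x)         # a result that is 4 ulp too large

print(math.ulp(x))                  # spacing of doubles near 1.0 (2**-52)
print(ulp_error(next_double, x))    # 1.0
print(ulp_error(four_ulps_off, x))  # 4.0
```

A math function with a "medium" bound of under 4 ulp may therefore land a few representable numbers away from the correctly rounded result, which is enough for two different library code paths to disagree in the last digits.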

timintel · 02-02-2011

ifort -help doesn't explain the arguments for imf-arch-consistency, but the option is important for avoiding problems with the math library on AMD. The html documentation agrees with Martyn's explanation. I don't understand why anyone would take the trouble to use this option if they didn't want the true selection.
High precision is not necessarily available for all vector math functions (which you won't invoke anyway with the old compiler and no SSE optimization), but the consistency option will avoid problems associated with switching between CPU types, with little effect on performance.
If you are willing to select SSE2 or SSE3, I would suggest also setting -prec-div -prec-sqrt -assume protect_parens (all of those included in -fp-model source and -fp-model precise) as the instructions involved in no-prec-div and no-prec-sqrt are implemented with differing accuracy on various CPUs (besides not working over the full numerical range). I believe those options have been present since ifort 10.1.
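
To illustrate why options such as -assume protect_parens and -prec-div matter, here is a minimal sketch in plain Python doubles. It does not reproduce what ifort generates; it only assumes two kinds of transformation an optimizer may apply when those options are off: regrouping a sum, and replacing a division by a multiplication with a rounded reciprocal. The values are chosen purely for illustration.

```python
# Reassociation: identical terms, two groupings.
a, b, c = 1.0e16, -1.0e16, 1.0
as_written = (a + b) + c       # cancellation first, then add 1.0
reassociated = a + (b + c)     # 1.0 is absorbed into -1.0e16 before cancelling

# Division versus multiplication by a rounded reciprocal.
x, y = 49.0, 49.0
true_quotient = x / y
via_reciprocal = x * (1.0 / y)

print(as_written, reassociated)          # 1.0 0.0
print(true_quotient == via_reciprocal)   # False
```

Both pairs are algebraically equal, so either form is a legal choice for a compiler that is not told to respect the source. If different CPUs or optimization levels lead to different choices, the printed results differ.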
If you are interested in -fpe0, you should review other posts about it on the forum. I don't see it as contributing to the goals you have expressed.
Likewise, while -zero and -save could paper over some deficiencies of legacy programming practice, they aren't entirely reliable, and they will prevent you from using multi-threaded programming.
-mieee-fp isn't implemented in ifort, as far as I know; the options we have suggested should have equivalent effect.
A program which depends on fpconstant is not in compliance with the Fortran standard; ideally you should fix that too. It shouldn't be needed anyway, with the real-size option.
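
The fpconstant point can be sketched numerically. The snippet below is Python rather than Fortran, and the helper `round_to_single` is a made-up name. It assumes the usual situation the option addresses: a literal such as 0.1 is first rounded to single precision and only then stored in a double-precision variable, as opposed to a literal written in double precision from the start (0.1d0 in Fortran).

```python
import struct

def round_to_single(value: float) -> float:
    """Round a double to the nearest IEEE single, then widen it back."""
    return struct.unpack("f", struct.pack("f", value))[0]

single_then_widened = round_to_single(0.1)  # like a single-precision 0.1 stored in a double
double_literal = 0.1                        # like 0.1d0

print(single_then_widened)                   # 0.10000000149011612
print(double_literal)                        # 0.1
print(single_then_widened - double_literal)  # about 1.5e-9
```

Writing the constants with an explicit double-precision kind in the source removes the dependence on the option altogether.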