AMD vs Intel numerics re-visited

Nick2 · 08-22-2008

Hello...

I cannot get my managers to let go of this one. Last time I tested, my code compiled with IVF was giving me slightly different answers on AMD vs Intel CPUs. We did not have this issue with CVF. Has Intel done anything recently to remedy this problem? Here's a quick guide to my compiler settings. Thanks!

Nick

32-Bit

Fortran -> Optimizations -> Disable Optimizations (Release Configuration only, mainstream versions of our code only). This option is due to the 387 FPU, to be re-visited when compiling native 64-bit code.

Fortran -> Code Generation: Enable Recursive

Fortran -> Data: Initialize to Zero Yes

Fortran -> Floating point: FPE0 (crash on NaN)

Fortran -> Runtime: Generate Traceback, Check array and string bounds (this enables the DOS version to return the line and subroutine where the crash occurred).

Fortran -> Libraries: Ensure that Multithreaded is selected, not Multithreaded DLL; or, that Multithreaded Debug is selected, not Multithreaded Debug DLL.

Linker -> Enable Incremental Linking: No. (See Traceback above)

TimP · 08-22-2008

If you are interested in normal SSE2 results, the debug x87 results will not be relevant, unless they expose an outright error in your SSE2 results.

In order to avoid SSE single precision instructions which differ in numerical results among various families of AMD CPUs, you should set /Qprec-div /Qprec-sqrt. These options are included in /fp:precise or /fp:source. As the Intel CPU families introduced over the last year have excellent performance for IEEE accurate divide and sqrt, there is more reason now to use these options.
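As a rough illustration of why the divide options matter (a sketch in Python double precision, not the actual SSE single precision instructions): when a division x/y is replaced by x times a separately rounded reciprocal of y, the last bit of the result can change. The hardware reciprocal approximations are cruder than this and are not specified bit-for-bit, which is presumably why results can differ between CPU families.

```python
# Illustration only: double precision in Python, not the SSE single
# precision reciprocal instructions themselves.
x, y = 3.0, 10.0

exact_divide = x / y            # one correctly rounded IEEE division
via_reciprocal = x * (1.0 / y)  # round the reciprocal, then round the product

print(exact_divide)                     # 0.3
print(via_reciprocal)                   # 0.30000000000000004
print(exact_divide == via_reciprocal)   # False
```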

Those /fp options also prevent auto-vectorization optimizations where results vary slightly with data alignment, and those where math library functions differ slightly between Intel and AMD.
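To see why a vectorized sum can differ from a scalar one, here is a minimal sketch in Python. The function names and the two-lane layout are made up for illustration; a real vectorized loop groups elements according to vector width and how the loop is peeled for alignment, but the effect is the same: the additions happen in a different order, and floating-point addition is not associative.

```python
def sequential_sum(values):
    """Add the values strictly left to right, as a scalar loop would."""
    total = 0.0
    for v in values:
        total += v
    return total

def lane_sum(values, lanes):
    """Accumulate into `lanes` partial sums, then combine them,
    mimicking the grouping of a vectorized reduction (hypothetical layout)."""
    partial = [0.0] * lanes
    for i, v in enumerate(values):
        partial[i % lanes] += v
    total = 0.0
    for p in partial:
        total += p
    return total

data = [1.0e16, 1.0, -1.0e16, 1.0]

print(sequential_sum(data))   # 1.0 (the first 1.0 is lost against 1.0e16)
print(lane_sum(data, 2))      # 2.0 (the large values cancel within one lane)
```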

You must also take care to use the same /Qftz setting; those /fp options set /Qftz-, which you can undo by following them with /Qftz. You may want to test your application both with /Qftz and with /Qftz- (for compilation of the main program).
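For anyone unsure what the /Qftz setting changes, a small sketch: Python itself uses gradual underflow, so the flush-to-zero behaviour is simulated here with a hypothetical helper, `flush_to_zero`, that only mimics what the hardware mode does to a result.

```python
import math
import sys

SMALLEST_NORMAL = sys.float_info.min  # smallest normalized double

def flush_to_zero(x):
    """Simulate FTZ: replace a subnormal result with a signed zero.
    Hypothetical helper for illustration, not a compiler or library API."""
    if x != 0.0 and abs(x) < SMALLEST_NORMAL:
        return math.copysign(0.0, x)
    return x

a = 1.5 * SMALLEST_NORMAL
b = SMALLEST_NORMAL

gradual = a - b                   # subnormal, nonzero (like /Qftz-)
flushed = flush_to_zero(a - b)    # zero (like /Qftz)

print(a != b)           # True
print(gradual != 0.0)   # True: a != b implies a - b != 0
print(flushed == 0.0)   # True: with flushing, a != b yet a - b == 0
```

Code that divides by such a difference, or tests it against zero, can take a different path depending on the setting, so both machines need the same one.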

I suppose you must set some of these options under additional settings.

If your source code doesn't initialize data correctly, the Initialize to Zero can't be depended upon to avoid problems, including possible differences between Intel and AMD, as well as differences between debug and optimized mode. You would have had the same problem with CVF if you set threading compatible options.

Another step which you would require to maintain a correct comparison between CVF and ifort would be to set the float consistency option in CVF and /assume:protect_parens in ifort.
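A small sketch of what honoring parentheses protects against, again in Python rather than Fortran and with made-up values: if a compiler is free to regroup an expression, the two forms below are algebraically equal but round differently.

```python
x, big = 1.0, 1.0e16

as_written = (x + big) - big     # x is absorbed into big, then big cancels
reassociated = x + (big - big)   # big cancels first, x survives

print(as_written)     # 0.0
print(reassociated)   # 1.0
```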

Steven_L_Intel1 · 08-22-2008

The reason you did not see this with CVF is that CVF had no support for SSE/SSE2 floating point. It always used the X87 instructions.