Numerical differences between the same code compiled with and without /Qopenmp

griflet · 01-27-2011

Hi,

I'm parallelizing MOHID, a scientific CFD code, with OpenMP in Visual Studio 2008, using Intel Fortran 11.1.

However, if I comment out every single !$OMP directive (by replacing !$OMP with !!$OMP across the whole solution), then compile and run, the results of the double-precision code compiled with /Qopenmp differ from those of the code compiled without it. Since every !$OMP directive is commented out, I'm certain there is no race condition. Furthermore, the difference is exactly the same for every value of OMP_NUM_THREADS from 1 to 8.

The difference first shows up in the last decimal digit, some time into the main time iteration, and slowly creeps up to more significant digits as the code iterates.

Is this to be expected? With the !$OMP directives commented out, I expected to get exactly the same results (down to the last decimal digit) with and without /Qopenmp.

I'm trying to compile and run the same test case on Linux with ifort 12.0.0 to see whether I get the same behaviour...

Does anyone have any ideas?

Thank you,

Guillaume

TimP · 01-27-2011

Numerical differences (even from run to run) are expected with reduction operations. Major applications provide a user option to skip reductions (at a cost in performance).
Certain optimizations which may have numerical effects, such as loop nest optimizations performed by ifort -O3, are inhibited by -openmp.
You would of course set some compatibility options such as -assume protect_parens -prec-div -prec-sqrt (and, with 12.0, -imf-arch-consistency) if you are concerned about these numerical differences.
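
The point about reductions comes down to floating-point addition not being associative: splitting a sum into per-thread partial sums changes the order of the additions and hence the rounding. Below is a minimal Python sketch of that effect, not the OpenMP runtime's actual reduction scheme. The helper names `naive_sum` and `chunked_sum` are made up for the illustration, and the input values are contrived so that the discrepancy is large rather than in the last digit.

```python
import math

def naive_sum(values):
    """Left-to-right accumulation, like a simple serial loop."""
    total = 0.0
    for v in values:
        total += v
    return total

def chunked_sum(values, n_chunks):
    """One partial sum per contiguous chunk, then combine the partials in order.

    This only mimics how a reduction might split the work across threads;
    it is an assumption for illustration, not what any runtime guarantees.
    """
    size = math.ceil(len(values) / n_chunks)
    partials = [naive_sum(values[i:i + size]) for i in range(0, len(values), size)]
    return naive_sum(partials)

# Contrived data: 1.0 is below the rounding granularity of 1e16 in double precision.
values = [1e16, 1.0, -1e16, 1.0]
for n_chunks in (1, 2, 4):
    print(n_chunks, chunked_sum(values, n_chunks))
```

With these values the one-chunk and four-chunk sums come out as 1.0 and the two-chunk sum as 0.0, while the exact answer is 2.0. In a real code the discrepancy is usually confined to the last digits, but the mechanism is the same.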

jimdempseyatthecove · 01-27-2011

You may be experiencing a compiler optimization issue where:

with !!$OMP ... (as a comment), optimizations cross the comment

with !$OMP ... (as a parallel region), optimizations do NOT cross the region.

The code difference is likely introducing a rounding difference.

If you want consistency, I suggest always compiling with OpenMP enabled and setting the number of OpenMP threads from an external variable (one not visible to the compiler).

Then configure for 1 thread for your serial baseline.

This will introduce a small amount of overhead in your serial baseline, which may be the price you pay for consistent results.

Jim Dempsey
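
To see how a rounding difference of this size can creep from the last digit into more significant ones over a time iteration, here is a small Python sketch. It uses the logistic map purely as a stand-in for an iterative solver; nothing here is taken from MOHID, and the function name `divergence`, the parameter `r=3.9`, and the perturbation of 1e-15 are all assumptions chosen for the illustration.

```python
def divergence(x0, steps, r=3.9, perturbation=1e-15):
    """Iterate the same update from two starts that differ near the last digit.

    Returns the absolute difference between the two trajectories at each step.
    The logistic map is only a toy model of an iterated computation.
    """
    a = x0
    b = x0 + perturbation   # mimics a rounding-level difference between two builds
    diffs = [abs(a - b)]
    for _ in range(steps):
        a = r * a * (1.0 - a)
        b = r * b * (1.0 - b)
        diffs.append(abs(a - b))
    return diffs

diffs = divergence(0.3, steps=200)
for step in (0, 25, 50, 75, 100):
    print(step, diffs[step])
```

How fast the difference grows depends entirely on the problem being iterated. A well-damped solver may keep it near the last digit, whereas this toy map amplifies it quickly.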

Wendy_Doerner__Intel · 01-27-2011

Guillaume,

We recommend the following switches on Windows when consistency of floating-point results is important:

/fp:precise /fp:source

More details:

http://software.intel.com/en-us/articles/consistency-of-floating-point-results-using-the-intel-compiler/

------

Wendy
