Intel® Fortran Compiler
Build applications that can scale for the future with optimized code designed for Intel® Xeon® and compatible processors.

Fortran intrinsics - compiler optimization?

Heinzeller__Dom
Beginner

I am having an issue with bit-for-bit identical results from the Fortran intrinsic gamma function.

In a source file that is identical in two different versions of the model, calls are made to the Fortran intrinsic gamma function. In both cases, the source file is compiled without optimization (-O0). The input to the gamma function is bit-for-bit identical (checked by calculating an Adler checksum of the value to make sure the representation in memory is really the same). The output is slightly different:

X = 0.80000000000000000000000000000000000E+01

CHKSUM(X) = 2539951972

OUTPUT ONE: 0.50400000000000000000000000000000000E+04

OUTPUT TWO: 0.50399999999999927240423858165740967E+04

The compiler flags, apart from -O0, specify -fp-model source. Should I use -fp-model precise or consistent? I would like to obtain the value from OUTPUT ONE, since this is the value in the reference codebase.

Is there a possibility to find out why the results are different? Vectorization reports? Assembler output?

Also, in case two different implementations of the gamma function are used, is there a possibility to control which version?

Any insight is highly welcome.

Thanks,

Dom

Heinzeller__Dom
Beginner

Update: using -fp-model precise does not solve the issue, but -fp-model consistent does.

I am overriding a few (default) optimization flags with -O0. Is there any way to check which optimizations are actually used in the end? Adding "-qopt-report=5" doesn't report anything (because of -O0), but I am not sure about other flags such as -xCORE-AVX2. Do they get disabled if -O0 is used?

Juergen_R_R
Valued Contributor I

What is the kind of the real variables you are using? How did you write the output: formatted, unformatted, etc.?

Heinzeller__Dom
Beginner

Thanks Juergen.

kind: real*8

The output is formatted. Because I know that formatting truncates the precision, I calculate the checksum of the argument X using the Adler-32 hash algorithm. I can see that the output is really different, not just its formatted representation, because the results of my code change afterwards.
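
A check of this kind can be sketched in a few lines of Python, used here only as a compiler-independent illustration. The sketch assumes the checksum is taken over the 8 raw little-endian bytes of the double; the byte order and the helper name `adler32_of_double` are assumptions, so the number it prints need not match the CHKSUM(X) value quoted in the first post.

```python
import struct
import zlib

def adler32_of_double(x: float) -> int:
    """Adler-32 over the raw little-endian binary64 bytes of x (assumed layout)."""
    return zlib.adler32(struct.pack("<d", x))

x = 8.0
print(adler32_of_double(x))

# Two values that print alike at low precision still give different
# checksums when their stored bits differ.
a = 5040.0
b = float("0.50399999999999927240423858165740967E+04")
print(f"{a:.6f} {b:.6f}")
print(adler32_of_double(a) != adler32_of_double(b))
```

Hashing the stored bytes rather than the printed text is what makes the comparison independent of the output format.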

 

What I have found so far is that the differences go away when I add -fimf-arch-consistency=true (or -fp-model consistent, which includes -fimf-arch-consistency=true and a few more options). But this also changes the results of the other calculations in the file. Ideally, I want only the gamma function to produce the correct result in both codes.

mecej4
Honored Contributor III

Here are, I hope, a couple of stronger reasons to prefer OUTPUT ONE:

Since Γ(8) = 7! = 5040, the result is an exact integer, and that integer has an exact representation in the IEEE 32-bit and 64-bit floating-point formats. The results do agree to 15 decimal digits, but in this case we may reasonably expect 16 digits instead.

It may be useful for you to output the result in hexadecimal format, e.g., 1.0d0 is represented as 3FF0 0000 0000 0000. Doing so will let you know whether the internal representation is itself inaccurate or if the formatting in decimal representation introduced additional deviations.
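
The same hexadecimal check can be done outside the compiler. Below is a minimal sketch in Python (not Fortran, chosen only because it needs no compiler). It assumes the decimal strings printed in the first post carry enough digits to identify the stored doubles; the helper names `bits_hex` and `ulp_distance` are made up for illustration.

```python
import struct

def bits_hex(x: float) -> str:
    """Return the IEEE 754 binary64 bit pattern of x as 16 hex digits."""
    return struct.pack(">d", x).hex()

def ulp_distance(a: float, b: float) -> int:
    """How many representable steps apart two positive finite doubles are."""
    ia = struct.unpack(">q", struct.pack(">d", a))[0]
    ib = struct.unpack(">q", struct.pack(">d", b))[0]
    return abs(ia - ib)

# The two outputs exactly as printed in the first post.
one = float("0.50400000000000000000000000000000000E+04")
two = float("0.50399999999999927240423858165740967E+04")

print(bits_hex(1.0))  # 3ff0000000000000, the pattern quoted above
print(bits_hex(one))
print(bits_hex(two))
print(ulp_distance(one, two))
```

If the two bit patterns differ, the discrepancy is in the stored value itself and not an artifact of the decimal formatting.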

I doubt that any change should be seen in the output result as a consequence of your selecting different compiler options that control FPU operations. The calculation of the gamma function is performed in the Fortran library that contains that intrinsic function. Unless your /fp options cause a different library to be used or change the control word (or MXCSR), there should be no change in the value of Γ(8) that is returned by the intrinsic function.

Heinzeller__Dom
Beginner

I can confirm that (a) the input to the gamma function is bit-for-bit identical and that (b) adding the flag -fimf-arch-consistency=true removes the bit-for-bit differences. There are articles out there explaining that different versions of mathematical functions can get used, depending on the math library and on compiler flags like the one above. In my case, the bit-for-bit differences also disappear if I use static linking instead of dynamic linking to get my new/alternative code into the model.

Steve_Lionel
Honored Contributor III
Heinzeller__Dom
Beginner

Thanks Steve. I knew about some other references from Intel regarding bit-for-bit reproducibility, but not about your slides. I don't know if there is a way to flag this thread as "solved", but from my side it is, because I can use "-fimf-arch-consistency=true" to get identical results.

Steve_Lionel
Honored Contributor III

There is no "solved" flag, but you can mark one of the replies as "best answer".
