
Numeric difference by libimf.a with gcc compiler

Dahe_C_
Beginner

I am using gcc 5.3 with libimf.a (I have icc but cannot use it for other reasons) and find that the exp() function returns different values for certain inputs when the same build is run on Sandy Bridge and on Haswell, both running CentOS 6. The library ships with Parallel Studio XE, both 2016 and 2017. For example,

#include <math.h>
#include <stdio.h>

int
main()
{
    const double v = 3.3990833195703927;
    printf("exp(v) = %.20e\n", exp(v));
}

produces

2.99366451289085517828e+01

2.99366451289085482301e+01

on the two architectures. The code is compiled with "gcc main.cc .../libimf.a .../libirc.a". Is there a way to use the library with gcc so that the results are architecture-independent?

Dahe

SergeyKostrov
Valued Contributor II
>> 2.99366451289085517828e+01
>> 2.99366451289085482301e+01

The results are identical up to the 14th digit after the point; for double-precision floating-point data the 15th digit still needs to be taken into account. It is not clear which software or hardware component is responsible for that very small difference. I would recommend verifying the binary value stored in the variable that holds the result of exp:

...
const double v = 3.3990833195703927;
double r = exp(v);
printf( "exp(v) = %.20e\n", r );
...

on both hardware platforms. It is possible that the CRT library is responsible for the displayed difference, that is, different implementations of the printf function.
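A minimal sketch expanding on that suggestion (assuming C99; build as in the original command, e.g. gcc main.c .../libimf.a .../libirc.a): copy the 64-bit IEEE-754 pattern of the result into an integer, so the exact bits can be compared between the two machines independently of printf's decimal conversion.

#include <inttypes.h>
#include <math.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
    const double v = 3.3990833195703927;
    double r = exp(v);

    /* Copy the raw IEEE-754 bit pattern into an integer. Comparing the
       hex strings across machines bypasses printf's decimal conversion. */
    uint64_t bits;
    memcpy(&bits, &r, sizeof bits);

    printf("exp(v) = %.20e (bits 0x%016" PRIX64 ")\n", r, bits);
    return 0;
}

If the hex values differ, the math library produced different bits; if they match, the difference is in the decimal conversion.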
TimP
Honored Contributor III

But big-CPU math libraries aren't topical on this forum.

TimP
Honored Contributor III

You appear to be asking about the special math function entry points associated with the icc imf arch-consistency option (-fimf-arch-consistency). As you are asking about big CPUs, this forum is not among the most suitable.

jimdempseyatthecove
Honored Contributor III

Sergey,

I think you intended to print the bit pattern in hexadecimal format as well. Note that a double must not be passed straight to a %llX conversion (that is undefined behavior); reinterpret it first, with <stdint.h>, <string.h>, and <inttypes.h> included:

uint64_t bits; memcpy( &bits, &r, sizeof bits );
printf( "exp(v) = %.20e, %016" PRIX64 "\n", r, bits ); // both double and hex (64 bits)

I also agree with your statement that the difference may be attributable to the conversion of the double to the printed text string. Should the hex outputs show the same number, this would indicate that the issue involves the double-to-string conversion.

Jim Dempsey

McCalpinJohn
Honored Contributor III

Using the utility at http://www.binaryconvert.com/convert_double.html, it is clear that these two values differ by one least significant bit.

This is not at all surprising, since Sandy Bridge must use separate Add and Multiply operations, while Haswell can use the more accurate FMA instruction.

The GNU C library does not aim to provide exactly the same results on all platforms. The goals of the project and error estimates for many of the available functions are at https://www.gnu.org/software/libc/manual/html_node/Errors-in-Math-Functions.html. On that page, the first goal of the project is to "provide a result that is within a few ULP of the mathematically correct value of the function". A difference of one ULP across platforms is consistent with this definition.

If the goal is to get exactly the same bits on all platforms, it may be possible to trick the library into using the Sandy Bridge code path on the Haswell platform.  This will eliminate the FMA instructions and might provide identical results.   I don't know how to do this with the GNU compilers and libraries, but there is probably enough information at https://lwn.net/Articles/691932/ to get started on understanding how this works and how it might be controlled.
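As a cross-check on the one-ULP claim that does not rely on an external converter (a sketch added for illustration): let the compiler round the two printed decimal strings back to doubles and compare their bit patterns. For finite doubles of the same sign, consecutive bit patterns are consecutive representable values.

#include <inttypes.h>
#include <stdio.h>
#include <string.h>

/* Return the raw IEEE-754 bit pattern of a double. */
static uint64_t bits_of(double d)
{
    uint64_t u;
    memcpy(&u, &d, sizeof u);
    return u;
}

int main(void)
{
    /* The two values printed on Sandy Bridge and on Haswell. */
    const double a = 2.99366451289085517828e+01;
    const double b = 2.99366451289085482301e+01;

    printf("a = 0x%016" PRIX64 "\n", bits_of(a));
    printf("b = 0x%016" PRIX64 "\n", bits_of(b));
    printf("ULP distance = %" PRId64 "\n", (int64_t)(bits_of(a) - bits_of(b)));
    return 0;
}

This should print a ULP distance of 1, matching the converter result.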

SergeyKostrov
Valued Contributor II
>> ...the difference may be attributable to the conversion of the double to the text string printed. Should the Hex formats show the same
>> number, then this would indicate that the issue involves the conversion of the double to the text string...

Correct. I have known about that problem since running into it with the Borland C++ compiler: the binary values were correct for all the C++ compilers supported on a project, but the value displayed by the Borland C++ compiler differed from all the rest.
SergeyKostrov
Valued Contributor II
>> ...The GNU C library does not aim to provide exactly the same results on all platforms...

I did not have significant accuracy problems with these GCC-based versions of the MinGW C++ compiler on Intel Pentium II, Pentium 4, Atom, and Ivy Bridge architectures, under Windows 95, Windows 2000, Windows XP, and Windows 7:

| 02 | MinGW v3.4.2
| 03 | MinGW v4.8.1
| 04 | MinGW v4.9.0
| 05 | MinGW v4.9.2
| 06 | MinGW v5.1.0
| 07 | MinGW v6.1.0

Tests for Linux Ubuntu (on Ivy Bridge) and Red Hat EL (on a KNL) will be done soon.

PS: Sorry if we're off the primary topic of the forum...
TimP
Honored Contributor III

gcc -mno-fma -march=native is a frequently used combination of options to avoid introducing FMA along with AVX2, but it will not influence the math function code inside the library. That is one of the features of the Intel arch-consistency math function entry points. There is, of course, no documentation or assured support for calling those entry points by name, and there also appears to be no documentation about when the AVX2 math functions may be the more accurate ones.
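For what it is worth, the rounding effect itself is easy to demonstrate in plain C (a sketch added for illustration, using the C99 fma() from libm): the fused operation rounds once, while the separate multiply and add round twice, and for suitably chosen inputs the results differ. Compile with -ffp-contract=off (or -std=c99) so that gcc does not itself contract a*a + c into an FMA, and link with -lm.

#include <math.h>
#include <stdio.h>

int main(void)
{
    /* (1 + 2^-27)^2 = 1 + 2^-26 + 2^-54 needs 55 significand bits, so
       rounding the product before the add discards the 2^-54 term. */
    const double a = 1.0 + ldexp(1.0, -27);
    const double c = -1.0;

    double separate = a * a + c;    /* two roundings: loses the 2^-54 term */
    double fused    = fma(a, a, c); /* one rounding: keeps it */

    printf("a*a + c    = %.20e\n", separate);
    printf("fma(a,a,c) = %.20e\n", fused);
    return 0;
}

The two outputs differ, showing how a single fused rounding changes the bits of a result even though both answers are "correct" to within a few ULP.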

SergeyKostrov
Valued Contributor II
>> ...If the goal is to get exactly the same bits on all platforms...

It is very important for mission-critical applications in healthcare, X-ray imaging, MRI imaging, etc.
McCalpinJohn
Honored Contributor III

The AVX2 math functions that are able to use FMA operations should probably be assumed to be more accurate than the corresponding implementations that require rounding between the multiply and add operations.   (There will always be point-wise counter-examples, but unless the implementation is bad, any reasonable norm on a distribution of results should show the FMA-based results to be more accurate on "average".)

Unfortunately, emulating the increased accuracy of the Fused Multiply-Add is quite expensive on machines that only support separate operations, so if you want bit-wise reproducibility, you almost certainly need to try to reproduce the non-FMA results. 

Bitwise reproducibility will also require identical ordering of operations, which for some algorithms means that all results must be computed with the same vector length on all platforms.  A vector length of 1 is a convenient value, but a vector length of 2 doubles is probably supported by all of the platforms of interest. 
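The ordering point is easy to illustrate (a sketch added for illustration): a sum accumulated in scalar order versus two interleaved partial sums, as a 2-wide vector unit would compute it, can already give different bits.

#include <stdio.h>

int main(void)
{
    /* Values chosen so that accumulation order affects rounding:
       1e-16 is below half an ULP of 1.0, but 2e-16 is above it. */
    const double x[4] = { 1.0, 1e-16, 1e-16, 1e-16 };

    /* Scalar (vector length 1) order: ((x0 + x1) + x2) + x3 */
    double s1 = ((x[0] + x[1]) + x[2]) + x[3];

    /* 2-wide vector order: (x0 + x2) + (x1 + x3) */
    double s2 = (x[0] + x[2]) + (x[1] + x[3]);

    printf("scalar sum = %.20e\n", s1);
    printf("2-wide sum = %.20e\n", s2);
    return 0;
}

Here s1 stays at 1.0 (each 1e-16 addend is rounded away), while s2 rounds up to the next representable double, so even this four-element sum is not reproducible across vector lengths.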
