MKL vs Microsoft exp() function - Page 2

David_C_12 · ‎05-24-2013

I have a client that is migrating a large C++ software base from 32- to 64-bit code in MS Visual Studio. One of the problems they are having is that the 32- and 64-bit versions of the C library exp() function produce results that differ by 1ulp for some operands, and this is causing regression tests to fail. One potential solution I am considering is to use Intel MKL instead of the Microsoft library. So I have a few questions:

1. Do the 32-bit and 64-bit builds of MKL produce identical results for exp() and other transcendental functions, for all operands, assuming that SSE2 is enabled for our 32-bit code?

2. Although the client has mostly Intel hardware, I believe they have a few AMD Opteron-based server farms. Does MKL work on Opterons? If so, are there any performance penalties if MKL is used in place of the Microsoft library?

3. Is there any way of getting the Microsoft .NET framework to use MKL? I assume it may have the same 32/64-bit differences, although I haven't tested that yet.

4. What other benefits might my client gain by switching to MKL?

Thanks in advance - dc42

David_C_12 · ‎05-26-2013

Hi Sergey,

Long double is the same as double on the Microsoft platform (i.e. 64 bits), so is not of interest to us. Regarding binary A/B/C, they are not results, they are operand values for which 32-bit SSE2 and 64-bir exp() return different results. We don't use _control87, we accept the Microsoft default which is that calculations are always rounded to 64 bits - that should give us better agreement between x87 calculations and SSE2 calculations.

Bernard · ‎05-26-2013

>>>Hi illyapolak, is _Clexp the one that gets called if you have SSE2 code generation and intrinsic functions enabled?>>>

If you could post disassembly of called exp() implementation then I could compare it with msvcrt exp() exports.

SergeyKostrov · ‎05-26-2013

>>Long double is the same as double on the Microsoft platform (i.e. 64 bits), so is not of interest to us. I see and I won't stress on it any longer >>Regarding binary A/B/C, they are not results, they are operand values for which 32-bit SSE2 and >>64-bir exp() return different results... I understand that and at the same time I did some analysis of these numbers. Please take a look at my notes because you're dealing with differences in these values and they are very close to Epsilon for double floating-point data type.

SergeyKostrov · ‎05-26-2013

[ Note 1 ] ... #define IPP_EPS52 ( 2.2204460492503131e-016 ) ... Epsilon52 = 2.2204460492503131e-016 A = 0.0066998698389698335 -> BinaryA = 0x3F7B71529D88875E B = 0.0066998698389698344 -> BinaryB = 0x3F7B71529D88875F C = 0.0066998698389698352 -> BinaryC = 0x3F7B71529D888760 D = 0.0066998698389698300 -> BinaryD = 0x3F7B71529D88875A - Note: Last 2-digits of (D) are set to 0 BinaryA = 00111111 01111011 01110001 01010010 10011101 10001000 10000111 01011110 BinaryB = 00111111 01111011 01110001 01010010 10011101 10001000 10000111 01011111 BinaryC = 00111111 01111011 01110001 01010010 10011101 10001000 10000111 01100000 BinaryD = 00111111 01111011 01110001 01010010 10011101 10001000 10000111 01011010 Numbers (A-D), (B-D), and (C-D) are less than Epsilon (!) for a Double data type: (A-D) -> ( 0.0000000000000000035 - EPS52 ) = -0.00000000000000021854460492503131 (B-D) -> ( 0.0000000000000000044 - EPS52 ) = -0.00000000000000021764460492503131 (C-D) -> ( 0.0000000000000000052 - EPS52 ) = -0.00000000000000021684460492503131 0.0000000000000000035 -> BinaryA2 = 0x3C5024121797FB36 0.0000000000000000044 -> BinaryB2 = 0x3C544A9A66CDB0D6 0.0000000000000000052 -> BinaryC2 = 0x3C57FB1390C48B2C

SergeyKostrov · ‎05-26-2013

[ Note 2 ] (A-D) = -0.00000000000000021854460492503131 -> BinaryA3 (B-D) = -0.00000000000000021764460492503131 -> BinaryB3 (C-D) = -0.00000000000000021684460492503131 -> BinaryC3 BinaryA3 = 0xBCAF7EDF6F434026 BinaryB3 = 0xBCAF5DAB2CC99279 BinaryC3 = 0xBCAF40276379DBA7 [ Note 3 ] abs(A-D) = 0.00000000000000021854460492503131 -> BinaryA4 abs(B-D) = 0.00000000000000021764460492503131 -> BinaryB4 abs(C-D) = 0.00000000000000021684460492503131 -> BinaryC4 Epsilon52 = 2.2204460492503131e-016 -> BinaryE4 BinaryA4 = 0x3CAF7EDF6F434026 BinaryB4 = 0x3CAF5DAB2CC99279 BinaryC4 = 0x3CAF40276379DBA7 Epsilon52= 0x3CB0000000000000 abs(A-D) = 0.00000000000000021854460492503131 -> BinaryA4 abs(B-D) = 0.00000000000000021764460492503131 -> BinaryB4 abs(C-D) = 0.00000000000000021684460492503131 -> BinaryC4 [ Note 4 ] 0.00000000000000021854460492503131 -> BinaryA4 0.00000000000000021764460492503131 -> BinaryB4 0.00000000000000021684460492503131 -> BinaryC4 2.2204460492503131e-016 -> BinaryE4 BinaryA4 = 00111100 10101111 01111110 11011111 01101111 01000011 01000000 00100110 BinaryB4 = 00111100 10101111 01011101 10101011 00101100 11001001 10010010 01111001 BinaryC4 = 00111100 10101111 01000000 00100111 01100011 01111001 11011011 10100111 BinaryE4 = 00111100 10110000 00000000 00000000 00000000 00000000 00000000 00000000 [ Note 5 ] 0.0000000000000001 = More accurate representation = 9.99999999999999979097786724035E-17 Epsilon52......... = More accurate representation = 2.22044604925031308084726333618E-16 ( Epsilon52 ) 0.0000000000000002 = More accurate representation = 1.99999999999999995819557344807E-16 ( Epsilon52 Truncated ) 0.0000000000000010 = More accurate representation = 1.00000000000000007770539987666E-15

SergeyKostrov · ‎05-26-2013

Please take a look at these threads: Forum topic: Mixing of Floating-Point Types ( MFPT ) when performing calculations. Does it improve accuracy? Web-link: software.intel.com/en-us/forums/topic/361134 Forum topic: Tips for Porting software to a 64-bit platform Web-link: software.intel.com/en-us/forums/topic/277738

Bernard · ‎05-26-2013

@Sergey

I am just curious do you use Win7 calculator for all those decimal->hex->binary conversions?

SergeyKostrov · ‎05-27-2013

>>...I am just curious do you use Win7 calculator for all those decimal->hex->binary conversions? No.

SergeyKostrov · ‎05-27-2013

David, Please take a look at float.h header file: ... #define DBL_DIG 15 /* # of decimal digits of precision */ ... and it means that only 15 digits in a variable of a long double type need to be used ( I know that it is really hard to follow that rule in practical applications... ). Next, I proved that rounding to 17 digits ( instead of using all 19 digits of input test values ) provides consistent results for tests with 32-bit applications compiled with different C++ compilers and MKL / IPP libraries.

SergeyKostrov · ‎05-27-2013

Here are results of my tests on a 32-bit platform and you need to complete similar tests on a 64-bit platform before making a final decision on how to solve the problem. 32-bit test application ( No Rounding for Test Values ) ... RTdouble dV11 = CrtExp( ( long double )0.0066998698389698335L ); RTdouble dV21 = CrtExp( ( long double )0.0066998698389698344L ); RTdouble dV31 = CrtExp( ( long double )0.0066998698389698352L ); CrtPrintf( RTU("V11 = %.18f\nV21 = %.18f\nV31 = %.18f\n"), dV11, dV21, dV31 ); CrtPrintf( RTU("\n") ); ... [ Intel C++ compiler ] V11 = 1.006722364175213900 V21 = 1.006722364175213900 V31 = 1.006722364175213900 [ Microsoft C++ compiler ] V11 = 1.006722364175213700 V21 = 1.006722364175213700 V31 = 1.006722364175213700 [ Borland C++ compiler ] V11 = 1.006722364175213658 Note: Provides better accuracy and doesn't round last three digits ( 658 ~= 700 ) V21 = 1.006722364175213658 V31 = 1.006722364175213658 [ MinGW C++ compiler ] V11 = 1.006722364175213700 V21 = 1.006722364175213700 V31 = 1.006722364175213700

SergeyKostrov · ‎05-27-2013

32-bit test application ( Rounding for Test Values / Last two digits are set to 0 ) ... RTdouble dV11 = CrtExp( ( long double )0.0066998698389698300L ); RTdouble dV21 = CrtExp( ( long double )0.0066998698389698300L ); RTdouble dV31 = CrtExp( ( long double )0.0066998698389698300L ); CrtPrintf( RTU("V11 = %.18f\nV21 = %.18f\nV31 = %.18f\n"), dV11, dV21, dV31 ); ... [ Intel C++ compiler ] V11 = 1.006722364175213700 V21 = 1.006722364175213700 V31 = 1.006722364175213700 [ Microsoft C++ compiler ] V11 = 1.006722364175213700 V21 = 1.006722364175213700 V31 = 1.006722364175213700 [ Borland C++ compiler ] V11 = 1.006722364175213658 Note: Provides better accuracy and doesn't round last three digits ( 658 ~= 700 ) V21 = 1.006722364175213658 V31 = 1.006722364175213658 [ MinGW C++ compiler ] V11 = 1.006722364175213700 V21 = 1.006722364175213700 V31 = 1.006722364175213700

SergeyKostrov · ‎05-27-2013

32-bit test application with MKL library ( No Rounding for Test Values ) [ Intel C++ compiler ] vdExp: dR[0] = 1.006722364175213700 dR[1] = 1.006722364175213700 dR[2] = 1.006722364175213900 vmdExp: dR[0] = 1.006722364175213700 dR[1] = 1.006722364175213700 dR[2] = 1.006722364175213900 [ Microsoft C++ compiler ] vdExp: dR[0] = 1.006722364175213700 dR[1] = 1.006722364175213700 dR[2] = 1.006722364175213900 vmdExp: dR[0] = 1.006722364175213700 dR[1] = 1.006722364175213700 dR[2] = 1.006722364175213900 [ Borland C++ compiler ] vdExp: dR[0] = 1.006722364175213658 Note: Provides better accuracy and doesn't round last three digits ( 658 ~= 700 ) dR[1] = 1.006722364175213658 dR[2] = 1.006722364175213880 Note: Provides better accuracy and doesn't round last three digits ( 880 ~= 900 ) vmdExp: dR[0] = 1.006722364175213658 dR[1] = 1.006722364175213658 dR[2] = 1.006722364175213880 [ MinGW C++ compiler ] vdExp: dR[0] = 1.006722364175213700 dR[1] = 1.006722364175213700 dR[2] = 1.006722364175213900 vmdExp: dR[0] = 1.006722364175213700 dR[1] = 1.006722364175213700 dR[2] = 1.006722364175213900

SergeyKostrov · ‎05-27-2013

32-bit test application with MKL library ( Rounding for Test Values ) [ Intel C++ compiler ] vdExp: dR[0] = 1.006722364175213700 dR[1] = 1.006722364175213700 dR[2] = 1.006722364175213700 vmdExp: dR[0] = 1.006722364175213700 dR[1] = 1.006722364175213700 dR[2] = 1.006722364175213700 [ Microsoft C++ compiler ] vdExp: dR[0] = 1.006722364175213700 dR[1] = 1.006722364175213700 dR[2] = 1.006722364175213700 vmdExp: dR[0] = 1.006722364175213700 dR[1] = 1.006722364175213700 dR[2] = 1.006722364175213700 [ Borland C++ compiler ] vdExp: dR[0] = 1.006722364175213658 Note: Provides better accuracy and doesn't round last three digits ( 658 ~= 700 ) dR[1] = 1.006722364175213658 dR[2] = 1.006722364175213658 Note: Provides better accuracy and doesn't round last three digits ( 658 ~= 700 ) vmdExp: dR[0] = 1.006722364175213658 dR[1] = 1.006722364175213658 dR[2] = 1.006722364175213658 [ MinGW C++ compiler ] vdExp: dR[0] = 1.006722364175213700 dR[1] = 1.006722364175213700 dR[2] = 1.006722364175213700 vmdExp: dR[0] = 1.006722364175213700 dR[1] = 1.006722364175213700 dR[2] = 1.006722364175213700

SergeyKostrov · ‎05-27-2013

Epxression '...Provides better accuracy and doesn't round last three digits ( 658 ~= 700 )...' means that CRT-function printf of Borland C++ compiler provides better accuracy because it doesn't use any Microsoft's CRT libraries and functions.

SergeyKostrov · ‎05-27-2013

32-bit test application with IPP library ( No Rounding for Test Values ) ... // IPPAPI( IppStatus, ippsExp_64f_A50, ( const Ipp64f a[], Ipp64f r[], Ipp32s n ) ) // IPPAPI( IppStatus, ippsExp_64f_A53, ( const Ipp64f a[], Ipp64f r[], Ipp32s n ) ) CrtPrintf( RTU("\n") ); Ipp64f a50[4] = { 0.0066998698389698335L, 0.0066998698389698344L, 0.0066998698389698352L }; Ipp64f r50[4] = { 0.0L, 0.0L, 0.0L, 0.0L }; Ipp64f a53[4] = { 0.0066998698389698335L, 0.0066998698389698344L, 0.0066998698389698352L }; Ipp64f r53[4] = { 0.0L, 0.0L, 0.0L, 0.0L }; IppStatus st = ippStsNoErr; st = ippsExp_64f_A50( &a50[0], &r50[0], 4 ); CrtPrintf( RTU("r50[0] = %.22f\nr50[1] = %.22f\nr50[2] = %.22f\n"), a50[0], a50[1], a50[2] ); st = ippsExp_64f_A53( &a53[0], &r53[0], 4 ); CrtPrintf( RTU("r53[0] = %.22f\nr53[1] = %.22f\nr53[2] = %.22f\n"), a53[0], a53[1], a53[2] ); ... [ Microsoft C++ compiler with IPP library ] r50[0] = 1.006722364175213700 r50[1] = 1.006722364175213700 r50[2] = 1.006722364175213700 r53[0] = 1.006722364175213700 r53[1] = 1.006722364175213700 r53[2] = 1.006722364175213900 [ Intel C++ compiler with IPP library ] r50[0] = 1.006722364175213700 r50[1] = 1.006722364175213700 r50[2] = 1.006722364175213700 r53[0] = 1.006722364175213700 r53[1] = 1.006722364175213700 r53[2] = 1.006722364175213700

SergeyKostrov · ‎05-27-2013

32-bit test application with IPP library ( Rounding for Test Values ) ... // IPPAPI( IppStatus, ippsExp_64f_A50, ( const Ipp64f a[], Ipp64f r[], Ipp32s n ) ) // IPPAPI( IppStatus, ippsExp_64f_A53, ( const Ipp64f a[], Ipp64f r[], Ipp32s n ) ) CrtPrintf( RTU("\n") ); Ipp64f a50[4] = { 0.0066998698389698300L, 0.0066998698389698300L, 0.0066998698389698300L }; Ipp64f r50[4] = { 0.0L, 0.0L, 0.0L, 0.0L }; Ipp64f a53[4] = { 0.0066998698389698300L, 0.0066998698389698300L, 0.0066998698389698300L }; Ipp64f r53[4] = { 0.0L, 0.0L, 0.0L, 0.0L }; IppStatus st = ippStsNoErr; st = ippsExp_64f_A50( &a50[0], &r50[0], 4 ); CrtPrintf( RTU("r50[0] = %.22f\nr50[1] = %.22f\nr50[2] = %.22f\n"), a50[0], a50[1], a50[2] ); st = ippsExp_64f_A53( &a53[0], &r53[0], 4 ); CrtPrintf( RTU("r53[0] = %.22f\nr53[1] = %.22f\nr53[2] = %.22f\n"), a53[0], a53[1], a53[2] ); ... [ Microsoft C++ compiler with IPP library ] r50[0] = 1.006722364175213700 r50[1] = 1.006722364175213700 r50[2] = 1.006722364175213700 r53[0] = 1.006722364175213700 r53[1] = 1.006722364175213700 r53[2] = 1.006722364175213700 [ Intel C++ compiler with IPP library ] r50[0] = 1.006722364175213700 r50[1] = 1.006722364175213700 r50[2] = 1.006722364175213700 r53[0] = 1.006722364175213700 r53[1] = 1.006722364175213700 r53[2] = 1.006722364175213700

SergeyKostrov · ‎05-27-2013

David, Here are results on a WIndows 7 Professional 64-bit with a 64-bit test application: [ Microsoft C++ compiler - 64-bit / VS 2008 Professional Edition / No Rounding of Test Values ] exp V11 = 1.006722364175213700 V21 = 1.006722364175213700 V31 = 1.006722364175213700 IPP r50[0] = 1.006722364175213700 r50[1] = 1.006722364175213700 r50[2] = 1.006722364175213700 r53[0] = 1.006722364175213700 r53[1] = 1.006722364175213700 r53[2] = 1.006722364175213900 MKL - vdExp dR[0] = 1.006722364175213700 dR[1] = 1.006722364175213700 dR[2] = 1.006722364175213900 MKL - vmdExp dR[0] = 1.006722364175213700 dR[1] = 1.006722364175213700 dR[2] = 1.006722364175213900

SergeyKostrov · ‎05-27-2013

Here are results on a Windows 7 Professional 64-bit with a 64-bit test application: [ Microsoft C++ compiler - 64-bit / VS 2008 Professional Edition / Rounding of Test Values ] exp V11 = 1.006722364175213700 V21 = 1.006722364175213700 V31 = 1.006722364175213700 IPP r50[0] = 1.006722364175213700 r50[1] = 1.006722364175213700 r50[2] = 1.006722364175213700 r53[0] = 1.006722364175213700 r53[1] = 1.006722364175213700 r53[2] = 1.006722364175213700 MKL - vdExp dR[0] = 1.006722364175213700 dR[1] = 1.006722364175213700 dR[2] = 1.006722364175213700 MKL - vmdExp dR[0] = 1.006722364175213700 dR[1] = 1.006722364175213700 dR[2] = 1.006722364175213700

Bernard · ‎05-28-2013

>>>Here are results on a WIndows 7 Professional 64-bit with a 64-bit test application:>>>

It seems that MS and Intel implementations use the same algorithm.Probably Horner scheme with precalculated coefficinets.

SergeyKostrov · ‎05-28-2013

David, Here is a test case with MKL functions vdExp and vmdExp: ... // _Mkl_Api( void, vdExp, ( const MKL_INT n, const double a[], double r[] ) ) // _Mkl_Api( void, vmdExp, ( const MKL_INT n, const double a[], double r[], MKL_INT64 mode ) ) double dA[4] = { 0.0066998698389698335L, 0.0066998698389698344L, 0.0066998698389698352L }; // double dA[4] = { 0.0066998698389698300L, 0.0066998698389698300L, 0.0066998698389698300L }; double dR[4] = { 0.0L, 0.0L, 0.0L, 0.0L }; ::vdExp( 4, &dA[0], &dR[0] ); printf( "dR[0] = %.18f\ndR[1] = %.18f\ndR[2] = %.18f\n", dR[0], dR[1], dR[2] ); printf( "\n" ); ::vmdExp( 4, &dA[0], &dR[0], 0 ); printf( "dR[0] = %.18f\ndR[1] = %.18f\ndR[2] = %.18f\n", dR[0], dR[1], dR[2] ); ... Note: A commented line has Rounded values.

David_C_12 · ‎05-29-2013

Here's the code I used to print the values returned by exp(x) for some ranges of interest:

#include <SDKDDKVer.h>
#include <stdio.h>
#include <tchar.h>
#include "float.h"
#include "math.h"

int _tmain(int argc, _TCHAR* argv[])
{
// Print out a sequence of adjacent double values and the corresponding values returned by exp()
//double d = -0.083296271058700000;
//const double dir = -1.0;
double d = 6.6998698389698000e-3;
const double dir = 1.0;

for (int i = 0; i < 100; ++i)
{
double e = exp(d);
printf("%20.19f %20.17f\n", d, e);
d = _nextafter(d, dir);
}
return 0;
}

When this program is run, the results for 32- and 64-bit runs differ for a few values in the range.

Regards - David