Software Tuning, Performance Optimization & Platform Monitoring
Discussion regarding monitoring and software tuning methodologies, Performance Monitoring Unit (PMU) of Intel microprocessors, and platform updating.

Concerns on using AVX double floating point instructions for integer data

cagribal
Beginner

Hi all,

As you might know, AVX does not provide 256-bit instructions for integer types; those are planned to arrive with AVX2. I have code written with AVX intrinsics, basically the _mm256_*_pd() variants that operate on double-precision floating-point values (the instructions I use are min, max, shuffle, blend, load, loadu, etc.). However, my data is actually integers, which I load by casting integer pointers to double pointers, i.e. __m256d reg = _mm256_loadu_pd((double*)intPtr). Functionality-wise the code seems to do what I expect, i.e. it sorts the data. However, as I haven't tested with all sorts of different data, I'm concerned whether the output will always be correct. What corner cases should I be concerned with? Will the comparisons always be correct, or are there integer values for which the AVX floating-point comparison would not work?

Thanks for comments and suggestions

Jeffrey_A_Intel
Employee
From IEEE Std 754-2008, section 5.11:
Four mutually exclusive relations are possible: less than, equal, greater than, and unordered. The last case arises when at least one operand is NaN. Every NaN shall compare unordered with everything, including itself.
Thus, comparisons involving integers whose bit pattern matches that of a floating-point NaN would be problematic.
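
To make the point above concrete, here is a minimal sketch (the helper name `is_nan_bit_pattern` is illustrative and not from the thread) that checks whether a 64-bit integer, reinterpreted bit for bit as an IEEE 754 binary64 value, would be a NaN: the exponent field is all ones and the fraction is non-zero. It assumes two's complement integers and the usual binary64 layout.

```cpp
#include <cstdint>

// Returns true when the 64 bits of `v`, reinterpreted as an IEEE 754
// binary64 value, encode a NaN: exponent field all ones, fraction non-zero.
bool is_nan_bit_pattern(std::int64_t v) {
    const std::uint64_t bits = static_cast<std::uint64_t>(v);
    const std::uint64_t exponent_mask = 0x7FF0000000000000ULL;
    const std::uint64_t fraction_mask = 0x000FFFFFFFFFFFFFULL;
    return (bits & exponent_mask) == exponent_mask &&
           (bits & fraction_mask) != 0;
}
```

Under these assumptions the affected range is not exotic: -1 is 0xFFFFFFFFFFFFFFFF, so every integer from -1 down to -(2^52 - 1) is a NaN pattern, and -(2^52) is the pattern for negative infinity.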

SergeyKostrov
Valued Contributor II
Of course you can do an int-to-double cast in order to use AVX, however...

>>...Would the comparisons will always be correct or will there be some integer values where the AVX floating point comparison
>>would not work?

I would be very careful, because your processing will depend on the limitations of the IEEE 754 standard and, as recommended in many sources, a comparison with an epsilon could be added ( expect a performance impact ). If your tests are deterministic ( no random data ), the accuracy of the processing, first based on integers and then based on doubles, can be verified once both outputs are saved. There are single- and double-precision binary format viewers on the web, and you could look at and verify how some integer values look after conversion to the double type.

SergeyKostrov
Valued Contributor II
>>...Thus, comparisons involving integers whose bit pattern matches that of a floating-point NaN would be problematic...

That looks interesting. Could you give us at least one example where some integer value is converted to a double-precision NaN value?

SergeyKostrov
Valued Contributor II
>>...Thus, comparisons involving integers whose bit pattern matches that of a floating-point NaN would be problematic...

I'm very surprised when Intel engineers make statements without any real verification ( sometimes very simple ), like:

[ Test-case ]
...
int iIsNan = 0;
double dValue = -1.0;
double dValueLn = 0.0L;
unsigned __int64 iValue = 0U;

printf( "dValue = %f\n", dValue );
printf( "dValueLn = %f\n", dValueLn );
printf( "iValue = %I64d\n", iValue );

dValueLn = CrtLog( dValue );
printf( "dValueLn = %f\n", dValueLn );
iValue = ( __int64 )dValueLn;
printf( "iValue = %I64d\n", iValue );
iIsNan = _isnan( dValueLn );
if( iIsNan == 0 )
    printf( "dValueLn is Not NaN\n" );
else
    printf( "dValueLn is NaN\n" );
dValue = ( double )iValue;
printf( "dValue = %f\n", dValue );

iValue = 9223372036854775800i64;
dValue = 0.0L;
printf( "iValue = %I64d\n", iValue );
printf( "dValue = %f\n", dValue );
dValue = ( double )iValue;
printf( "dValue = %f\n", dValue );
iIsNan = _isnan( dValue );
if( iIsNan == 0 )
    printf( "dValue is Not NaN\n" );
else
    printf( "dValue is NaN\n" );
...

[ Output ]
dValue = -1.000000
dValueLn = 0.000000
iValue = 0
dValueLn = -1.#IND00
iValue = -9223372036854775808
dValueLn is NaN
dValue = 9223372036854775800.000000
iValue = 9223372036854775800
dValue = 0.000000
dValue = 9223372036854775800.000000
dValue is Not NaN

Please let me know if you find any problems with the test-case.

Best regards,
Sergey

{ UPDATED } Fixed: printf( "iValue = %f\n", iValue ); to printf( "iValue = %I64d\n", iValue );

Patrick_F_Intel1
Employee
Hello cagribal,

I assume when you say 'integers' you mean 4-byte signed variables, so 32 bits including one sign bit. The double-precision IEEE mantissa is 53 bits plus one sign bit. If the question is whether every 32-bit integer value can be converted to double and, when converted back to integer, give back the original integer, the answer is yes. If you are just doing compares (that is, not changing the value of your converted 32-bit ints) in your AVX code, you will not get NaNs, and you will get the compare results you expect (there will be no unordered results).

Pat
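
The round-trip claim above can be sketched as follows. This is only an illustration (the function name `int32_round_trips` is made up here), and it covers conversion by value, not reinterpreting the bits through a pointer cast as in the original post.

```cpp
#include <cstdint>

// Converts a 32-bit integer to double by value and back again.
// Every int32_t fits in the 53-bit significand, so the result equals the input.
bool int32_round_trips(std::int32_t v) {
    const double d = static_cast<double>(v);
    return static_cast<std::int32_t>(d) == v;
}
```

Checking the boundary values INT32_MIN and INT32_MAX is the quickest sanity test; an exhaustive loop over all 2^32 values is also feasible.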

SergeyKostrov
Valued Contributor II
Hi everybody,

There are cases ( I detected 3 so far ) when 64-bit Integer ( boundary signed & unsigned ) and Double-Precision values do not match. Please take a look at cases 2.x:

[ Output ]
Test-Case 1
    dValue = -1.000000
    dValueLn = 0.000000
    iValue = 0
    dValueLn = -1.#IND00
    iValue = -9223372036854775808
    dValueLn is NaN
    dValue = 9223372036854775800.000000
Verifications for Boundary values ( signed and unsigned ) of 64-bit range:
Test-Case 2.1
    iValueS = 9223372036854775807
    dValue = 0.000000
    dValue = 9223372036854775800.000000
    dValue is Not NaN
Test-Case 2.2
    iValueS = -9223372036854775808
    dValue = 0.000000
    dValue = -9223372036854775800.000000
    dValue is Not NaN
Test-Case 2.3
    iValueU = 9223372036854775807
    dValue = 0.000000
    dValue = 9223372036854775800.000000
    dValue is Not NaN
Test-Case 2.4
    iValueU = 0
    dValue = 0.000000
    dValue = 0.000000
    dValue is Not NaN

I'll post the source code of my quick test later, after additional verification.

Patrick_F_Intel1
Employee
64bit integers (if the span of non-zero bits in the 64bit integer is more than 53 bits) cannot be represented without a loss of precision. That is, converting a 64bit integer to double and back to 64bit may or may not give you back the original 64bit integer, depending on how many bits are used in the original 64bit integer. But 32bit integers will be okay.
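
The "span of non-zero bits" rule above could be checked along these lines. This is a sketch under the assumption of a 53-bit significand; the name `converts_exactly` is hypothetical.

```cpp
#include <cstdint>

// Returns true when the magnitude of `v` spans at most 53 bits from its
// highest to its lowest set bit, i.e. it converts to double exactly.
bool converts_exactly(std::int64_t v) {
    // Magnitude as unsigned; 0 - m also handles INT64_MIN without overflow.
    std::uint64_t m = static_cast<std::uint64_t>(v);
    if (v < 0) m = 0 - m;
    if (m == 0) return true;
    while ((m & 1) == 0) m >>= 1;   // drop trailing zero bits
    return m < (1ULL << 53);        // remaining span fits in 53 bits
}
```

Note that large values are not automatically a problem: 1LL << 55 passes because it has a single set bit, while (1LL << 53) + 1 fails.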

SergeyKostrov
Valued Contributor II
>>...64bit integers (if the span of non-zero bits in the 64bit integer is more than 53 bits) cannot be represented without a loss of precision...

Exactly, and this is how it looks:

>>...
>>Test-Case 2.1
>>iValueS = 9223372036854775807
>>...
>>dValue = 9223372036854775800.000000
>>...

Thanks Patrick for the comment!

SergeyKostrov
Valued Contributor II
>>...What corner cases should I be concerned with?

Look at Patrick's post for the case with 32-bit integers. There are 2 generic cases when 64-bit Integer ( boundary signed & unsigned ) and Double-Precision values do not match ( 64-bit is converted to 53-bit DP, as Patrick mentioned in his post ). You need to verify some range of boundary integer values ( next to the min and max values ).

>>...Would the comparisons will always be correct or will there be some integer values where the AVX floating point comparison
>>would not work?

Yes, as long as the precision of the source integer value is not lost during the conversion. Does it make sense?

cagribal
Beginner
Hi all, thanks for your replies.

@Patrick: Actually, by integers I meant 64-bit signed integers. So, as I understand it, it is possible that some 64-bit integer has the bit pattern of a NaN and produces an incorrect result. Here are the small test cases that I'm using:

[cpp]
double NaN;
*(uint64_t *)(&NaN) = 0x7FF0000000000001;

// Test.1) Prints "NEQ : nan", as NaN != NaN
if(NaN == NaN)
    printf("EQ : %.20f\n", NaN);
else
    printf("NEQ : %.20f\n", NaN);

double x = 87.0;
// Test.2) Prints Unordered as comparison with a NaN is always Unordered
if(NaN < x)
    printf("LT\n");
else if(NaN > x)
    printf("GT\n");
else if(NaN == x)
    printf("EQ\n");
else
    printf("Unordered\n");

// Test.3) Comparisons with AVX, basically min(NaN, 10) returns NaN (?)
int64_t arr1[4] = {10, 20, 30, 40};
int64_t arr2[4] = {50, 20, 40, 10};
*(double *)(&arr2[0]) = NaN;
__m256d a = _mm256_loadu_pd((double *) arr1);
__m256d b = _mm256_loadu_pd((double *) arr2);
printf("A = "); p256i(a); // A = AVXVector: {10 ; 20 ; 30 ; 40}
printf("B = "); p256i(b); // B = AVXVector: {9218868437227405313 ; 20 ; 40 ; 10}
__m256d ret = _mm256_min_pd (a, b);
printf("MIN = "); p256i(ret); // MIN = AVXVector: {9218868437227405313 ; 20 ; 30 ; 10}
[/cpp]
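
On the question mark in Test.3: as far as I know from Intel's description of MINPD/VMINPD, each lane of _mm256_min_pd(a, b) behaves like `a < b ? a : b`, so whenever either input is a NaN the second operand is returned. The scalar model below is a sketch of that rule only; the helper names are illustrative and it does not use AVX.

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>

// Scalar model of one lane of _mm256_min_pd(a, b): the result is a only when
// a < b is true; otherwise (including when either input is NaN) it is b.
double min_pd_lane(double a, double b) {
    return a < b ? a : b;
}

// Reinterprets the 64 bits of an integer as a double without converting the value.
double bits_to_double(std::uint64_t bits) {
    double d;
    std::memcpy(&d, &bits, sizeof d);
    return d;
}
```

With this model, min(10, NaN) yields the NaN pattern, matching the MIN output above, while swapping the operands would yield 10. The result of a min or max on such data therefore depends on operand order, which matters for a sorting network.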

Patrick_F_Intel1
Employee
Hello Cagribal,

Yes, one can certainly generate double-precision NaNs from 64-bit bit patterns. And one can generate 64-bit ints which won't convert to doubles without loss of precision (such as bigint = (1LL << 55) + 1). From my old PhD days, there were whole sections dedicated to what can and can't be represented or converted and back. You will need to check that your 64-bit integer ranges do not exceed the 53-bit mantissa of the double-precision value.

Pat
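
A direct way to test a particular value such as (1LL << 55) + 1 is to convert it to double and back. The sketch below is illustrative (the name `int64_round_trips` is not from the thread); it guards the cast back because converting a double of 2^63 or more to a signed 64-bit integer is undefined behavior in C and C++.

```cpp
#include <cstdint>

// Returns true when `v` survives int64 -> double -> int64 conversion unchanged.
bool int64_round_trips(std::int64_t v) {
    const double d = static_cast<double>(v);
    // 2^63 as a double; casting a double >= 2^63 back to int64_t is undefined,
    // so treat that case as a failed round trip.
    const double two_pow_63 = 9223372036854775808.0;
    if (d >= two_pow_63 || d < -two_pow_63) return false;
    return static_cast<std::int64_t>(d) == v;
}
```

The guard is what catches INT64_MAX: under the default rounding mode it converts to exactly 2^63, which no longer fits in a signed 64-bit integer.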

SergeyKostrov
Valued Contributor II
Hi everybody,

>>...I'll post source codes of my quick test later after additional verification...

Here it is:

...
int iIsNaN = 0;

// Test-Case 1
printf( "Test-Case 1\n" );

double dValue = -1.0;
double dValueLn = 0.0L;
unsigned __int64 iValue = 0U;

printf( "\tdValue = %f\n", dValue );
printf( "\tdValueLn = %f\n", dValueLn );
printf( "\tiValue = %I64d\n", iValue );

dValueLn = CrtLog( dValue );
printf( "\tdValueLn = %f\n", dValueLn );
iValue = ( unsigned __int64 )dValueLn;
printf( "\tiValue = %I64d\n", iValue );
iIsNaN = _isnan( dValueLn );
if( iIsNaN == 0 )
    printf( "\tdValueLn is Not NaN\n" );
else
    printf( "\tdValueLn is NaN\n" );
dValue = ( double )iValue;
printf( "\tdValue = %f\n", dValue );

printf( "Verifications for Boundary values ( Signed and UnSigned ) of 64-bit range:\n" );

__int64 iValueS = 0LL;
unsigned __int64 iValueU = 0ULL;

// Test-Case 2.1
printf( "Test-Case 2.1\n" );
iValueS = ( 9223372036854775807LL );
dValue = 0.0L;
printf( "\tiValueS = %I64d\n", iValueS );
printf( "\tdValue = %f\n", dValue );
dValue = ( double )iValueS;
printf( "\tdValue = %f\n", dValue );
iIsNaN = _isnan( dValue );
if( iIsNaN == 0 )
    printf( "\tdValue is Not NaN\n" );
else
    printf( "\tdValue is NaN\n" );

// Test-Case 2.2
printf( "Test-Case 2.2\n" );
iValueS = ( -9223372036854775807LL - 1 );
dValue = 0.0L;
printf( "\tiValueS = %I64d\n", iValueS );
printf( "\tdValue = %f\n", dValue );
dValue = ( double )iValueS;
printf( "\tdValue = %f\n", dValue );
iIsNaN = _isnan( dValue );
if( iIsNaN == 0 )
    printf( "\tdValue is Not NaN\n" );
else
    printf( "\tdValue is NaN\n" );

// Test-Case 2.3
printf( "Test-Case 2.3\n" );
iValueU = ( 9223372036854775807ULL );
dValue = 0.0L;
printf( "\tiValueU = %I64d\n", iValueU );
printf( "\tdValue = %f\n", dValue );
dValue = ( double )iValueU;
printf( "\tdValue = %f\n", dValue );
iIsNaN = _isnan( dValue );
if( iIsNaN == 0 )
    printf( "\tdValue is Not NaN\n" );
else
    printf( "\tdValue is NaN\n" );

// Test-Case 2.4
printf( "Test-Case 2.4\n" );
iValueU = ( 0ULL );
dValue = 0.0L;
printf( "\tiValueU = %I64d\n", iValueU );
printf( "\tdValue = %f\n", dValue );
dValue = ( double )iValueU;
printf( "\tdValue = %f\n", dValue );
iIsNaN = _isnan( dValue );
if( iIsNaN == 0 )
    printf( "\tdValue is Not NaN\n" );
else
    printf( "\tdValue is NaN\n" );
...

SergeyKostrov
Valued Contributor II
>>...it is possible that some 64-bit integer might have bit pattern of NaN and might result in an incorrect result... I'll do a couple of tests and I'll be back. Thanks guys for that really nice discussion!

Bernard
Valued Contributor I
>>>Exactly and this is how it looks like:
>>...
>>Test-Case 2.1
>>iValueS = 9223372036854775807
>>...
>>dValue = 9223372036854775800.000000
>>...

Please bear in mind that the exact implementation of printf() (I mean the formatting performed by this function) should also be taken into account when primitive types are converted from one type to another. The best example of such a conversion, albeit not applicable to your case, is the reduction of the 80-bit long double type to 64 bits, which is performed by the MSVCRT printf() function.

Jeffrey_A_Intel
Employee
>>I'm very surprized when Intel engineers make some statements without any real verification(s)...

Perhaps you missed this part of the original post:

>>However my data is actually integers, which I load by casting integer pointers to double pointers...

If one of those "doubles" now points to 64 bits which have the long int value 92211202370041090560 (= 0x7ff8000000000000), it will be interpreted as a (quiet) NaN, and it will compare as "unordered" with any other value.

Jeffrey_A_Intel
Employee
Make that 9221120237041090560 and not 92211202370041090560.

SergeyKostrov
Valued Contributor II
>>...in mind that exact implementation of printf()(I mean here some kind of formatting performed by this function) should be also taken into account...

It affects only how the value is displayed, not how it is stored.
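
One way to separate what is stored from what printf() displays is to look at the raw encoding of the double. The helper below is a sketch (the name `double_bits` is made up for illustration) and assumes a 64-bit IEEE 754 double.

```cpp
#include <cstdint>
#include <cstring>

// Returns the raw 64-bit encoding of a double, independent of any printf formatting.
std::uint64_t double_bits(double d) {
    std::uint64_t bits;
    std::memcpy(&bits, &d, sizeof bits);
    return bits;
}
```

Under the default round-to-nearest mode, ( double )9223372036854775807 is stored as exactly 2^63 (encoding 0x43E0000000000000), that is 9223372036854775808. My reading is that the trailing "...800.000000" in the outputs earlier in the thread comes from the runtime printing a limited number of significant digits and padding with zeros, not from the stored value.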

SergeyKostrov
Valued Contributor II
>>...I'll do a couple of tests and I'll be back...

Here is a small Test-Case 1.2:

...
// Test-Case 1.2
printf( "Test-Case 1.2\n" );
unsigned __int64 iNaNIntValue = 0ULL;
// iNaNIntValue = 0x1020304050607080;
dValueLn = 0;
iNaNIntValue = 18444492273895866368ULL;    // 0xfff8000000000000 = NaN-raw-value ( binary representation )
dValueLn = ( double )iNaNIntValue;
iIsNaN = _isnan( dValueLn );
if( iIsNaN == 0 )
    printf( "\tdValueLn is Not NaN\n" );
else
    printf( "\tdValueLn is NaN\n" );
...

When debugging, this is how the variables look in the Visual Studio 'Memory' window:

[ 'double' with NaN value ]
... 00 00 00 00 00 00 f8 ff ...
[ '__int64' after assignment from 'double' with NaN value ]
... 00 00 00 00 00 00 00 80 ...

So, it looks like a developer should watch out for the value 0xfff8000000000000, or 18444492273895866368. No, and let me continue. Next, if a developer converts it back to 'double', the result is 0x43efff0000000000, or 4895411695440101376, and that is done by the C++ compiler (!). It looks like magic, but actually there are no uncertainties here, because only 53 bits (!) will be copied into the mantissa, and the part of the 64-bit integer which is "responsible" for the NaN code won't be re-created in the 'double'. So, it is not possible to create a NaN value in a double-precision variable from a 64-bit integer variable by doing a simple cast, like:

...
dValueLn = 0;
iNaNIntValue = 18444492273895866368ULL;
dValueLn = ( double )iNaNIntValue;
...

unless a developer copies these 8 bytes directly with the 'memcpy' CRT function.

SergeyKostrov
Valued Contributor II
>>...
>>unless a developer copies these 8 bytes with a 'memcpy' CRT function directly.

Something like this:

...
// Test-Case 1.3
printf( "Test-Case 1.3\n" );
void *pdValueLn = &dValueLn;
void *piNaNIntValue = &iNaNIntValue;
memcpy( ( void * )pdValueLn, ( const void * )piNaNIntValue, 8 );
iIsNaN = _isnan( dValueLn );
if( iIsNaN == 0 )
    printf( "\tdValueLn is Not NaN\n" );
else
    printf( "\tdValueLn is NaN\n" );
...

[ Output ]
...
Test-Case 1.3
    dValueLn is NaN
...

Once again, it is not possible to create a NaN value in a double-precision variable from a 64-bit integer variable by doing a simple cast.
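
The two operations being contrasted in Test-Cases 1.2 and 1.3 can be put side by side. This is a minimal sketch with made-up function names; it assumes a 64-bit IEEE 754 double.

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>

// Value conversion: produces the double nearest to the integer's numeric value.
double convert_value(std::uint64_t v) {
    return static_cast<double>(v);
}

// Bit reinterpretation: keeps the 64 bits and changes only how they are read.
// Loading integer data through a ( double * ) cast falls in this category.
double reinterpret_bits(std::uint64_t v) {
    double d;
    std::memcpy(&d, &v, sizeof d);
    return d;
}
```

For 0xfff8000000000000 the first function returns an ordinary large number and the second returns a NaN. The original post loads data with _mm256_loadu_pd((double*)intPtr), which is the second kind of operation, so NaN bit patterns remain a concern there.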

Bernard
Valued Contributor I
>>>It affects only how the value is displayed not as how it is stored.>>>

Yes, but the stored value is encoded by the compiler and/or hardware, so the compiler vendor can implement it differently. Look at the case of the Intel primitive long double type and its truncation to the 64-bit double-precision type.