I have a question about the compliance of the binary floating point library (libbfp754). In the documentation, a "Note" exists for all of the Quiet Computational Operations (abs, copy, copysign, negate) that states the following:
"When the input is a signaling NaN, two different outcomes are allowed by the standard. The operation could either signal invalid exception with quieted signaling NaN as output, or deliver signaling NaN as output without signaling any exception."
But I believe this is in contradiction with section 5.5.1 of the IEEE 754-2008 standard (not reprintable here), which explicitly states that these functions will only ever affect the sign bit and will not throw exceptions.
So far I have only seen the bfp754 library behave correctly (passing a signaling NaN through without signaling an exception) on the two systems where I have compiled and run my code. I need to know whether the other behavior (returning a quiet NaN and signaling an exception) is actually allowed by the library, which could make my code non-compliant on certain machines.
Also, I believe I've found a bug in the library, where calling __binary32_from_hexstring("0x0.000001P-126") should return a single precision representation of the minimum subnormal number 1.40129846e-45, but instead returns 0 (0x0).
Another bug I've found is that the __binary*_to_hexstring(*) functions are undefined. Using them causes the following compilation errors:
error: identifier "__binary32_to_hexstring" is undefined
error: argument of type "float" is incompatible with parameter of type "char *"
Any assistance you could provide regarding these issues with the IEEE 754-2008 library is greatly appreciated.
My attached file contains calls to all the functions I'm having issues with. I compile it with the line below.
icpc -fp-model source -fp-model except -g -O0 -std=c++0x -D__GXX_EXPERIMENTAL_CXX0X__ test.c -o cogeTest -lbfp754
The function definitions you've given are correct, and I included a call to one in my code. Those definitions, however, contradict the documentation at http://software.intel.com/sites/products/documentation/doclib/stdxe/2013/composerxe/compiler/cpp-lin/index.htm, which says that the string output is a return value and not an argument.
Is this the correct location for the most recent documentation?
Now that I know that the strings are arguments to the functions, my issues are as follows.
(1) Providing the correct hex string for the minimum subnormal number does not produce the binary 32-bit minimum subnormal number.
(2) The documentation for the Quiet Computational Operations does not match the IEEE standard, but so far the behavior of those functions does.
(3) The documentation for the *_to_*string functions is incorrect.
Could you please point me to the correct documentation and address the functional bug in (1)?
I don't see your example code. Could you upload it again?
These are compilation errors that will show up if you uncomment the part of the sample code I sent that is commented out. I'm also compiling this on Linux using the compilation command I gave above. The error you get seems like it's related to the "-fp-model source" flag. If you uncomment the code in the sample I included, the following list of errors occurs:
-bash-4.1$ icpc -fp-model source -fp-model except -g -O0 -std=c++0x -D__GXX_EXPERIMENTAL_CXX0X__ -c test.c -o test.o -lbfp754
test.c(27): error: argument of type "float" is incompatible with parameter of type "char *"
hexStr32 = __binary32_to_hexstring(rand32);
^
test.c(27): error #165: too few arguments in function call
hexStr32 = __binary32_to_hexstring(rand32);
^
test.c(27): error: a value of type "void" cannot be assigned to an entity of type "char *"
hexStr32 = __binary32_to_hexstring(rand32);
^
test.c(30): error: argument of type "float" is incompatible with parameter of type "char *"
hexStr32 = __binary32_to_hexstring(min32SubNorm);
^
test.c(30): error #165: too few arguments in function call
hexStr32 = __binary32_to_hexstring(min32SubNorm);
^
test.c(30): error: a value of type "void" cannot be assigned to an entity of type "char *"
hexStr32 = __binary32_to_hexstring(min32SubNorm);
^
test.c(35): error: a value of type "const char *" cannot be used to initialize an entity of type "char"
char min64SubNormHex = "0x0.0000000000001P-1022";
^
test.c(45): error: expression must have pointer-to-object type
hexStr64 = &(min64SubNormHex[0]);
^
test.c(49): error: argument of type "float" is incompatible with parameter of type "char *"
hexStr64 = __binary64_to_hexstring(rand64);
^
test.c(49): error #165: too few arguments in function call
hexStr64 = __binary64_to_hexstring(rand64);
^
test.c(49): error: a value of type "void" cannot be assigned to an entity of type "char *"
hexStr64 = __binary64_to_hexstring(rand64);
^
test.c(52): error: argument of type "double" is incompatible with parameter of type "char *"
hexStr64 = __binary64_to_hexstring(min64SubNorm);
^
test.c(52): error #165: too few arguments in function call
hexStr64 = __binary64_to_hexstring(min64SubNorm);
^
test.c(52): error: a value of type "void" cannot be assigned to an entity of type "char *"
hexStr64 = __binary64_to_hexstring(min64SubNorm);
Yes, that is the correct output for the part of the sample that is not commented out, and that part uses the correct syntax for the *_to_hexstring functions. That correct syntax is not what is specified in the documentation. If you uncomment the rest of the sample, which uses the syntax specified by the documentation, the errors will occur. That demonstrates issue #3 from my prior post.
The third line of output, "0x0.000001P-126 --> 0.000000", shows issue #1 from my prior post. The string "0x0.000001P-126" should produce the minimum positive subnormal floating point number, which is "1.40129846e-45", but it instead produces 0.0.
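As an independent cross-check of what that string denotes, here is a minimal sketch that uses only the standard C library rather than libbfp754. Under the C99 hexadecimal floating-point syntax each hex digit after the point carries four bits, so six digits span 24 bits and `0x0.000001` is 2^-24. The helper names `hex_value` and `min_subnormal32` are hypothetical; parsing is done in double precision so that these tiny values are represented exactly.

```c
#include <math.h>
#include <stdlib.h>

/* Value of a C99 hexadecimal floating-point string, parsed as a double.
   Doubles have enough range and precision to hold 2^-150 exactly. */
static double hex_value(const char *s)
{
    return strtod(s, NULL);
}

/* Smallest positive binary32 subnormal: 2^-149 (about 1.40129846e-45). */
static double min_subnormal32(void)
{
    return ldexp(1.0, -149);
}
```

With these helpers, `hex_value("0x0.000001P-126")` is 2^-150, exactly half of `min_subnormal32()`, while `hex_value("0x0.000002P-126")` equals `min_subnormal32()`. Converting 2^-150 to `float` under round-to-nearest-even gives 0, which would be consistent with the 0.0 result reported above.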
As an additional test, now that the _to_hexstring functions are working, the following lines of code
float min32SubNorm = 1.40129846e-45;
__binary32_to_hexstring(test,min32SubNorm);
printf("%e --> %s\n", min32SubNorm, test);
will produce the following output ...
1.401298e-45 --> 0x0.000002p-126
This indicates that the Intel FP library interprets the minimum subnormal floating point value as being represented with a significand of 2, instead of 1.
The link I use is
http://software.intel.com/sites/products/documentation/doclib/stdxe/2013/composerxe/compiler/cpp-lin/index.htm
It is the online documentation for Composer XE 2013, the chapter on the C++ compiler, and the sub-chapter of documentation referring to the Intel IEEE 754 Floating Point Conformance Library.
OK, I've further characterized the problem I've been having with the Intel FP library _to_hexstring and _from_hexstring functions. My files are attached to this post, and are compiled with the command line below.
icpc -fp-model source -fp-model except -g -O0 -std=c++0x -D__GXX_EXPERIMENTAL_CXX0X__ test.c -o cogeTest -lbfp754
The output of running the code is below
Min Pos Normal:
1.175494e-38 (0x00800000) --> 0x1.000000p-126
0x1.000000P-126 --> 1.175494e-38 (0x00800000)
Min Neg Normal:
-1.175494e-38 (0x80800000) --> -0x1.000000p-126
-0x1.000000P-126 --> -1.175494e-38 (0x80800000)
Max Pos Normal:
3.402823e+38 (0x7F7FFFFF) --> 0x1.fffffep127
0x1.7FFFFFP127 --> 2.552118e+38 (0x7F400000)
0x1.FFFFFEP127 --> 3.402823e+38 (0x7F7FFFFF)
Max Neg Normal:
-3.402823e+38 (0xFF7FFFFF) --> -0x1.fffffep127
-0x1.7FFFFFP127 --> -2.552118e+38 (0xFF400000)
-0x1.FFFFFEP127 --> -3.402823e+38 (0xFF7FFFFF)
Min Pos SubNormal:
1.401298e-45 (0x00000001) --> 0x0.000002p-126
0x0.000001P-127 --> 0.000000e+00 (0x00000000)
0x0.000002P-126 --> 1.401298e-45 (0x00000001)
Min Neg SubNormal:
-1.401298e-45 (0x80000001) --> -0x0.000002p-126
-0x0.000001P-127 --> -0.000000e+00 (0x80000000)
-0x0.000002P-126 --> -1.401298e-45 (0x80000001)
Max Pos SubNormal:
1.175494e-38 (0x007FFFFF) --> 0x0.fffffep-126
0x0.7FFFFFP-127 --> 2.938736e-39 (0x00200000)
0x0.FFFFFEP-126 --> 1.175494e-38 (0x007FFFFF)
Max Neg SubNormal:
-1.175494e-38 (0x807FFFFF) --> -0x0.fffffep-126
-0x0.7FFFFFP-127 --> -2.938736e-39 (0x80200000)
-0x0.FFFFFEP-126 --> -1.175494e-38 (0x807FFFFF)
There are a few issues ...
Issue #1: Shifting of hexadecimal bits in floating point significand
The first issue is related to how the {hexSignificand} hexadecimal digits are interpreted when converting between binary floating point formats and hexadecimal character strings. According to section 5.12.3 of the IEEE 754-2008 standard, the {hexSignificand} string of hexadecimal digits is interpreted as a character sequence described by the ISO C99 standard. The significand part of the hexadecimal character sequence is up to 6 hexadecimal characters representing the 23-bit significand, where (from the IEEE 754 standard) "the first (leftmost) character is the most significant". As you can see from the output of all of the test cases (most obviously the Min/Max Normal cases), the hexadecimal character sequences produced and accepted by the Intel FP library left-align the 23-bit significand within the 24 bits of the six hex digits, so the field appears as (0bXXXXXXXXXXXXXXXXXXXXXXX << 1). I find this confusing and bad practice, and it is not how a 23-bit number would be represented in hex according to the C99 standard that the IEEE 754 standard refers to for this type of formatting. This is apparent when looking at the hexadecimal representations of the floating point numbers I've included in the code output. A significand of all 1's, printed as hex, is 0x7FFFFF, not 0xFFFFFE. So this, as far as I can tell, conflicts with the standard. This problem exists for the single precision functions only, because in double precision the significand is 52 bits and can be cleanly packed into 13 hex digits.
Issue #2: Incorrect exponent biasing/debiasing for subnormal numbers
This issue is demonstrated in the code output pertaining to the SubNormal numbers. SubNormal numbers, as specified in the IEEE 754-2008 standard, have a biased exponent value of 0x0. The bias for binary single precision is 127, so the unbiased exponent of a SubNormal number is -127. However, the Intel FP library expects and interprets the unbiased exponent for SubNormal numbers to be -126. This problem, like the first, does not exist for double precision.
These issues could be related. Using 24 bits to represent a 23-bit significand could cause the exponent to not be handled correctly, but both issues (and the code I included) demonstrate how the *_hexstring functions do not meet the IEEE 754-2008 standard.
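As a reference point for the exponent question, IEEE 754-2008 (section 3.4) defines the value of a subnormal as 2^emin times a significand with a leading 0, where emin = 1 - bias, which is -126 for binary32. A minimal decoder written that way is sketched below; `binary32_value` is a hypothetical name, and NaN and infinity are not handled.

```c
#include <math.h>
#include <stdint.h>

/* Hypothetical decoder: exact value of a finite binary32 bit pattern,
   computed in double precision. NaN/infinity (biased == 0xFF) not handled. */
static double binary32_value(uint32_t bits)
{
    int      sign   = (bits >> 31) ? -1 : 1;
    int      biased = (int)((bits >> 23) & 0xFFu);
    uint32_t frac   = bits & 0x7FFFFFu;

    if (biased == 0) {
        /* Subnormal or zero: 0.frac * 2^emin, with emin = 1 - 127 = -126. */
        return sign * ldexp((double)frac, -126 - 23);
    }
    /* Normal: 1.frac * 2^(biased - 127). */
    return sign * ldexp((double)(frac | 0x800000u), biased - 127 - 23);
}
```

With this definition `binary32_value(0x00000001)` is 2^-149, and the largest subnormal (`0x007FFFFF`) sits exactly one step of 2^-149 below the smallest normal (`0x00800000`), so the two ranges join without a gap. Comparing the result with the hardware value of the same bit pattern is a quick way to check which exponent convention a platform uses.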
The numbers printed to the screen don't show the entire precision of the numbers being used in the code, which is a result of the default precision used in printf. The attached file displays more precision.
1.17549435e-38 is the min normal used
1.17549421e-38 is the max subnormal used
Both yield the results I've shown on binaryconvert.com as well.