Владимир_Б_

Beginner

05-07-2015
09:18 PM

Floating-point exceptions

P.S. I used the article "What Every Computer Scientist Should Know About Floating-Point Arithmetic" by David Goldberg.

14 Replies

Bernard

Black Belt

05-12-2015
03:51 AM

Which floating-point exception occurred in your code?

Владимир_Б_

Beginner

05-12-2015
05:19 AM

Bernard

Black Belt

05-12-2015
05:34 AM

Here is a description of the inexact exception: http://docs.oracle.com/cd/E19422-01/819-3693/ncg_handle.html

Put simply, the exception is signaled when the rounded result differs from the infinitely precise result. Think of a value such as 0.3, which has no exact binary representation and can only be approximated.
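As a small illustration of the 0.3 case, here is a Python sketch. It assumes that Python's `float` is IEEE 754 binary64, which holds on mainstream platforms. `Fraction(x)` converts a float to an exact rational, so it exposes the value that is actually stored.

```python
from fractions import Fraction

stored = Fraction(0.3)       # exact value of the binary64 number nearest to 0.3
intended = Fraction(3, 10)   # the decimal value that was written in the source

print(stored == intended)    # False: 0.3 had to be rounded when it was stored
print(stored - intended)     # the exact rounding error, as a rational number

# By contrast, 0.5 is a power of two and is stored without any rounding.
print(Fraction(0.5) == Fraction(1, 2))   # True
```

Python does not expose the hardware inexact flag directly, so the comparison against an exact rational stands in for it here.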

Bernard

Black Belt

05-12-2015
05:39 AM

What are you trying to calculate?

Владимир_Б_

Beginner

05-12-2015
05:57 AM

For example, I add two numbers (0xe39d413c6f4d7d9f and 0xe39ff6e30bcff322). Based on the article I mentioned, I think the "inexact" bit should not be set, but my CPU "thinks" differently. I want to understand why.

Bernard

Black Belt

05-12-2015
06:19 AM

Владимир_Б_

Beginner

05-12-2015
06:31 AM

McCalpinJohn

Black Belt

05-12-2015
10:39 AM

I am not an expert on these issues, but I believe that the "inexact" status is raised whenever rounding causes any non-zero bits to be dropped.

Using the online conversion tool at http://babbage.cs.qc.cuny.edu/IEEE-754.old/64bit.html, I see that for the two values above, the exponents are the same and the fractional parts (with the implicit leading bit included) are:

0xe39d413c6f4d7d9f --> 11101010000010011110001101111010011010111110110011111

0xe39ff6e30bcff322 --> 11111111101101110001100001011110011111111001100100010

The sum of the fractional parts is --> 111101001110000001111101111011000111010111000011000001

Adding the fractional parts produces a carry: the sum has 54 bits, one more than the 53-bit significand can hold, so the lowest-order bit of the sum must be rounded away when the result is normalized. Since that bit is "1", the result is "inexact" whether it is rounded up or down.

Presumably the inexact status would not be raised if all of the bits that need to be dropped in the normalization step are zero.
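The bit-level argument above can be replayed with a short Python sketch. The helper names `decode` and `to_float` are made up for this illustration, and the sketch assumes that Python's `float` is IEEE 754 binary64 with round-to-nearest addition. Since Python cannot read the hardware inexact flag, the last lines compare the rounded sum with the exact rational sum instead.

```python
import struct
from fractions import Fraction

def decode(bits: int):
    """Split a binary64 bit pattern into (sign, biased exponent, significand)."""
    sign = bits >> 63
    exponent = (bits >> 52) & 0x7FF
    fraction = bits & ((1 << 52) - 1)
    # Normal numbers carry an implicit leading 1 above the 52 stored bits.
    significand = fraction | (1 << 52) if exponent != 0 else fraction
    return sign, exponent, significand

def to_float(bits: int) -> float:
    """Reinterpret a 64-bit pattern as a binary64 value."""
    return struct.unpack("<d", struct.pack("<Q", bits))[0]

A, B = 0xE39D413C6F4D7D9F, 0xE39FF6E30BCFF322
sa, ea, ma = decode(A)
sb, eb, mb = decode(B)

print(sa == sb, ea == eb)    # True True: same sign, same exponent
total = ma + mb              # so the significands can be added directly
print(total.bit_length())    # 54: one bit more than the format can hold
print(total & 1)             # 1: the bit that must be dropped is non-zero

# Cross-check against the definition: rounded sum versus exact sum.
x, y = to_float(A), to_float(B)
print(Fraction(x) + Fraction(y) != Fraction(x + y))   # True: the sum is inexact
```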

Владимир_Б_

Beginner

05-13-2015
12:45 AM

Владимир_Б_

Beginner

05-13-2015
12:49 AM

Guard bits become important if the exponents of the input operands are not the same.

Bernard

Black Belt

05-13-2015
01:08 AM

>>>In fact, these numbers are approximately equal>>>

Yes, you are right. I used the wrong IEEE 754 converter, which led me astray. Thanks for spotting the error.

McCalpinJohn

Black Belt

05-13-2015
11:58 AM

The algorithm for setting the inexact bit is not discussed in the IEEE-754 standard, but the definition is certainly clear -- a result is "inexact" if it differs from the result that would be obtained with an unbounded exponent field and an unbounded fraction field.

An algorithm that might work is: if any non-zero bits are dropped, either when shifting the input values or when normalizing the output value, then the result is presumed not to match the infinite-precision result and the inexact exception is raised.

This may not be sufficiently precise. It might be possible for the shift of the input value to drop bits in a way that exactly counteracts the effect of the normalization of the output value, leading to a "false positive" using the algorithm above. I am sure that smart people have figured out a robust way of setting this flag that does not require actually having the infinite-precision result, but it is hard to get very excited about it: the class of FP operations that *do not* produce inexact results is sufficiently small that the ability to trap on the exception is not useful very often.
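To make the "false positive" concern concrete, here is a rough Python sketch. It is one possible reading of the rule described above, not a model of any real FPU: `naive_rule` assumes a 53-bit datapath that keeps no extra bits, `add_is_inexact` applies the definition directly with exact rationals, and both helper names are invented for this example. Inputs are assumed to be finite and non-zero, and Python's `float` is assumed to be binary64.

```python
import math
from fractions import Fraction

P = 53  # binary64 significand width, implicit leading bit included

def split(x: float):
    """Return (integer significand, exponent) with abs(x) == m * 2**e."""
    m, e = math.frexp(abs(x))
    return int(m * (1 << P)), e - P

def add_is_inexact(a: float, b: float) -> bool:
    """Definition-based check: compare the rounded sum with the exact sum."""
    return Fraction(a) + Fraction(b) != Fraction(a + b)

def naive_rule(a: float, b: float) -> bool:
    """Flag the sum if any non-zero bit is dropped while aligning the smaller
    operand or while normalizing the result (no extra bits are kept)."""
    (ma, ea), (mb, eb) = split(a), split(b)
    sa, sb = (-1 if a < 0 else 1), (-1 if b < 0 else 1)
    if ea < eb:  # make (ma, ea) the operand with the larger exponent
        ma, ea, sa, mb, eb, sb = mb, eb, sb, ma, ea, sa
    shift = ea - eb
    dropped_by_alignment = (mb & ((1 << shift) - 1)) != 0
    total = abs(sa * ma + sb * (mb >> shift))
    extra = max(total.bit_length() - P, 0)
    dropped_by_normalization = (total & ((1 << extra) - 1)) != 0
    return dropped_by_alignment or dropped_by_normalization

a, b = 2.0**53, -(2.0**53 - 1)   # the exact sum is 1.0, which is representable
print(add_is_inexact(a, b))      # False: the true result is exact
print(naive_rule(a, b))          # True: the alignment shift dropped a 1 bit
```

In this pair the operands differ by one in exponent, so aligning the smaller one shifts out a single non-zero bit even though the difference itself is exactly representable.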

Bernard

Black Belt

05-17-2015
04:59 AM

I wonder how the CPU can approximate the infinitely precise "exact" result?

I think that for double-precision FP a 56-bit fractional part can represent the exact result to some degree, and when a loss of significand digits is detected during the calculation, the inexact exception can be raised.

McCalpinJohn

Black Belt

05-18-2015
08:00 AM

Not surprisingly, people have figured out how to avoid the possible "false positive" case that I mentioned above.

A readable but reasonably thorough reference is available at http://www.cs.ucla.edu/digital_arithmetic/files/ch8.pdf

This reference shows that keeping three extra bits of precision is sufficient to guarantee that all IEEE 754 rounding modes can be performed correctly and that the inexact exception can be detected unambiguously. The trick is that the 3rd extra bit (called the "sticky bit") must be the logical OR of all of the additional bits of the intermediate computation.

- For add/subtract operations the number of "additional bits" depends on how many bits the smaller argument must be shifted to the right before the operands are properly aligned for the addition. If any of the bits being shifted "off the end" are non-zero, then the "sticky bit" will be set for use in the rounding and inexact flag setting steps.
- For multiplication operations the number of "additional bits" is equal to the number of bits of each operand -- i.e., multiplying two values with "m" bit fractions will produce a "2m" bit intermediate result, but (except for the Fused Multiply-Add operation) only three extra bits need to be kept (provided that the third one is the "sticky bit").

Using the notation of the reference above, the inexact exception is raised if G+R+T=1, where "+" is read as logical OR, G and R are the two bits of the intermediate result immediately below the low-order bit of the final (normalized) result, and T is the "sticky bit" (the logical OR of all additional bits of the intermediate result).
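A minimal Python reference model of the G, R and T bits is sketched below. It is an illustration under stated assumptions rather than a description of real hardware: it forms the full-width intermediate result with arbitrary-precision integers and only then reduces it to three bits, whereas an FPU would accumulate the sticky bit while shifting. The helper names (`split`, `exact_sum`, `exact_product`, `grt`, `inexact`) are invented for this example, inputs are assumed finite, and overflow and underflow are ignored.

```python
import math

P = 53  # binary64 significand width, implicit leading bit included

def split(x: float):
    """Return (signed integer significand, exponent) with x == m * 2**e."""
    m, e = math.frexp(x)
    return int(m * (1 << P)), e - P

def exact_sum(a: float, b: float) -> int:
    """Magnitude of a + b as an exact integer at the smaller operand's scale."""
    (ma, ea), (mb, eb) = split(a), split(b)
    e = min(ea, eb)
    return abs((ma << (ea - e)) + (mb << (eb - e)))

def exact_product(a: float, b: float) -> int:
    """Magnitude of a * b as an exact integer (up to 2 * P bits wide)."""
    return abs(split(a)[0] * split(b)[0])

def grt(intermediate: int):
    """Reduce a full-width intermediate result to its (G, R, T) bits."""
    extra = max(intermediate.bit_length() - P, 0)  # bits below the result's LSB
    dropped = intermediate & ((1 << extra) - 1)
    g = (dropped >> (extra - 1)) & 1 if extra >= 1 else 0
    r = (dropped >> (extra - 2)) & 1 if extra >= 2 else 0
    # Sticky bit: logical OR of every dropped bit below G and R.
    t = int(extra > 2 and (dropped & ((1 << (extra - 2)) - 1)) != 0)
    return g, r, t

def inexact(intermediate: int) -> bool:
    g, r, t = grt(intermediate)
    return bool(g | r | t)

print(grt(exact_sum(1.0, 2.0**-53)))     # (1, 0, 0)
print(grt(exact_sum(1.0, 2.0**-54)))     # (0, 1, 0)
print(grt(exact_sum(1.0, 2.0**-60)))     # (0, 0, 1)
print(grt(exact_sum(1.0, 1.0)))          # (0, 0, 0)
print(inexact(exact_product(3.0, 5.0)))  # False: 15.0 is exact
```

The third line shows why the sticky bit matters: the small operand lies far below G and R, yet it still marks the sum as inexact.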
