Intel® Fortran Compiler

KIND for quadruple precision

eliosh
Beginner
2,224 Views
According to the IEEE 754 standard, quadruple-precision numbers have 113 bits in the mantissa (112 stored explicitly).
This corresponds to log10(2^113) = 34.016 decimal digits. However, when I try selected_real_kind(34) I receive -1.
The maximal value that is accepted is 33, i.e., for selected_real_kind(33) I get 16 as expected. Can somebody explain this behavior? Does it mean that Intel's implementation does not conform to the standard?
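Here is a minimal sketch of the test I am running (the value 16 is simply what this compiler reports for the quad kind):

program qp_kind_test
  implicit none
  ! selected_real_kind returns the kind number, or -1 if no kind has that precision
  integer, parameter :: k33 = selected_real_kind(33)
  integer, parameter :: k34 = selected_real_kind(34)
  print *, 'selected_real_kind(33) =', k33   ! prints 16 here
  print *, 'selected_real_kind(34) =', k34   ! prints -1 here
end program qp_kind_test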

Thank you.
0 Kudos
1 Solution
Steven_L_Intel1
Employee
2,224 Views
Denormal values are not considered part of the data type's "model", which is what SELECTED_REAL_KIND does, and you lose precision with such values.

View solution in original post

0 Kudos
12 Replies
eliosh
Beginner
2,224 Views
I forgot to mention that I use the 64-bit Linux compiler, l_cprof_p_11.1.059.
0 Kudos
Tim_Gallagher
New Contributor II
2,224 Views
I've found a few different sources saying that quad precision isn't actually in the IEEE 754 standard, but that it's compliant because it's derived from specific IEEE 754 elements. So the exact format isn't really given in the standard.

In the compiler documentation, it says a REAL(KIND=16) is "IEEE-style", which would indicate to me it's not spot-on the standard. I recall reading somewhere that Intel defaults to non-IEEE-compliant math, but there's an option to disable the optimized math and revert to the slower, full-precision IEEE math (I think; I'm trying to find it now). A quick search of the docs shows the following pages:

http://www.intel.com/software/products/compilers/docs/flin/main_for/mergedProjects/bldaps_for/common/bldaps_datarepov.htm

and

http://www.intel.com/software/products/compilers/docs/flin/main_for/fpops/fortran/fpops_fpnum_f.htm

The last of which says the compiler uses a close approximation to the standard.

Tim

0 Kudos
Tim_Gallagher
New Contributor II
2,224 Views
For what it's worth:

http://www.intel.com/software/products/compilers/docs/flin/main_for/fpops/fortran/fpops_flur_f.htm

Which explains that the calculation uses 112 bits, not 113, giving 33 digits.

Tim
0 Kudos
Steven_L_Intel1
Employee
2,224 Views

At the time we first implemented quad precision, IEEE 754 didn't have such a type. Our implementation does match the current IEEE 754, and 33 digits is correct. The 113th bit is always 1, so there are only 112 "changeable" bits.
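A quick sketch with the numeric inquiry intrinsics shows the model values for the quad kind (this assumes kind 16 is the quad-precision kind, as on this compiler):

program quad_model
  implicit none
  integer, parameter :: qp = selected_real_kind(33)   ! 16 with this compiler
  real(qp) :: x
  ! Inquiry intrinsics report the standard model parameters for the kind
  print *, 'digits    =', digits(x)      ! 113 significand bits
  print *, 'precision =', precision(x)   ! 33 decimal digits
  print *, 'radix     =', radix(x)       ! 2
  print *, 'range     =', range(x)       ! 4931, the decimal exponent range
end program quad_model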
0 Kudos
Hirchert__Kurt_W
New Contributor II
2,224 Views
Quoting - eliosh
According to the IEEE 754 standard, quadruple-precision numbers have 113 bits in the mantissa (112 stored explicitly).
This corresponds to log10(2^113) = 34.016 decimal digits. However, when I try selected_real_kind(34) I receive -1.
The maximal value that is accepted is 33, i.e., for selected_real_kind(33) I get 16 as expected. Can somebody explain this behavior? Does it mean that Intel's implementation does not conform to the standard?

Thank you.

As best I can tell, Intel's implementation conforms in this regard. SELECTED_REAL_KIND returns its values based on the value of the PRECISION intrinsic. For a binary machine with 113 bits in its mantissa, the formula for PRECISION is

INT((113-1)*log10(2.)) = INT(112*.301) = INT(33.7) = 33.

The difference between your formula and the standard's derives from the difference in what you mean by the decimal precision of a representation and what the standard means. The standard's formula is correct for its meaning of decimal precision. I could attempt to explain the difference, but the result would likely be a bit long and tedious, so unless someone asks for that explanation, I will skip it.

-Kurt
0 Kudos
eliosh
Beginner
2,224 Views
Quoting - hirchert

As best I can tell, Intel's implementation conforms in this regard. SELECTED_REAL_KIND returns its values based on the value of the PRECISION intrinsic. For a binary machine with 113 bits in its mantissa, the formula for PRECISION is

INT((113-1)*log10(2.)) = INT(112*.301) = INT(33.7) = 33.

The difference between your formula and the standard's derives from the difference in what you mean by the decimal precision of a representation and what the standard means. The standard's formula is correct for its meaning of decimal precision. I could attempt to explain the difference, but the result would likely be a bit long and tedious, so unless someone asks for that explanation, I will skip it.

-Kurt
Kurt, I will be glad to see your explanation.
Meanwhile, the only difference I can see between my formula and yours is using 113 vs. 112 bits.

This is also related to Steve's answer, which said "...there are only 112 'changeable' bits...". As far as I understand, the hidden bit can be considered perfectly changeable, especially since the standard allows subnormal numbers.

Thank you.

0 Kudos
Steven_L_Intel1
Employee
2,225 Views
Denormal values are not considered part of the data type's "model", which is what SELECTED_REAL_KIND does, and you lose precision with such values.
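A small sketch of that loss (assuming kind 16 is the quad kind; if denormals are flushed to zero by the floating-point options, the loss is even larger): scaling a full-precision value down into the denormal range shifts bits out of the significand, and scaling back up cannot recover them.

program denormal_loss
  implicit none
  integer, parameter :: qp = selected_real_kind(33)
  real(qp) :: x, y
  ! A normal number that needs all 113 significand bits
  x = tiny(1.0_qp) * (1.0_qp + epsilon(1.0_qp))
  ! Dividing by 2**100 pushes it into the denormal range; the significand is
  ! shifted right and the trailing bit is rounded away
  y = x / 2.0_qp**100
  ! Scaling back up cannot recover the lost bit
  print *, 'recovered exactly?', (y * 2.0_qp**100 == x)   ! prints F
end program denormal_loss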
0 Kudos
Hirchert__Kurt_W
New Contributor II
2,224 Views
Quoting - eliosh
Kurt, I will be glad to see your explanation.
Meanwhile, the only difference I can see between my formula and yours is using 113 vs. 112 bits.

This is also related to Steve's answer, which said "...there are only 112 'changeable' bits...". As far as I understand, the hidden bit can be considered perfectly changeable, especially since the standard allows subnormal numbers.

Thank you.


My formula was for 113 bits. The difference between our two formulas is that I am using the formula for floating-point decimal equivalence and you are using the formula for integers. I encourage you to look at the formula in the specification of the PRECISION intrinsic. Since you asked, I will explain later in this post the derivation of that formula.

The "hidden" bit is changeable only in a very limited exponent range, and when it is, you get it by losing the low bit and thus are less precise. For this reason, the usual way to fit IEEE number to the standard model is to ignore denormals. If you do include them, you can claim a greater exponent range, but you must claim even less precision. Since most people select numbers on the basis of precision rather than exponent range, implementors would rather claim the highes precision possible, so they don't try to fit in the values represented by the denormals.


Why is the formula for floating-point precision what it is? The intuitive definition the committee used is that a particular mantissa size can support d decimal digits if it is capable of providing distinct representations for all d-digit decimal floating-point numbers. (Note that since conceptually there are an infinite number of possible exponent values, there are an infinite number of such floating-point values.) In general, numbers expressed in a different number base cannot all be expressed exactly, so what we want is that the nearest representations of different decimal numbers be different in our mantissa. For this to happen, the representations using our mantissa must be at least as close together as the decimal numbers we need to represent.

How close are those d-digit decimal numbers? At one extreme, the difference between .999...9x10**e and .100...0x10**(e+1) is .000...1x10**e, or 10**(e-d), making the relative difference 10**(e-d)/10**e, or 10**(-d). At the opposite extreme, the difference between .100...0x10**e and .100...1x10**e is also 10**(e-d), but the numbers themselves are smaller, so the relative separation is only 10**(-(d-1)).

How close are the numbers in a mantissa with p base b digits? Using analogous logic, we see that the relative spacing ranges from b**(-p) to b**(-(p-1)).

In the general case, we cannot make assumptions about how the spacing of the decimal numbers lines up with the spacing in our mantissa, so the only way to ensure that our numbers are closer together than the decimal numbers they need to represent is to ensure that our widest relative spacing [b**(-(p-1))] is at least as small as the narrowest relative spacing of the decimal numbers [10**(-d)]. Take your log10s and you get d = INT((p-1)*log10(b)).

Astute readers will note that this isn't quite the formula in the language specification. In the special case that our mantissa has a base that is a power of 10 (especially 10 itself, but also such less likely possibilities as 100, 1000, 10000, etc.), we can guarantee how the spacing of our mantissa lines up with the spacing of decimal numbers. In particular, when our spacing is widest [b**(-(p-1))], the spacing of decimal numbers will also be at its widest [10**(-(d-1))], so we effectively get an extra decimal digit. This is the source of the "+k" in the formula in the standard. I know of no Fortran implementations on machines whose radix is a power of 10, so in practice k is always 0.
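For anyone who wants to check this numerically, here is a sketch that applies the formula to the common binary kinds and compares it with the PRECISION intrinsic (the kind numbers 4, 8, and 16 are the Intel values and are assumed here):

program precision_formula
  implicit none
  real(4)  :: s   ! IEEE single:  24-bit significand
  real(8)  :: d   ! IEEE double:  53-bit significand
  real(16) :: q   ! quad:        113-bit significand
  ! The standard's formula (with k = 0 for a binary radix) vs. the intrinsic
  print *, int((digits(s)-1)*log10(real(radix(s)))), precision(s)   !  6  6
  print *, int((digits(d)-1)*log10(real(radix(d)))), precision(d)   ! 15 15
  print *, int((digits(q)-1)*log10(real(radix(q)))), precision(q)   ! 33 33
end program precision_formula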


I warned you that the explanation of the derivation of the formula would be long and not particularly easy to read. You may not agree with the intuitive model the committee chose, but I hope you can see that the formula in the standard is the right one for the model they chose.

-Kurt
0 Kudos
TimP
Honored Contributor III
2,224 Views
Perhaps we have a language perception difference on the meaning of "intuitive." For the most part, the definition choices in P754 standard make sense to me. An exception might be the decision to make [1.0....1.999999....) the base range for the "fraction," and their failure to persuade the C standards committee to make that switch from prior practice, where a normalized fraction would be [.500000.....999999....).
Some quibbling has gone on about quad precision. The original P754 defined double extended precision, which has the same exponent format later adopted for IEEE quad precision, but a precision of 64, and no gradual underflow, as seen on x87. P854 extended the definition to general choices of precision, and at least since that time, software quad precision implementations (and hardware assisted ones, such as on Itanium) have complied with it, with the specific case of 128-bit storage, 113 bit precision being incorporated in the P754 revision proposal.

0 Kudos
Hirchert__Kurt_W
New Contributor II
2,224 Views
Quoting - tim18
Perhaps we have a language perception difference on the meaning of "intuitive."

At the very least, I was using a different sense of the meaning of "intuitive." Perhaps I should have chosen a different adjective.

What I was trying to express is that in the formulation of a large standard like a language standard, there are often unstated informal rules that serve as the basis or motivation for the stated formal rules in the standard. One simple informal rule may be the basis for formal rules that are complex, numerous, or obscure. In that sense, the informal rule is more "intuitive" than the collective formal rules on the same subject.

On the other hand, that informal rule may not be "intuitive" in the sense that it is the one your intuition would lead you to expect.

I'm sorry if this confused anyone.

-Kurt
0 Kudos
jimdempseyatthecove
Honored Contributor III
2,224 Views

>>An exception might be the decision to make [1.0....1.999999....) the base range for the "fraction," and their failure to persuade the C standards committee to make that switch from prior practice, where a normalized fraction would be [.500000.....999999....).

And this is a problem with both standards committees being stuck on declaring standards in base 10. Since the fundamental computation is performed in binary, both committees should be able to agree on the binary representation and then be free to document precision according to their own rules of declared precision.

When expressed as "1.0 .. 1.9999...." you get nn.mm digits of precision

or (each committee respectively)

When expressed as "0.5 ... .9999...." you get nn.mm digits of precision

And when expressed this way, both committees can also document the precision of the other frame of reference.
And also state that the underlying precision (binary) is the same for both languages.
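As an aside, Fortran's own numeric model already expresses the fraction in the [0.5, 1.0) form; a minimal sketch with double precision (the same holds for the other kinds):

program fraction_convention
  implicit none
  real(8) :: x
  x = 6.5d0
  ! FRACTION/EXPONENT decompose x so that x = fraction(x) * radix(x)**exponent(x),
  ! with the fraction in [0.5, 1.0) for a binary radix
  print *, fraction(x), exponent(x)                        ! 0.8125  3
  print *, fraction(x) * real(radix(x), 8)**exponent(x)    ! 6.5
end program fraction_convention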

Jim Dempsey
0 Kudos
eliosh
Beginner
2,224 Views
Thanks a lot to all who answered my question. Things are now clear to me.

0 Kudos