Intel® DevCloud
Help for those needing help starting or connecting to the Intel® DevCloud

Java Floating Point Arithmetic and StrictMath Method Underflow and Overflow Error Correction.


Dear Intel,

This is updated information that pertains to open case number 05478099.

Java floating point errors arise from a representation issue: the IEEE 754 binary floating point scheme is used for storage and arithmetic in the Java types that carry a dot separator (“decimal point”), namely float and double. Key class methods are affected as well, through incorrect method logic inside the java.lang.StrictMath class.

Floating point errors in Java occur in the decimal and hexadecimal “units in the last place” and beyond, past the end of digit accuracy to the right of the dot separator, as is visible from the System.out.println() method. When they occur, there is either a binary overflow or underflow. Those results, when converted back to base 10 or 16, break the contiguous range property of a ranged base 10 or base 16 number line: the represented value lands higher or lower than the digit-accurate value by an unknown magnitude. At such an error instance, the wrong mapping is produced, as a duplicate of another mapping, and the correct mapping is skipped.

The following is a version of the IEEE 754 floating point binary number formula used in Java for float and double:


It is defined by the sign of the number (positive or negative), the mantissa (data digits) of the registry in binary (m2), and the exponent of the registry in binary (e2), which is the position of the dot separator from its presumed starting point. This is all for 32 bit or 64 bit numbers in Java (float or double); those are the values of L, 32 or 64.
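To make the three fields concrete, here is a minimal sketch that pulls the sign, exponent and stored mantissa bits out of a float with Float.floatToIntBits. It uses the usual normalized binary32 layout (1 sign bit, 8 exponent bits with a bias of 127, 23 fraction bits), which is an assumption about presentation and not necessarily the exact form of the formula referred to above. The class and method names are made up for illustration.

```java
// Illustrative sketch: split a float into its IEEE 754 binary32 fields.
class FloatFields {
    // Sign bit: 0 for positive, 1 for negative.
    static int sign(float x) {
        return Float.floatToIntBits(x) >>> 31;
    }

    // Unbiased exponent of a normal value (stored exponent minus the bias of 127).
    static int exponent(float x) {
        return ((Float.floatToIntBits(x) >>> 23) & 0xFF) - 127;
    }

    // The 23 explicitly stored fraction bits of the mantissa.
    static int fraction(float x) {
        return Float.floatToIntBits(x) & 0x7FFFFF;
    }

    public static void main(String... args) {
        float x = 0.1F;
        System.out.println("sign     = " + sign(x));
        System.out.println("exponent = " + exponent(x));
        System.out.println("fraction = 0x" + Integer.toHexString(fraction(x)));

        // Rebuild a normal value from its fields:
        // (-1)^sign * (1 + fraction / 2^23) * 2^exponent
        double rebuilt = (sign(x) == 0 ? 1 : -1)
                * (1.0 + fraction(x) / (double) (1 << 23))
                * Math.pow(2, exponent(x));
        System.out.println("rebuilt as float = " + (float) rebuilt);
    }
}
```

The same decomposition works for double with Double.doubleToLongBits, an 11 bit exponent field with a bias of 1023, and 52 fraction bits.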

It is hard to see who could, in any meaningful way, need or rely on an inconsistent and chaotic number value system, particularly in base 10. This is an error, a bug that should be corrected by a Java language vendor, and even by the IEEE in the standard document that seems to be its origin.

Some people might claim that they want value approximations, noting the time reduction involved. But more important is the need for accuracy, which full and concrete thinking justifies, with no fuzziness involved.

Floating Point article published by Princeton University:
(see the section on Real-world numerical catastrophes.)

Java JDK/JRE runtimes generate floating point arithmetic or java.lang.StrictMath overflow and underflow errors within values as a result of Java source code like the example below. This happens in the default base 10 mode, and also when hexadecimal notation for base 16 numbers, denoted with 0x, is used inside float and double, the two Java homes of floating point. These are presently all error phenomena which can and ought to be fixed. The classic example follows:

//The Java Language. Arithmetic only, no comparisons.
import static java.lang.System.*;

public class Start
{
    public static void main(String ... args)
    {
        out.println("Program has started...");
        float a = 0.1F;
        float b = 0.1F;
        float c = a*b;      //float product of 0.1F and 0.1F
        out.println(c);
        double d = 0.1D;
        double e = 0.1D;
        double f = d*e;     //double product of 0.1D and 0.1D
        out.println(f);
        out.println("Program has Finished.");
    }
}

Program has started...
0.010000001
0.010000000000000002
Program has Finished.
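To see where the trailing digits come from, the following sketch prints the exact decimal expansion of the binary value that a literal such as 0.1 is stored as. It relies on the documented behaviour of the BigDecimal(double) constructor, which converts the stored binary value exactly. The class and method names are illustrative only.

```java
import java.math.BigDecimal;

// Illustrative sketch: show the exact value held by a float or double.
class StoredValue {
    // Exact decimal expansion of the binary value held by a double.
    static BigDecimal exact(double x) {
        return new BigDecimal(x);
    }

    // Exact decimal expansion of the binary value held by a float
    // (widening a float to double does not change its value).
    static BigDecimal exact(float x) {
        return new BigDecimal((double) x);
    }

    public static void main(String... args) {
        System.out.println("0.1F is stored as " + exact(0.1F));
        System.out.println("0.1D is stored as " + exact(0.1D));

        // The mathematically exact product of the two stored doubles,
        // next to the double that the * operator actually returns.
        System.out.println("exact product   = " + exact(0.1D).multiply(exact(0.1D)));
        System.out.println("rounded product = " + exact(0.1D * 0.1D));
    }
}
```

Under this reading, the literal 0.1 is already stored as a slightly different binary value before any arithmetic happens, and the product is then rounded once more to fit the type.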

The standard that Java turns to for floating point, non-integral arithmetic is IEEE standard 754. While the IEEE is the home of the MAC address worldwide, its standard document 754 doesn't specifically say anything about the base 10 or base 16 digit degradation, via overflow or underflow, that happens at the right hand side of floating point decimal and hexadecimal values, past the dot separator and past the end of accuracy. The somewhat older view is that floating point can be an approximation trade-off for range accuracy. That idea is not contained in the name, which only implies mobile positioning of the dot separator, and it has become more than problematic. It is a view which is now compromising further access to the alternative to approximation, which must carry more weight.

One of the primary views around numbers, arithmetic and functions in modern times is that binary is for computers, and denary is for human beings. An approach which mixes these two up at the same time, without maintaining separation between these concerns, only leads to logic confusion and errors. In the OpenJDK, float and double are the main offenders, however a similar problem occurs with java.lang.StrictMath method calls. Most importantly, this happens in relation to base 10, but base 16 will have the same problem within the two Java floating point types. All examples of such are logic errors that need to be repaired in either a default or mutual way, simply because denary accuracy is required out of a compiled and running Java program as efficiently as possible. The relevant computer hardware that the writer of this document has in mind is the desktop PC, running any ubiquitous operating system, or configured as a database or internet server, as of 2022: hardware and OS platforms that Java continues to install on by default.

Error workarounds are used to try to cover floating point errors in Java: BigInteger, BigDecimal, the big-math function library and similar. They introduce an entire tier to Java software which isn't needed. BigInteger and BigDecimal are slower, take more RAM, and don't allow the use of arithmetic operator source code syntax, to say nothing of the absence of an included, data accurate calculator class. BigInteger and BigDecimal also only work outside classes or interfaces. If the internals of library classes or interfaces are written naively, or in any fashion vulnerable to floating point error, and they cannot be decompiled, have inaccessible source code, or are bound to other (unknown) computer language(s), and that state of affairs won't be changing, then you are stuck with value errors that can occur and corrupt the software. These are things that developers and their programs always need good and better solutions for.

The IEEE should include or state something new in its standard 754 to encourage software language vendor(s) to implement floating point arithmetic more completely for overflow and underflow concerns. If it doesn't, while the difficulty of change grows, programmers and vendor(s) are left to act on their own somehow. Oracle and the Java Community Process have not taken up repeated bug requests on their bugs system, and have chosen not to act further despite multiple attempts to discuss the reasons and needs involved on their relevant public email lists. While it would be most appropriate for the most upstream vendor to implement these corrections, in the face of a total refusal, the best remaining option for corrections that should, and must, happen is to inquire of other vendors, which is the purpose of this document.
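For reference, the BigDecimal workaround mentioned above looks roughly like this for the 0.1 times 0.1 case. It is only a sketch; the class and method names are invented for illustration. Note that the value is constructed from a String, because constructing from a double literal would carry the binary representation along with it.

```java
import java.math.BigDecimal;

// Illustrative sketch of the BigDecimal workaround for 0.1 * 0.1.
class DecimalWorkaround {
    // Squares a decimal value given as text, using exact decimal arithmetic.
    static BigDecimal square(String decimal) {
        BigDecimal value = new BigDecimal(decimal);
        return value.multiply(value); // method call instead of the * operator
    }

    public static void main(String... args) {
        System.out.println("double:     " + (0.1D * 0.1D));
        System.out.println("BigDecimal: " + square("0.1"));
    }
}
```

This also shows the syntax cost described above: the multiplication has to be written as a method call, and every operand has to be wrapped in an object.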

Floating point correction, in the manner most commensurate with where the OpenJDK is right now (2022), can't be done with total compatibility. The entire change set for one version of Java can be contained in a separate, optional installation patch. There can be a floating point mode on/off switch for the runtime, classes, interfaces, fields, methods and operators with data. The only way to make floating point types and values range accurate without changing or removing the IEEE 754 binary mapping scheme is to augment or lengthen the 32 bit or 64 bit array by some unknown and varying amount. The impacts of that on the Java language and on hardware, but particularly on the associated default Java libraries, are simply too huge to justify, along with departing from 32 and 64 bits in strict terms. The only other alternative is to adjust the curve equation, and the binary to decimal and hexadecimal mappings that Java uses for floating point storage and arithmetic evaluation, regardless of adherence to the standard or anything that that does (and in the end doesn't) mean:

n2 = (-1)^s * m2 * 2^(e10 - (L-1)), for positive or negative e10.

At the moment, the equation treats whole numbers and fractional values in decimal or hexadecimal differently. If the curve treated fractional values and whole values exactly the same, as digits symmetrical and integral around 1 and 0, then the present representation problems will go away. There are penalties involved with that suggested approach, but they are better than escalating calculations raising further difficulties.

A) The first penalty is that the fractional value ranges will end up having to parallel the whole value ranges, which will mean a range reduction from the first state of affairs, this:


Java Primitive Floating Point Range Approximate Examples.
float, Float: 32 bits, (+/-)(1.40*10^(-45), 3.40*10^(+38))
double, Double: 64 bits, (+/-)(4.94*10^(-324), 1.79*10^(+308))

to this:

Java Primitive Floating Point Range Approximate Examples.
float, Float: 32 bits, (+/-)(3.40*10^(-38), 3.40*10^(+38))
double, Double: 64 bits, (+/-)(1.79*10^(-308), 1.79*10^(+308))
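The limits quoted in the first table can be read straight from the constants on Float and Double. The sketch below prints them together with the smallest normal values (MIN_NORMAL), which is an addition for comparison and not something the tables above list. The helper name decimalExponent is made up for illustration.

```java
// Illustrative sketch: print the range limits of float and double.
class RangeLimits {
    // Power of ten of the leading digit of a positive value, e.g. -45 for 1.4E-45.
    static int decimalExponent(double x) {
        return (int) Math.floor(Math.log10(x));
    }

    public static void main(String... args) {
        System.out.println("float  smallest positive: " + Float.MIN_VALUE);
        System.out.println("float  smallest normal:   " + Float.MIN_NORMAL);
        System.out.println("float  largest:           " + Float.MAX_VALUE);
        System.out.println("double smallest positive: " + Double.MIN_VALUE);
        System.out.println("double smallest normal:   " + Double.MIN_NORMAL);
        System.out.println("double largest:           " + Double.MAX_VALUE);

        System.out.println("float  decimal exponents: "
                + decimalExponent(Float.MIN_VALUE) + " .. " + decimalExponent(Float.MAX_VALUE));
        System.out.println("double decimal exponents: "
                + decimalExponent(Double.MIN_VALUE) + " .. " + decimalExponent(Double.MAX_VALUE));
    }
}
```

Values between MIN_VALUE and MIN_NORMAL are the subnormal (denormalized) numbers, which carry fewer significant bits than normal values; they account for the extra reach toward zero in the first table.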

With no way to count the number of floating point error base 10 results, and no unit conversion for any reciprocal, asymptotic fractions, yet by percentage calculation from the absolute value of the exponent of 10, I calculate approximately the following range losses will be involved:

Percentage (%) Value Range Loss for Floating Point Corrected Types.
float: 32 bits, approx. -15.56% range loss.
double: 64 bits, approx. -4.94% range loss.
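The working behind these percentages is not shown. One reading that reproduces both figures, offered here as an assumption, is the reduction in the absolute value of the smallest decimal exponent, taken as a fraction of its current value: (45 - 38) / 45 for float and (324 - 308) / 324 for double. The names below are illustrative.

```java
// Illustrative sketch: one way to reproduce the quoted range loss percentages.
class RangeLoss {
    // Percentage reduction when the absolute smallest decimal exponent
    // shrinks from currentExponent to proposedExponent.
    static double lossPercent(int currentExponent, int proposedExponent) {
        return 100.0 * (currentExponent - proposedExponent) / currentExponent;
    }

    public static void main(String... args) {
        System.out.printf("float:  %.2f%%%n", lossPercent(45, 38));
        System.out.printf("double: %.2f%%%n", lossPercent(324, 308));
    }
}
```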

The partial values will have to miss out, with a shorter range, but in fact an accurate range without multiple holes and false duplicates in it, instead of a flawed range that lacks internal number mappings outright and responds by issuing incorrect, repeated number maps. It can be argued that since the range is not contiguous and contains floating point errors, then in mathematical terms there is no continuous or contiguous range present in float and double. It will also be the case that in practice, when float and double values are obtained through System.out.println(), these differences will almost never be visible or relevant anyway.

B) Floating point arithmetic binary value calculation will have to change. The bit manipulation and production will be different, and in a new context of different ranges. There will be no final decimal or hexadecimal digit rounding, just sheer truncation. Performing the inverse of the previous arithmetic operation accurately should still give the exact original result, a property which is almost always there now but may need a little improvement. There is integer division and decimal division too, comparisons between positive and negative whole and decimal values, and even positive and negative decimal remainders from the relevant operator. The Java floating point arithmetic operators are:

+, -, *, /, %, +=, -=, *=, /=, %=, ++x, --x, x++, x--.
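The inverse-operation property mentioned in B) can be probed empirically with the current operators. The sketch below checks whether multiplying by y and then dividing by y returns exactly x, and counts how often that fails over a batch of pseudo-random pairs. The sample interval, seed and names are arbitrary choices made for illustration, and no particular failure count is claimed here.

```java
import java.util.Random;

// Illustrative sketch: test whether (x * y) / y gives back exactly x.
class RoundTripCheck {
    // True when multiplying by y and then dividing by y returns exactly x.
    static boolean roundTrips(double x, double y) {
        return (x * y) / y == x;
    }

    // Counts the failures among `trials` pseudo-random pairs drawn from [0.5, 1.5).
    static int countFailures(int trials, long seed) {
        Random random = new Random(seed);
        int failures = 0;
        for (int i = 0; i < trials; i++) {
            double x = 0.5 + random.nextDouble();
            double y = 0.5 + random.nextDouble();
            if (!roundTrips(x, y)) {
                failures++;
            }
        }
        return failures;
    }

    public static void main(String... args) {
        int trials = 100_000;
        System.out.println("failures: " + countFailures(trials, 42L) + " of " + trials);

        // The % operator also applies to floating point operands,
        // and the result takes the sign of the dividend.
        System.out.println("5.5 % 2.0  = " + (5.5 % 2.0));
        System.out.println("-5.5 % 2.0 = " + (-5.5 % 2.0));
    }
}
```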

I have included the increment and decrement operators. The belief is that systematic change at the key use points called upon by systematic code will solve many problems, and that it will correct other code by pulling systematic approaches along with it. The fact remains that, because of issues of fine access internal to this whole matter, via bits to or from data, some changes will not simply wash over, nor be entirely compatible from the old to the new.

C) java.lang.StrictMath has to be repaired. The decimal values that it displays via System.out.println() all seem to be accurate, although I have not checked more deeply, meaning that farther, “deep” range values might have to be fixed. But the methods create floating point errors too, somehow, and will have to be re-implemented, possibly by an appropriate C library, leveraged more carefully than at present, with multi-platform final use still in mind.
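Two small StrictMath calls of the kind that show this behaviour are sketched below. Neither is taken from the case itself; they are common illustrations, and the class and method names are invented. In the first, Math.PI is the double nearest to pi rather than pi itself, so the sine comes out as a tiny nonzero value. In the second, the square root is rounded to a double and squaring it does not return the original argument.

```java
// Illustrative sketch: StrictMath results that are not digit-exact in base 10.
class StrictMathSamples {
    // Sine of the double nearest to pi.
    static double sinOfPi() {
        return StrictMath.sin(Math.PI);
    }

    // Square of the rounded square root of x.
    static double squareOfSqrt(double x) {
        double root = StrictMath.sqrt(x);
        return root * root;
    }

    public static void main(String... args) {
        System.out.println("StrictMath.sin(Math.PI) = " + sinOfPi());
        System.out.println("sqrt(2.0) squared       = " + squareOfSqrt(2.0));
        System.out.println("sqrt(4.0) squared       = " + squareOfSqrt(4.0));
    }
}
```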

D) What does this mean for compatibility, and Java class libraries?
It can mean that bit manipulation and shift operators that are used on type converted data values from and to floating point types won't be the same for decimal and hexadecimal data; that form of programming, as its Java implementation stands now, will be separate from newer schemes, with previous, now “false facts”, about range values, and about how floating binary relates back to other numeric thresholds. The bit manipulation and shift operators themselves are fine; it's just the way that they can apply under the old way of doing things with floating point which will create distinct problems. These will have to be at least checked, for the entire standard library set of modules, classes and interfaces, and possibly updated a bit for the new ranges. Any 3rd party library classes or interfaces that use bit manipulation and shift operators to affect floating point data in an IEEE 754 presumptive manner, outside Java's own included default libraries, will have to be avoided or dealt with in a broader manner, a subject I am researching at the present time. Most of the time, 3rd parties don't do this, or the issue can be sidestepped anyhow.
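As an example of the kind of code that presumes the IEEE 754 layout, the sketch below steps a positive double to its next representable neighbour by adding 1 to its raw bit pattern. It works today only because adjacent bit patterns correspond to adjacent values under the current encoding. The class and method names are made up for illustration.

```java
// Illustrative sketch: code that depends on the IEEE 754 bit layout of double.
class RawBitsStep {
    // Next representable double above x, computed on the raw bits.
    // Assumes x is positive, finite and below Double.MAX_VALUE.
    static double nextUpViaBits(double x) {
        long bits = Double.doubleToLongBits(x);
        return Double.longBitsToDouble(bits + 1L);
    }

    public static void main(String... args) {
        double x = 0.1D;
        System.out.println("via raw bits: " + nextUpViaBits(x));
        System.out.println("Math.nextUp:  " + Math.nextUp(x));
        System.out.println("Math.ulp:     " + Math.ulp(x));
    }
}
```

Library code written in this style would be the part needing review under any change to the mapping, whereas code that goes through Math.nextUp or Math.ulp leaves the layout to the runtime.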

It means that anything around float and double that is written to file, read from file, communicated or finely processed may have to change, particularly use that presumes the IEEE arrangement very specifically. Functions named for IEEE could be ignored, since type conversions and other use functions will do the updated, requested work. They could be replaced internally, or they can be given companion methods that perform the same general task with consideration to the new state of things. Generic programming will not require change, a phenomenon this Java correction enterprise will rely on, since flexibility is a key advantage in reducing the workload.
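To illustrate the file and communication point, java.io.DataOutputStream.writeDouble is documented to write the result of Double.doubleToLongBits as 8 bytes, high byte first. The sketch below encodes a double that way into a byte array and compares the bytes with the raw bit pattern. The class and method names are illustrative only.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;

// Illustrative sketch: the bytes that DataOutputStream writes for a double.
class WireFormat {
    // Encodes x the way DataOutputStream.writeDouble does.
    static byte[] encode(double x) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeDouble(x);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Reads 8 bytes back as a big-endian long (ByteBuffer's default byte order).
    static long asLong(byte[] eightBytes) {
        return ByteBuffer.wrap(eightBytes).getLong();
    }

    public static void main(String... args) {
        byte[] encoded = encode(0.1D);
        System.out.println("bytes written: " + encoded.length);
        System.out.println("as hex:        " + Long.toHexString(asLong(encoded)));
        System.out.println("raw bits:      " + Long.toHexString(Double.doubleToLongBits(0.1D)));
    }
}
```

Any existing file or stream written this way holds the current bit patterns, so a changed mapping would need a conversion step, or a version marker, on read.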

E) Things like Serialisation and Remote Method Invocation (RMI) may or may not have to change, depending on how information gets associated and then apprehended for transmission. It depends on how they finally act, and what they do or don't presume as they read, write or process.

In Java, the ranges for float and double are asymmetrical around one, with more provision for digit fractions. By changing the equation by half, you gain the perfect mapping accuracy of the positive part of the curve in the smaller values part as well. That dispenses with the complexity, gaps and false, spurious mappings that floating point is presently heir to from flawed logic, which lead to neglect and erroneous circumstances, with no warning and in no coherent magnitude or direction. It can be efficiently repaired within equivalent syntax, with no range internal replacements being necessary at all.

Is Intel able to update the OpenJDK and JRE offerings for all platforms, to either repair at default or at switched capability, these floating point logic errors, or can they point me in the correct direction?

I would be thrilled to hear about a positive response!

3 Replies


Thank you for posting in Intel communities.

Could you please let us know whether your query is related to Intel DevCloud? Are you using Intel DevCloud?

Thank You


No, not exactly. It is possible that I have posted in the wrong area of the forum. My question here is just related to Intel. Is someone able to read this and share on what I have submitted anyway, which is my aim here, please?



Although Intel contributes to the OpenJDK implementation, the implementation is constrained by the Java Language Specification (sections 4.2.3-4.2.4 in my 2013 copy), which itself defines its floating point behavior to be based on a specific form of the IEEE 754 specification (denormalized number support, gradual underflow, round-to-nearest for arithmetic with round-toward-zero for conversions to integer, etc.).


One of the overriding design goals of Java numerics is that every implementation of Java, from simple interpreters to advanced JVMs, on tiny IoT devices or big-iron servers, will generate bit-for-bit identical results. BTW, there is a very detailed Java Compatibility Kit to verify this compatibility.


I suspect that any proposed changes to this would not only require a description of their benefits but a thorough evaluation (if not proof) of the absence of any new drawbacks, especially considering that the drawbacks of the status quo have been understood for 25+ years. Any proposed changes would have to be at the Java Specification or IEEE level and are beyond the scope of Intel’s involvement.


Another approach would be to evaluate existing techniques or develop new ones to handle these inconsistencies between values in the floating-point and integer domains. 
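One existing technique of that kind, sketched here as an illustration and not as anything Intel ships, is to compare floating point values within a tolerance measured in units in the last place (ULPs) and not with ==. The class name, method name and the tolerance of 4 ULPs are arbitrary choices.

```java
// Illustrative sketch: compare doubles within a tolerance measured in ULPs.
class UlpCompare {
    // True when a and b differ by at most maxUlps units in the last place,
    // measured at the larger of the two magnitudes. NaN never compares equal.
    static boolean nearlyEqual(double a, double b, int maxUlps) {
        if (a == b) {
            return true;
        }
        double scale = Math.max(Math.abs(a), Math.abs(b));
        return Math.abs(a - b) <= maxUlps * Math.ulp(scale);
    }

    public static void main(String... args) {
        double product = 0.1D * 0.1D;
        System.out.println("product == 0.01:              " + (product == 0.01D));
        System.out.println("nearlyEqual(product, 0.01, 4): " + nearlyEqual(product, 0.01D, 4));
    }
}
```

Other established options include keeping quantities as scaled integers in a long (for example, cents in place of fractional currency units) or using BigDecimal where exact decimal results are required.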
