Consider the following code as an example
version 14.0 of svml calculates the sin(90) as 1 while latest svml calculates it as 0.99999.
This can obviously create problems if subsequent calculations rely on the sin result to be exactly one.
Could someone please advice what is the best way to deal with these roundoff errors in newer versions of the libraries.
program sum real angle_degrees(100) real angle_radians(100) pi = 4 * atan (1.0_4) pi180 = pi/180.0 angle_degrees = 90 angle_radians = sin(angle_degrees * pi180 ) print * , angle_radians end program sum
This is the usual issue with floating point numbers. For floating point numbers you should never test for exact equality
but for equality within a given precision. E.g. something like the code below. Check the documentation on the flags for the numerical models in order
to look what the exact behavior is.
elemental function nearly_equal_real (a, b, abs_smallness, rel_smallness) result (r) logical :: r real(default), intent(in) :: a, b real(default), intent(in), optional :: abs_smallness, rel_smallness real(default) :: abs_a, abs_b, diff, abs_small, rel_small abs_a = abs (a) abs_b = abs (b) diff = abs (a - b) ! shortcut, handles infinities and nans if (a == b) then r = .true. return else if (ieee_is_nan (a) .or. ieee_is_nan (b) .or. ieee_is_nan (diff)) then r = .false. return end if abs_small = tiny_13; if (present (abs_smallness)) abs_small = abs_smallness rel_small = tiny_10; if (present (rel_smallness)) rel_small = rel_smallness if (abs_a < abs_small .and. abs_b < abs_small) then r = diff < abs_small else r = diff / max (abs_a, abs_b) < rel_small end if end function nearly_equal_real
1) the epsilon function should be used in combination with the magnitudes of the input arguments such that the precision takes into account of these magnitudes.
2) ieee_is_... intrinsic functions have significant overhead, if possible perform what you can before making these tests.
3) While one NAN is not equal to one not-NAN, two NAN's are neither equal nor unequal
But then these are relative to 1E3, 6 and 9. When you have numbers of much larger (e.g. astronomical scale) or smaller (atomic scale) then the magnitude of your tiny(s) must be futzed with. You might want to consider
tiny = min(a,b) * epsilon(zero) * YouPickNumberRelativeToLSB (e.g 2, 4, ...)