- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

I just upgrade to to compiler version 19.0.0.117 from 18.0.3.210. In reviewing some of my codes I noticed small discrepancies in the results. Tracing these discrepancies led to the evaluation of the cube root function. In version 18, cube root (exponentiation to 1/3) is handled by cbrtf. In version 19 this changes to powf. This appears to be a reversal of what happened between versions 11 and 12 (see: https://software.intel.com/en-us/forums/intel-visual-fortran-compiler-for-windows/topic/279705). This simple code snippet can be used to observe the change:

real b real ans_b b= 1500. ans_b= b**(1./3.)

In version 18 the result is 11.4471426

In version 19 the result is 11.4471435

Any reason for this change? Especially since the version 18 implementation seems to give the more accurate answer.

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

Ronald W Green (Intel) wrote:As a side note, with the 19 compiler, there is no way to force the compiler to use cbrtf() IF you specify -fp-model source. You asked for it, you got it (power function).

But what if you asked for cbrtf() instead of powf()? Does this still work in 19?

module M use ISO_C_BINDING implicit none interface function cbrtf(x) bind(C,name='cbrtf') import implicit none real(C_FLOAT) cbrtf real(C_FLOAT), value :: x end function cbrtf end interface end module M program P use M implicit none real x, y x = 1500 y = cbrtf(x) write(*,'(f10.7)') y end program P

Link Copied

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

The two differ only in the least significant (i.e., 23rd) bit. They are Z'4137277F' and Z'41372780' in internal IEEE 32-bit format.

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

mecej4 wrote:The two differ only in the least significant (i.e., 23rd) bit. They are Z'4137277F' and Z'41372780' in internal IEEE 32-bit format.

In my application, consistency of the result is important, so I care when the answer changes, even by one bit, and especially when that change results in a less accurate result.

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

I think a `real` has only precision to (7 significant figures) 5 decimals in this case `11.44714` only. Anything beyond that is rubbish. If print out several more digits you can see that, compared to the true numeric answer of `11.447142425533318678080422119396770...`

Below are the more than enough digits of the above calculation using 32-bit, 63-bit and 128-bit real values

! precission result ! 32 bits11.44714260101318359375000 ! 64 bits11.44714242553331828844421 ! 128 bits11.447142425533318678080422119396768 ! ***11.447142425533318678080422119396770..

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

What is the default 'real' that you are using and the desired precision level? If you use a real(4), the answer to your example is 11.44714 and you should not expect higher precision regardless of the compiler version used. If higher precision is needed, you better adjust the precision level of your floating point variables, for example:

PROGRAM TestCubeRoot IMPLICIT NONE REAL(4) :: a, b REAL(4),PARAMETER :: onethird = 1.0/3.0 REAL(8) :: c, d REAL(8),PARAMETER :: onethird_d = 1.0D0/3.0D0 b = 1500.0 a = b**onethird d = 1500.0D0 c = d**onethird_d WRITE(*,*) "a = ", a WRITE(*,*) "c = ", c READ(*,*) END PROGRAM TestCubeRoot

gives the following results (in decimal form) with either compiler version you mention:

a = 11.44714

c = 11.4471424255333

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

The responders to this thread are missing the point. Intel changed the function used to calculate cube roots from cbrtf to powf in going from version 18 to 19. This change results in a less precise result, and that loss of precision is carried over in the code, but not seen in default print the results. In the specific case of 1500^(1/3) that difference is in the last bit of the binary representation of the solution. I have generated cases where the difference is more than 1 bit.

Saying that the answer is 11.44714 is incorrect. The answer is 11.4471426010... since the answer can't be exactly represented in binary. Unfortunately, Version 19 thinks the answer is 11.4471435547....which isn't the same number in binary and this difference propagates into future calculations. I don't think it is unreasonable to ask Intel why a repeatable and accurate method of calculating cube roots was replaced in Version 19. I use Fortran, and specifically Intel Fortran, for its efficient and accurate math functions

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

I suggest you open an issue with the Intel Online Service Center. That way you'll get an appropriate response that originates with the code generator developers. Otherwise you would be dependent on an Intel support engineer reading this and deciding on their own to open an issue. Admittedly, I used to do this all the time, but that was just me being me.

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

Michael D. wrote:The answer is 11.4471426010... since the answer can't be exactly represented in binary.

Mathematically the correct answer is `11.447142425533318678..` and not what you expect. The closest 32-bit floating point representation is `11.4471426` or `0x4137277F`.

As for as the switch on the library function for cube root, it might be a product of the compiler options. Are you sure the two compilations have the same floating point compiler settings? Are you using fast-math or strict-math?

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

Michael D., regarding the Version 19 (initial and 19.0.1.144) vs. 18 (version 18.0.1.156) compiler output, on my machine, I get consistently the same value with both versions (x64 as solution platform), namely 11.44714260101318... with 32 bit floating point (beyond intended/correct precision) for the example above.

So, are you sure that your different result is not due to different compilation / run-time settings you use with version 19 vs. 18 or a different CPU? My tests suggest that the 'powf' generated result is not worse than that from older compilers.

Edit: I can reproduce the difference when using the /fp:source switch for the floating point model (but not with 'Fast').

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

Intel has been working on getting more consistent results in our FP model. 18 was not being as strict with fp-model source. 19 is more strict.

And 19 is correct - if you specify SOURCE and your source is

a = b**x

where x is any expression, including (1./3.) you are asking for b to be raised to the x and stored in a. if x is (1./3.) the order of evaluation is (1./3.) done in SINGLE PRECISION, raise b to that power in single precision pow() call, store as single-precision to a. With SOURCE that means 'do it exactly as I have it coded'. This is a power function call.

I argue that what 18 and earlier compilers were doing was wrong. It was looking at the expression of x being 1./3. and saying "oh look, that's really a cubed root. we have a fast cubed root function. I'll do him a favor and simply call this cubed root function.". Same as if x were 1.0/2.0 the compiler can say "oh look, he's doing a square root" and replace the USER INTENDED POWER FUNCTION incorrectly with sqrtf or cbrtf. That is WRONG if you have asked for FP MODEL SOURCE.

As a side note, with the 19 compiler, there is no way to force the compiler to use cbrtf() IF you specify -fp-model source. You asked for it, you got it (power function).

and I'm sure you noticed that if you take off FP MODEL SOURCE you will get the older results. It's not calling cbrtf in the assembly I looked at, it's inlining the calculation w/o a function call of any kind.

The compiler MAY choose not to use the power function in the future, inline, or maybe someday it'll use cbrtf again if that is found to be fast for newer architectures and algorithms for cubed root. Depends on options. The compiler typically, at O2 or O3, will pick a fast method where "fast" is relative the current architectures and their instruction sets.

And, since 1./3. is imprecise, you can get the old results with (1._8/3._8) which will do 1/3 in double precision, raise b to double, call pow() with double arguments, down-convert to single to store to a. And 1/3 could be done with division or a fast lookup table. Again, it can make a difference. On a processor with fast division in hardware (modern) it'll probably do the division. Older days (div instruction slow), a table lookup might've been faster. If you don't control the compiler, it'll pick any fast method where "fast" can change over time.

The Forum users pointed out that IEEE single precision is accurate only to ~7 decimal digits. So my testcase probably looks like your code:

program sqr implicit none real :: b=1500.0 real :: ans_b ans_b= b**(1./3.) write(*,'(A10,F10.7)') "Floating: ", ans_b write(*,'(A10,Z8)' ) "Hex: ", ans_b end program sqr

IF you really need 9 digits of precision, the Fortran Standard way of getting this is similar to this form:

program sqr implicit none integer, parameter :: d9=selected_real_kind(9) real(d9) :: b=1500.0_d9 real(d9) :: ans_b ans_b= b**(1.0_d9/3.0_d9) write(*,'(A10,F10.7)') "Floating: ", ans_b write(*,'(A10,Z)' ) "Hex: ", ans_b end program sqr

Which will give you your answer to 9 decimal digits of accuracy. Yes, it selects KIND=8 because that is what you need to guarantee 9 decimal digits of accuracy. This makes it very clear to the compiler and future users your intended accuracy requirements. I would add that using the Fortran Standard kind method above protects your code from compiler version changes, compiler vendor changes, and architecture changes.

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

Ronald W Green (Intel) wrote:As a side note, with the 19 compiler, there is no way to force the compiler to use cbrtf() IF you specify -fp-model source. You asked for it, you got it (power function).

But what if you asked for cbrtf() instead of powf()? Does this still work in 19?

module M use ISO_C_BINDING implicit none interface function cbrtf(x) bind(C,name='cbrtf') import implicit none real(C_FLOAT) cbrtf real(C_FLOAT), value :: x end function cbrtf end interface end module M program P use M implicit none real x, y x = 1500 y = cbrtf(x) write(*,'(f10.7)') y end program P

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

Using the /fp: source switch, here is more clear example of the differences between compiler version 18 and 19, where 7 decimal digit repeatability is not maintained:

x = 1.5374178E-14

y = x**(1./3.)

in 18: y = 0.24865507E-04 (HEX: 37D09645)

in 19: y = 0.24865498E-04 (HEX: 37D09640)

Note, it is also a 5 binary bit change in the results.

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

explicitly calling cbrtf from Fortran: yes, this will call will be left as-is and honored by any version of the compiler. The Fortran compiler cannot (should not) replace your call, and it will NOT inline the C function inside the Fortran caller. This is an excellent way to future-proof your application.

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

the literal you have "1.5374178E-14" is using 8 decimals, as you have to count the digits to the left of the decimal point as well. And keep in mind, 32-bit IEEE numbers are accurate to "approximately 7 digits of precision". If you print*, selected_real_kind(7) you will find that compilers will tell you that if you need exactly 7 digits OR MORE of accuracy they will use real kind=8 (or the kind number for double precision which can vary by compiler vendor: 8 for most, but kind=2 for NAG )

Your example, for example, gfortran 7.3 gives the same result as Intel 19, and without any fp-model:

gfortran -O0 -o ex2_g73 ex2.f90

rwgreen@orcsle147:~/quad/04018265$ ./ex2_g73

x 288A7A71

y 37D09640

This should give you concern. If you really need the accuracy of 7 digits, remember my selected real kind example - if you use selected_real_kind(7), this will force everything to double precision so you get at least 7 digits of accuracy ( instead of "approximately 7 digits" in 32-bit FP representation). Is there a reason you're not using double precision? You seem very concerned about getting 7-8 digits of accuracy.

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

In thinking about this some more, your solution to call cbrtf() is best. The problem is 1.0/3.0 is not cannot be represented in a finite number of digits, base 10 or base 2. Thus, raising something to this imprecision value is inherently imprecise.

Your example of using ISO_C_BINDING to cbrtf() is best. You want a cubed root. You should ask for a cubed root, not raising something to the 1/3 power.

AND you example is portable. Here is gfortran 7.3 again with your example

$ cat best.f90

module M

use ISO_C_BINDING

implicit none

interface

function cbrtf(x) bind(C,name='cbrtf')

import

implicit none

real(C_FLOAT) cbrtf

real(C_FLOAT), value :: x

end function cbrtf

end interface

end module M

program P

use M

implicit none

real x, y

x = 1500.0

y = cbrtf(x)

write(*,'(f10.7)') y

end program P

rwgreen@orcsle147:~/quad/04018265$ !gfortran

gfortran -o best best.f90

rwgreen@orcsle147:~/quad/04018265$ ./best

11.4471426

Right? this is what you want.

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

Repeat Offender wrote:But what if you asked for cbrtf() instead of powf()? Does this still work in 19?

module M use ISO_C_BINDING implicit none interface function cbrtf(x) bind(C,name='cbrtf') import implicit none real(C_FLOAT) cbrtf real(C_FLOAT), value :: x end function cbrtf end interface end module M program P use M implicit none real x, y x = 1500 y = cbrtf(x) write(*,'(f10.7)') y end program P

Thank you for this solution. While it involves a small amount of recoding, it efficiently gives the correct result and a result that appears to be consistent with that obtained in compiler version 18. Still can't understand how /fp:source concludes that x**(1./.3) means anything but take the cube root, but I guess that is why I am not a compiler developer.

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

Michael D. wrote:

Still can't understand how /fp:source concludes that x**(1./.3) means anything but take the cube root, but I guess that is why I am not a compiler developer.

You probably meant to write x**(1./3.), not x**(1./.3).

1) At compilation time, it is not known whether x is positive or negative. The expression (1./3.) is treated just the same as any other real valued expression. No special recognition is given for its being the reciprocal of an odd integer.

2) Therefore, the compiler has to emit code that will work for x >=0 as well as for x < 0.

3) If the exponent b is not the exact reciprocal of an odd integer, the expression x**b is undefined when x is negative, unless you move from the field of real numbers into the complex plane. For example, the expression (-27.0)**b does not have a real value (in the mathematical sense) when b = 0.33333333 or b = 0.33333334. The only value of b for which the expression has a real value, namely, the reciprocal of 3, does not have an exact representation in any 2-base or 10-base floating point representation; as stated in 1), no special case rules are known or applied.

The question has less to do with compiler development than with basic mathematics. Perhaps the following example program can help. Please review the code and form a conclusion as to what it should print. THEN, compile and run the program and compare the output to your expectations.

program xcubrt implicit none real :: third = 1.0/3.0 complex :: z ! z=(-1.0,0.0) z=z**third print *,z end program

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

Here is what I hope is a constructive suggestion towards attaining the goals exposed in this thread. Instead of using special functions such as CUBRT(x) or CBRTF(x),

1. Use the '**' operator with the absolute value of the argument x.

2. Affix the proper sign to the result.

3. Polish the result with a single step of Newton's method.

Here is example code to obtain the seventh root of -2187.

program sevrt implicit none real :: seventh = 1.0/7.0, y = -2187.0, z, z2, z6, z7 ! z = sign(abs(y)**seventh, y) z2 = z*z z6 = z2*z2*z2 z7 = z6*z print '(ES16.8)',z z = (6*z7 + y)/(7*z6) ! single step of Newton's method print '(ES16.8)',z end program

The output of this program:

-3.00000024E+00 -3.00000000E+00

The last bit of the result (in its IEEE 32-bit representation) from the ** operation is 1. The one-step Newton correction changes that bit to 0.

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

mecej4 wrote:Here is what I hope is a constructive suggestion towards attaining the goals exposed in this thread. Instead of using special functions such as CUBRT(x) or CBRTF(x),

1. Use the '**' operator with the absolute value of the argument x.

2. Affix the proper sign to the result.

3. Polish the result with a single step of Newton's method.

Here is example code to obtain the seventh root of -2187.

program sevrt implicit none real :: seventh = 1.0/7.0, y = -2187.0, z, z2, z6, z7 ! z = sign(abs(y)**seventh, y) z2 = z*z z6 = z2*z2*z2 z7 = z6*z print '(ES16.8)',z z = (6*z7 + y)/(7*z6) ! single step of Newton's method print '(ES16.8)',z end programThe output of this program:

-3.00000024E+00 -3.00000000E+00The last bit of the result (in its IEEE 32-bit representation) from the ** operation is 1. The one-step Newton correction changes that bit to 0.

Brilliant demonstration

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

program sevrt USE IFPORT implicit none integer i,j INTEGER, PARAMETER :: dp = selected_real_kind(15, 307) real(KIND=dp) :: seventh = 1.0d0/7.0d0, y = -2187.0, z, z2, z6, z7 REAL (KIND=dp) :: six = 6.0d0, START_TIME, STOP_TIME REAL (KIND=dp) :: seven = 7.0d0, Thousand = 1000.0D0 START_TIME = DCLOCK() ! z = sign(abs(y)**seventh, y) z2 = z*z z6 = z2*z2*z2 z7 = z6*z print '(ES16.8)',z z = (six*z7 + y)/(seven*z6) ! single step of Newton's method print '(ES16.8)',z STOP_TIME = DCLOCK() write(*,100) (STOP_TIME - START_TIME) 100 Format('Program took:',E14.6, ' seconds.') end program

-3.00000000E+00 -3.00000000E+00 Program took: 0.000000E+00 seconds. Press any key to continue . . .

- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content

Nichols, John wrote:.. Brilliant demonstration

@Nichols, John:

Forgot to include some emoticons to go with your comment?! Because otherwise, it would indicate you really meant it which would be strange because @mecej4 in the suggestion to use Newton correction seems to have intently meant to convey the exact opposite of your comment as (perhaps) "homage" to some of the absurdities in this thread, considering the Newton suggestion.requires **9 additional operations (7 multiplication, 1 division, 1 addition) operations, 3 additional variables, 2 magic numbers, and a sequence of steps, all of which would require encapsulation in a procedure if it is to be reused for the seventh root case but which requires further algebra if it's to be extended generally, and this just so that the coder is advised by @mecej4 not to use an available, robust library function implementation!! **and not to overlook the accompanying example was different from the current use case which seems to have settled to the one in Quote #12 (again perhaps as misdirection).

Starting with the original post where no clearly reproducible case was presented (see Quote #9 that was unanswered) and where the numeric analysis offered by Ronald Green at Intel and other effort to describe computations taking into account the accuracy requirements vis-a-vis the floating-point representations and so forth have been ignored, it's a real messy thread.

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page