- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
I just upgrade to to compiler version 19.0.0.117 from 18.0.3.210. In reviewing some of my codes I noticed small discrepancies in the results. Tracing these discrepancies led to the evaluation of the cube root function. In version 18, cube root (exponentiation to 1/3) is handled by cbrtf. In version 19 this changes to powf. This appears to be a reversal of what happened between versions 11 and 12 (see: https://software.intel.com/en-us/forums/intel-visual-fortran-compiler-for-windows/topic/279705). This simple code snippet can be used to observe the change:
real b real ans_b b= 1500. ans_b= b**(1./3.)
In version 18 the result is 11.4471426
In version 19 the result is 11.4471435
Any reason for this change? Especially since the version 18 implementation seems to give the more accurate answer.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Ronald W Green (Intel) wrote:As a side note, with the 19 compiler, there is no way to force the compiler to use cbrtf() IF you specify -fp-model source. You asked for it, you got it (power function).
But what if you asked for cbrtf() instead of powf()? Does this still work in 19?
module M use ISO_C_BINDING implicit none interface function cbrtf(x) bind(C,name='cbrtf') import implicit none real(C_FLOAT) cbrtf real(C_FLOAT), value :: x end function cbrtf end interface end module M program P use M implicit none real x, y x = 1500 y = cbrtf(x) write(*,'(f10.7)') y end program P
Link Copied
- « Previous
-
- 1
- 2
- Next »
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
I'm wondering why the OP is not using 8-byte reals.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
gib wrote:
I'm wondering why the OP is not using 8-byte reals.
You are not alone! Whilst there is some interesting discussion in this thread the fact is the compiler is giving good SP results IMO. If this precision is not enough then use DP, simples.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Nichols, John wrote:
..Program took: 0.000000E+00 seconds. Press any key to continue . . .
On modern computers, such single calculation checks can hardly reveal anything useful. Now if one is looking at different solvers for particular calculations/simulations, a systematic approach toward computational analyses as outlined in almost any good book on numerical methods and/or algorithms will be necessary. But when one is considering as basic as options for a cube root calculation, there are few performant alternatives to CBRT the C library function, if any at all. However any modifications/tweaks with the POWF (as with Fortran's '**' operator) are going to result in a few additional floating point operations but they are hardly going to be noticeable unless they are consumed in highly compute intensive circumstances in which case the single calculation check in Quote #20 will not help, one could consider more thorough profiling using tools geared for such analysis or at the very least "brute force" the analysis by "instrumenting" a bunch of code to perform various "runs" of differing problem sizes where each run is combined with a number of trials and some form of a statistical "median" in the measured computational times is captured for a comparative study. See an example below which shows all the trivial but time-consuming wiring one has to do even for a simple-minded CPU timing study:
program p use, intrinsic :: iso_c_binding, only : wp => c_float use, intrinsic :: iso_fortran_env, only : dp => real64 implicit none interface function cbrtf( x ) result(r) bind(C, name="cbrtf") use, intrinsic :: iso_c_binding, only : c_float implicit none real(c_float), intent(in), value :: x ! Function result real(c_float) :: r end function cbrtf function cbrtd( x ) result(r) bind(C, name="cbrt") use, intrinsic :: iso_c_binding, only : c_double implicit none real(c_double), intent(in), value :: x ! Function result real(c_double) :: r end function cbrtd end interface interface cbrt procedure cbrtf procedure cbrtd end interface integer i, j, k integer, parameter :: NVALS(*) = [( 100000*i, i=1,20 )] integer, parameter :: NTRIALS = 20 integer :: NumSmallX real(wp), allocatable :: x(:), y1(:), y2(:) real(dp) :: tim1(NTRIALS), tim2(NTRIALS) logical :: ltim(NTRIALS) real(dp) :: AveTim1(size(NVALS)), AveTim2(size(NVALS)) real(dp) :: START_TIME, STOP_TIME print *, "Cube Root Calculations: Computational Time" print "(g0,t10,g0,t26,g0,t51,g0)", "Run", "Num Calcs", "Exp with Newton Corr.", "CBRT" print "(t26,g0,t51,g0)", "CPU Time (seconds)", "CPU Time (seconds)" print "(t26,g0,t51,g0)", "per 1E6 Calcs", "per 1E6 Calcs" Loop_Nvals: do k = 1, size(NVALS) allocate( x(NVALS(k)), y1(NVALS(k)), y2(NVALS(k)) ) Loop_Ntrials: do j = 1, NTRIALS call loadvals( x, expnt=3 ) ! Cube root with '**' operator plus Newton correction blk_root3: block real(wp), parameter :: ONE = 1.0_wp real(wp), parameter :: TWO = 2.0_wp real(wp), parameter :: THREE = 3.0_wp real(wp), parameter :: THIRD = ONE/THREE real(wp), parameter :: TWOTHIRD = TWO/THREE real(wp), parameter :: EPS_X = epsilon(ONE)**THIRD call cpu_time(START_TIME) do i = 1, size(x) y1(i) = sign(abs(x(i))**THIRD, x(i)) !<-- Allow cube root of negative values ! Apply correction using a single step of Newton's method if ( abs(x(i)) > EPS_X ) then ! y1(i) = TWOTHIRD*y1(i) + x(i)/(THREE*y1(i)*y1(i)) end if end do call cpu_time(STOP_TIME) tim1(j) = STOP_TIME - START_TIME end block blk_root3 ! Cube root using C math library function CBRT call cpu_time(START_TIME) do i = 1, size(x) y2(i) = cbrt(x(i)) end do call cpu_time(STOP_TIME) tim2(j) = STOP_TIME - START_TIME end do Loop_Ntrials ltim = .true. ltim(minloc(tim1)) = .false. ; ltim(maxloc(tim1)) = .false. !<-- Throw out best and worst times AveTim1(k) = sum(tim1, mask=ltim)/real(NTRIALS-2, kind=kind(AveTim1))*1E6_dp/ & real(NVALS(k), kind=kind(AveTim1)) ltim = .true. ltim(minloc(tim2)) = .false. ; ltim(maxloc(tim2)) = .false. !<-- Throw out best and worst times AveTim2(k) = sum(tim2, mask=ltim)/real(NTRIALS-2, kind=kind(AveTim2))*1E6_dp/ & real(NVALS(k), kind=kind(AveTim2)) print "(g0,t10,g0,t25,1pf0.3,t50,1pf0.3)", k, NVALS(k), AveTim1(k), AveTim2(k) ! Print some results if ( k == size(NVALS) ) then print *, "Some Results from last run" print "(g0,t22,g0,t37,g0,t57,g0)", "i", "x", "y (Newton Corr.)", "y (CBRT)" NumSmallX = 0 do i = 1, size(x) if ( mod(i,size(x)/10) == 0 ) then print "(i0,t15,f15.7,t30,f15.7,t50,f15.7)", i, x(i), y1(i), y2(i) else if ( x(i) < 0.1_wp ) then if ( NumSmallX < 10 ) then print "(i0,t15,f15.7,t30,f15.7,t50,f15.7)", i, x(i), y1(i), y2(i) NumSmallX = NumSmallX + 1 end if end if end do end if deallocate( x, y1, y2 ) end do Loop_Nvals stop contains subroutine loadvals( x, expnt ) ! Load values into data array with an arbitrary approach ! Pseudo random number generation plus arbitrary scaling real(wp), intent(inout) :: x(:) integer, intent(in) :: expnt call random_number( x ) where ( x > 0.5_wp ) x = real( int(x*100.0_wp)**expnt, kind=kind(x)) else where ( x > 0.1_wp ) x = -real( int(x*100.0_wp)**expnt, kind=kind(x)) end where return end subroutine end program
Now when this code is run with the 2 versions of Intel Fortran compilers mentioned by OP, the output is:
C:\Temp>ifort /fp:source /O2 /heap-arrays0 p.f90 Intel(R) Visual Fortran Intel(R) 64 Compiler for applications running on Intel(R ) 64, Version 18.0.3.210 Build 20180410 Copyright (C) 1985-2018 Intel Corporation. All rights reserved. Microsoft (R) Incremental Linker Version 14.14.26433.0 Copyright (C) Microsoft Corporation. All rights reserved. -out:p.exe -subsystem:console p.obj C:\Temp>p.exe Cube Root Calculations: Computational Time Run Num Calcs Exp with Newton Corr. CBRT CPU Time (seconds) CPU Time (seconds) per 1E6 Calcs per 1E6 Calcs 1 100000 .087 .000 2 200000 .087 .000 3 300000 .058 .029 4 400000 .065 .022 5 500000 .087 .035 6 600000 .043 .058 7 700000 .111 .000 8 800000 .076 .054 9 900000 .096 .000 10 1000000 .061 .061 11 1100000 .079 .063 12 1200000 .079 .058 13 1300000 .060 .080 14 1400000 .068 .062 15 1500000 .075 .064 16 1600000 .087 .060 17 1700000 .087 .061 18 1800000 .072 .043 19 1900000 .078 .068 20 2000000 .078 .065 Some Results from last run i x y (Newton Corr.) y (CBRT) 1 -1000.0000000 -10.0000000 -10.0000000 2 -8000.0000000 -20.0000000 -20.0000000 6 0.0280244 0.3037472 0.3037472 10 -59319.0000000 -39.0000000 -39.0000000 11 -79507.0000000 -43.0000000 -43.0000000 15 0.0987109 0.4621557 0.4621557 18 0.0132428 0.2365883 0.2365883 21 -39304.0000000 -34.0000000 -34.0000000 23 -9261.0000000 -21.0000000 -21.0000000 24 -4913.0000000 -17.0000000 -17.0000000 200000 571787.0000000 83.0000000 83.0000000 400000 -1000.0000000 -10.0000000 -10.0000000 600000 216000.0000000 60.0000000 60.0000000 800000 -42875.0000000 -35.0000000 -35.0000000 1000000 -17576.0000000 -26.0000000 -26.0000000 1200000 804357.0000000 93.0000000 93.0000000 1400000 -10648.0000000 -22.0000000 -22.0000000 1600000 571787.0000000 83.0000000 83.0000000 1800000 -54872.0000000 -38.0000000 -38.0000000 2000000 -85184.0000000 -44.0000000 -44.0000000 C:\Temp>
C:\Temp>ifort /fp:source /O2 /heap-arrays0 p.f90 Intel(R) Visual Fortran Intel(R) 64 Compiler for applications running on Intel(R ) 64, Version 19.0.1.144 Build 20181018 Copyright (C) 1985-2018 Intel Corporation. All rights reserved. Microsoft (R) Incremental Linker Version 14.14.26433.0 Copyright (C) Microsoft Corporation. All rights reserved. -out:p.exe -subsystem:console p.obj C:\Temp>p.exe Cube Root Calculations: Computational Time Run Num Calcs Exp with Newton Corr. CBRT CPU Time (seconds) CPU Time (seconds) per 1E6 Calcs per 1E6 Calcs 1 100000 .347 .000 2 200000 .217 .043 3 300000 .173 .058 4 400000 .152 .065 5 500000 .173 .087 6 600000 .173 .000 7 700000 .136 .099 8 800000 .173 .065 9 900000 .173 .077 10 1000000 .156 .061 11 1100000 .150 .055 12 1200000 .173 .079 13 1300000 .153 .093 14 1400000 .173 .093 15 1500000 .156 .087 16 1600000 .184 .098 17 1700000 .189 .092 18 1800000 .193 .072 19 1900000 .178 .073 20 2000000 .173 .078 Some Results from last run i x y (Newton Corr.) y (CBRT) 1 -1000.0000000 -10.0000000 -10.0000000 2 -8000.0000000 -20.0000000 -20.0000000 6 0.0280244 0.3037472 0.3037472 10 -59319.0000000 -39.0000000 -39.0000000 11 -79507.0000000 -43.0000000 -43.0000000 15 0.0987109 0.4621557 0.4621557 18 0.0132428 0.2365883 0.2365883 21 -39304.0000000 -34.0000000 -34.0000000 23 -9261.0000000 -21.0000000 -21.0000000 24 -4913.0000000 -17.0000000 -17.0000000 200000 571787.0000000 83.0000000 83.0000000 400000 -1000.0000000 -10.0000000 -10.0000000 600000 216000.0000000 60.0000000 60.0000000 800000 -42875.0000000 -35.0000000 -35.0000000 1000000 -17576.0000000 -26.0000000 -26.0000000 1200000 804357.0000000 93.0000000 93.0000000 1400000 -10648.0000000 -22.0000000 -22.0000000 1600000 571787.0000000 83.0000000 83.0000000 1800000 -54872.0000000 -38.0000000 -38.0000000 2000000 -85184.0000000 -44.0000000 -44.0000000 C:\Temp>
In addition to the obvious point as stated upthread a library function for cube root should be a no-brainer for such a specific computational need compared to the more generic exponentiation operation and so any additional "correction" operations will incur further cost, however its impact is relative to one's needs.
But now if Ronald Green or other Intel staff are looking at this, they can try to explain the considerable differences in CPU times with "**" at larger problem sizes, as shown above, with Intel Fortran compiler 19.0 update 1 relative to 18.0 update 3 when /fp:source compiler option is in effect. Both the cases should be using the same power function call with single precision (C_FLOAT), so why is there such a difference?
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Hi FortranFan,
Am I wrong to be surprised by the occurrence of CBRT times of .000? What do you see when you write more decimal places?
Cheers, Gib
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
gib wrote:Hi FortranFan,
Am I wrong to be surprised by the occurrence of CBRT times of .000? What do you see when you write more decimal places?
Cheers, Gib
Hi Gib,
The zero values are due to limitations with CPU_TIME intrinsic, a better way always is to work with tick rates for CPU timing studies using the SYSTEM_CLOCK intrinsic. See a modified example below, there is also one other change - I'd meant to print more values for when abs(x) is small, say <0.1:
program p use, intrinsic :: iso_c_binding, only : wp => c_float use, intrinsic :: iso_fortran_env, only : dp => real64, I8 => int64 implicit none interface function cbrtf( x ) result(r) bind(C, name="cbrtf") use, intrinsic :: iso_c_binding, only : c_float implicit none real(c_float), intent(in), value :: x ! Function result real(c_float) :: r end function cbrtf function cbrtd( x ) result(r) bind(C, name="cbrt") use, intrinsic :: iso_c_binding, only : c_double implicit none real(c_double), intent(in), value :: x ! Function result real(c_double) :: r end function cbrtd end interface interface cbrt procedure cbrtf procedure cbrtd end interface integer i, j, k integer, parameter :: NVALS(*) = [( 100000*i, i=1,20 )] integer, parameter :: NTRIALS = 20 integer :: NumSmallX real(wp), allocatable :: x(:), y1(:), y2(:) real(dp) :: tim1(NTRIALS), tim2(NTRIALS) logical :: ltim(NTRIALS) real(dp) :: AveTim1(size(NVALS)), AveTim2(size(NVALS)) real(dp) :: START_TIME, STOP_TIME print *, "Cube Root Calculations: Computational Time" print "(g0,t10,g0,t26,g0,t51,g0)", "Run", "Num Calcs", "Exp with Newton Corr.", "CBRT" print "(t26,g0,t51,g0)", "CPU Time (seconds)", "CPU Time (seconds)" print "(t26,g0,t51,g0)", "per 1E6 Calcs", "per 1E6 Calcs" Loop_Nvals: do k = 1, size(NVALS) allocate( x(NVALS(k)), y1(NVALS(k)), y2(NVALS(k)) ) Loop_Ntrials: do j = 1, NTRIALS call loadvals( x, expnt=3 ) ! Cube root with '**' operator plus Newton correction blk_root3: block real(wp), parameter :: ONE = 1.0_wp real(wp), parameter :: TWO = 2.0_wp real(wp), parameter :: THREE = 3.0_wp real(wp), parameter :: THIRD = ONE/THREE real(wp), parameter :: TWOTHIRD = TWO/THREE real(wp), parameter :: EPS_X = epsilon(ONE)**THIRD call cpu_t(START_TIME) do i = 1, size(x) y1(i) = sign(abs(x(i))**THIRD, x(i)) !<-- Allow cube root of negative values ! Apply correction using a single step of Newton's method if ( abs(x(i)) > EPS_X ) then ! y1(i) = TWOTHIRD*y1(i) + x(i)/(THREE*y1(i)*y1(i)) end if end do call cpu_t(STOP_TIME) tim1(j) = STOP_TIME - START_TIME end block blk_root3 ! Cube root using C math library function CBRT call cpu_t(START_TIME) do i = 1, size(x) y2(i) = cbrt(x(i)) end do call cpu_t(STOP_TIME) tim2(j) = STOP_TIME - START_TIME end do Loop_Ntrials ltim = .true. ltim(minloc(tim1)) = .false. ; ltim(maxloc(tim1)) = .false. !<-- Throw out best and worst times AveTim1(k) = sum(tim1, mask=ltim)/real(NTRIALS-2, kind=kind(AveTim1))*1E6_dp/ & real(NVALS(k), kind=kind(AveTim1)) ltim = .true. ltim(minloc(tim2)) = .false. ; ltim(maxloc(tim2)) = .false. !<-- Throw out best and worst times AveTim2(k) = sum(tim2, mask=ltim)/real(NTRIALS-2, kind=kind(AveTim2))*1E6_dp/ & real(NVALS(k), kind=kind(AveTim2)) print "(g0,t10,g0,t25,1pf0.3,t50,1pf0.3)", k, NVALS(k), AveTim1(k), AveTim2(k) ! Print some results if ( k == size(NVALS) ) then print *, "Some Results from last run" print "(g0,t22,g0,t37,g0,t57,g0)", "i", "x", "y (Newton Corr.)", "y (CBRT)" NumSmallX = 0 do i = 1, size(x) if ( mod(i,size(x)/10) == 0 ) then print "(i0,t15,f15.7,t30,f15.7,t50,f15.7)", i, x(i), y1(i), y2(i) else if ( abs(x(i)) < 0.1_wp ) then if ( NumSmallX < 10 ) then print "(i0,t15,f15.7,t30,f15.7,t50,f15.7)", i, x(i), y1(i), y2(i) NumSmallX = NumSmallX + 1 end if end if end do end if deallocate( x, y1, y2 ) end do Loop_Nvals stop contains subroutine loadvals( x, expnt ) ! Load values into data array with an arbitrary approach ! Pseudo random number generation plus arbitrary scaling real(wp), intent(inout) :: x(:) integer, intent(in) :: expnt call random_number( x ) where ( x > 0.5_wp ) x = real( int(x*100.0_wp)**expnt, kind=kind(x)) else where ( x > 0.1_wp ) x = -real( int(x*100.0_wp)**expnt, kind=kind(x)) end where return end subroutine subroutine cpu_t( time ) !.. Argument list real(dp), intent(inout) :: time !.. Local variables integer(I8) :: tick integer(I8) :: rate call system_clock (tick, rate) time = real(tick, kind=kind(time) ) / real(rate, kind=kind(time) ) return end subroutine cpu_t end program
Here's an output of an execution with 18.0.3:
C:\Temp>p.exe Cube Root Calculations: Computational Time Run Num Calcs Exp with Newton Corr. CBRT CPU Time (seconds) CPU Time (seconds) per 1E6 Calcs per 1E6 Calcs 1 100000 .061 .072 2 200000 .081 .072 3 300000 .083 .061 4 400000 .081 .061 5 500000 .078 .064 6 600000 .077 .065 7 700000 .081 .065 8 800000 .079 .064 9 900000 .078 .065 10 1000000 .079 .062 11 1100000 .079 .063 12 1200000 .078 .064 13 1300000 .079 .065 14 1400000 .080 .063 15 1500000 .077 .066 16 1600000 .079 .064 17 1700000 .076 .064 18 1800000 .078 .065 19 1900000 .078 .064 20 2000000 .077 .064 Some Results from last run i x y (Newton Corr.) y (CBRT) 6 0.0280244 0.3037472 0.3037472 15 0.0987109 0.4621557 0.4621557 18 0.0132428 0.2365883 0.2365883 26 0.0402600 0.3427345 0.3427345 40 0.0630255 0.3979595 0.3979595 64 0.0813574 0.4333103 0.4333102 68 0.0097841 0.2138815 0.2138815 69 0.0658390 0.4037952 0.4037952 72 0.0442022 0.3535748 0.3535747 75 0.0902116 0.4484914 0.4484914 200000 571787.0000000 83.0000000 83.0000000 400000 -1000.0000000 -10.0000000 -10.0000000 600000 216000.0000000 60.0000000 60.0000000 800000 -42875.0000000 -35.0000000 -35.0000000 1000000 -17576.0000000 -26.0000000 -26.0000000 1200000 804357.0000000 93.0000000 93.0000000 1400000 -10648.0000000 -22.0000000 -22.0000000 1600000 571787.0000000 83.0000000 83.0000000 1800000 -54872.0000000 -38.0000000 -38.0000000 2000000 -85184.0000000 -44.0000000 -44.0000000 C:\Temp>
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
andrew_4619 wrote:Quote:
gib wrote:
I'm wondering why the OP is not using 8-byte reals.
You are not alone! Whilst there is some interesting discussion in this thread the fact is the compiler is giving good SP results IMO. If this precision is not enough then use DP, simples.
Using DP doesn't resolve the problem, it just pushes to less significant digits. I am not expecting results to 7 or even 6 significant digits. What I would like is that the internal values (which is why I reported hex values for numbers) don't change between compilers and are accurate. These values are propagated within the code at maximum precision and a 1 to 5 bit change can become noticeable after many many sequential calculations. I shouldn't have to use DP to mitigate this type of behavior, it is an inelegant coding technique reminiscent of two to three decades ago and increases the memory footprint and processor time significantly when used universally.
summary - compiler changes implemented less accurate resolution of floating point math which can propagate through long sequential calculations that can be observed in changes in the final resulted presented in decimal form.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Michael D. wrote:What I would like is that the internal values (which is why I reported hex values for numbers) don't change between compilers and are accurate.
In the famous words of Dread Pirate Roberts, "Learn to live with disappointment.". There are many things that can change computed values, including vectorization and even stack alignment. Also order of operations, which is not completely in your control, can yield small differences. See sc13.supercomputing.org/sites/default/files/WorkshopsArchive/pdfs/wp129s1.pdf for some exploration of this.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Michael D. wrote:
Using DP doesn't resolve the problem, it just pushes to less significant digits. I am not expecting results to 7 or even 6 significant digits..........
Using DP clearly DOES resolve the problem because it would appear to give results consistent to the level of accuracy you state you require.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Steve Lionel (Ret.) wrote:.. In the famous words of Dread Pirate Roberts, "Learn to live with disappointment.". There are many things that can change computed values, including vectorization and even stack alignment. Also order of operations, which is not completely in your control, can yield small differences. See sc13.supercomputing.org/sites/default/files/WorkshopsArchive/pdfs/wp129s1.pdf for some exploration of this.
@Michael D.,
You may be familiar with this article from Intel staff: https://software.intel.com/en-us/articles/consistency-of-floating-point-results-using-the-intel-compiler/
You are making a case for yourself as stated in the article, "it is .. useful to have a degree of reproducibility that goes beyond the inherent accuracy of a computation. Some software .. require .. even bit-for-bit, agreement between results before and after software changes, even though the mathematical uncertainty in the result of the computation may be considerably larger. The right compiler options can deliver consistent, closely reproducible results ..."
Are you following all the recommendations in this article? If yes, are you still noticing differences? If so, see Quote #7 where Steve Lionel advises you to submit a support request with Intel (https://supporttickets.intel.com/?lang=en-US). Have you submitted such a request?
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Just out of interest, I tried two other compilers; Silverfrost FTN95 (Win32 and x64) and gFortran x64. To report the value, I used:
write (*,fmt='(1x,a,f0.7)') 'the result is ',ans_b
This was because "write (*,*) ans_b" only reported 4 sig figures (11.4471) for FTN95, but 7 sig figures for gFortran (11.4471436)
In all 3 cases with f0.7, I got " the result is 11.4471436"
This is a different approach to rounding up for 32-bit real !
I suspect the only difference between Ver 18 and Ver 19 is in rounding in "write (*,*)"
Given the difference reported in the OP post is due only to different rounding in write (*,*), how does this affect how the program runs ?
To attribute a change only to the 23rd bit, as causing significant problems for the calculation would imply that 32-bit reals are a poor choice.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
John Campbell wrote:.. how does this affect how the program runs ?
To attribute a change only to the 23rd bit, as causing significant problems for the calculation would imply that 32-bit reals are a poor choice.
That's not the issue, the problem is one of consistency and reproducibility and what Intel Fortran compiler team intends to offer its users vis-a-vis the programmer expectations and needs.
OP has stated certain needs in this thread that some Intel staff has separately acknowledged, as in:
Programmers of floating-point applications typically have the following objectives: .. • Reproducibility o Produce consistent results: .. - From one compiler to another .. Some software quality assurance tests may require close, or even bit-for-bit, agreement between results before and after software changes, even though the mathematical uncertainty in the result of the computation may be considerably larger. The right compiler options can deliver consistent, closely reproducible results while preserving good (though not optimal) performance. .. Bottom Line .. In the version 17 compiler or later, best reproducibility may be obtained with the single switch /fp:consistent (Windows) .. which sets all of the above options.
Readers will see this with the instruction mentioned in the original post:
C:\Temp>type p.f90 character(len=*), parameter :: fmtg = "(*(g0))" character(len=*), parameter :: fmth = "(g0,z0)" character(len=*), parameter :: fmtb = "(g0,b0)" blk1: block integer, parameter :: WP = selected_real_kind( p=6 ) real(WP), parameter :: THIRD = 1.0_wp/3.0_wp real(WP) :: x, y print fmtg, "blk1: x, y with REAL kind=", kind(x), " and precision=", prec ision(x) x = 1500.0_wp y = x**THIRD print fmtg, "x = ", x print fmtg, "y = x**THIRD = ", y print fmth, "y (HEX) = ", y print fmtb, "y (BINARY) = ", y end block blk1 print * blk2: block integer, parameter :: WP = selected_real_kind( p=12 ) real(WP), parameter :: THIRD = 1.0_wp/3.0_wp real(WP) :: x, y print fmtg, "blk2: x, y with REAL kind=", kind(x), " and precision=", prec ision(x) x = 1500.0_wp y = x**THIRD print fmtg, "x = ", x print fmtg, "y = x**THIRD = ", y print fmth, "y (HEX) = ", y print fmtb, "y (BINARY) = ", y end block blk2 end C:\Temp>ifort /fp:consistent p.f90 Intel(R) Visual Fortran Intel(R) 64 Compiler for applications running on Intel(R ) 64, Version 18.0.3.210 Build 20180410 Copyright (C) 1985-2018 Intel Corporation. All rights reserved. Microsoft (R) Incremental Linker Version 14.14.26433.0 Copyright (C) Microsoft Corporation. All rights reserved. -out:p.exe -subsystem:console p.obj C:\Temp>p.exe blk1: x, y with REAL kind=4 and precision=6 x = 1500.000 y = x**THIRD = 11.44714 y (HEX) = 4137277F y (BINARY) = 1000001001101110010011101111111 blk2: x, y with REAL kind=8 and precision=15 x = 1500.000000000000 y = x**THIRD = 11.44714242553332 y (HEX) = 4026E4EFDA1CA3B2 y (BINARY) = 100000000100110111001001110111111011010000111001010001110110010 C:\Temp>
whereas with Intel Fortran 19.0.1:
C:\Temp>ifort /fp:consistent p.f90 Intel(R) Visual Fortran Intel(R) 64 Compiler for applications running on Intel(R ) 64, Version 19.0.0.117 Build 20180804 Copyright (C) 1985-2018 Intel Corporation. All rights reserved. Microsoft (R) Incremental Linker Version 14.14.26433.0 Copyright (C) Microsoft Corporation. All rights reserved. -out:p.exe -subsystem:console p.obj C:\Temp>p.exe blk1: x, y with REAL kind=4 and precision=6 x = 1500.000 y = x**THIRD = 11.44714 y (HEX) = 41372780 y (BINARY) = 1000001001101110010011110000000 blk2: x, y with REAL kind=8 and precision=15 x = 1500.000000000000 y = x**THIRD = 11.44714242553332 y (HEX) = 4026E4EFDA1CA3B1 y (BINARY) = 100000000100110111001001110111111011010000111001010001110110001 C:\Temp>
OP finds the above shown inconsistency in block blk1 (y=z'41372780' vs y=z'4137277F') between the two Intel Fortran versions unacceptable: the matter of interest to me is what support Intel Fortran team will provide to customers in such a situation!
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
FF, Great presentation. The only thing I can add is:
The one bit difference, accumulated in same magnitude numbers (one example is to add to self), after 1 million iterations will differ by 20 bits from the lsb. IOW the results of a simulation could dieviate significantly. This in no way indicates as which is more correct, rather the results differ. Different results can mean either minor change (e.g. single bit difference) in operation .OR. something wrong with compilation (bug). When results of an operation differ, then it becomes problematic to determine if the difference is due to a bug or not. When the application consists of 100's or 1000's of source files, this places a tremendous burden on the certification of the new compiler.
Jim Dempsey
- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page
- « Previous
-
- 1
- 2
- Next »