Solved: Strange compiler bug?

netphilou31 · ‎02-25-2022

Dear all,

I found a strange compiler bug(?) with the following code:

subroutine ASTMD86_Corrected_to_ASTMD86(ASTMD86Corrected, N, ASTMD86)
!dec$ attributes stdcall, alias:'ASTMD86_Corrected_to_ASTMD86', dllexport :: ASTMD86_Corrected_to_ASTMD86
!dec$ attributes reference :: ASTMD86Corrected, N, ASTMD86
    implicit none

    integer(4), intent(in)  :: N
    real(8),    intent(in)  :: ASTMD86Corrected(N)
    real(8),    intent(out) :: ASTMD86(N)

    integer(4) i, IT

    integer(4) :: ITMAX = 50

    real(8) X, T, F, DFDX

    real(8) :: EPS = 1.D-5
    real(8) :: A = -1.587D0*log(10.D0)
    real(8) :: B = 0.00473D0*log(10.D0)

    ASTMD86(:) = ASTMD86Corrected(:)

    do i=1,N
       T = (ASTMD86Corrected(i)-273.15D0)*9.D0/5.D0 + 32.D0
       if (T > 475.D0) then
           X  = T
           F  = X - T + exp(A+B*X)
           IT = 1
           do while (abs(F) > EPS .and. IT <= ITMAX)
               DFDX = 1.D0 + B*exp(A+B*X)
               X  = X - F/DFDX
               F  = X - T + exp(A+B*X)
               IT = IT + 1
           end do
           ASTMD86(i) = (X-32.D0)*5.D0/9.D0 + 273.15D0
       end if
    end do

end subroutine ASTMD86_Corrected_to_ASTMD86

It appears that the variable A is assigned to -0.0000000D+00, whereas the variable B is correctly evaluated.

I seems that the problem is at least present from the first version of OneAPI. I don't know if it was present in previous versions (XE2019 or XE2020), but it worked in versions prior to OneAPI.

My compiler switches are:

/nologo /O2 /fpp /extend-source:132 /Qsave /assume:byterecl /Qzero /fpe:1 /fp:source /fpconstant /Qfp-speculation=strict /iface:cvf /module:"Win32\Release\\" /object:"Win32\Release\\" /Fd"Win32\Release\vc160.pdb" /libs:static /threads /Qmkl:sequential /c

Fortunately, solving the issue was quite easy, I simply replaced the declaration / initialization statements by manual initialization of the variables A and B before the loop, but that's really weird! I will have to check if no other code is impacted with this problem.

Best regards,

Ron_Green · ‎02-25-2022

Our engineer is triaging this one. It's certainly curious. Interestingly, she found this work around:

real(dp) :: a = (-1.587d0) *log(10.d0)

It affects negative constants, which was pretty obvious from 'b's initialization

View solution in original post

andrew_4619 · ‎02-25-2022

That is quite worrying! How did you verify that, in the debugger? Because I have found the debugger has various bugs in displaying parameter values but the programs are still correct. Have you tried cutting the reproducer down , is there any compiler option that is the "trigger"?

netphilou31 · ‎02-25-2022

Hi,

I found the problem since the calculated array was showing a strange shape for temperatures above 475°F, i.e. when the correction is applied. I did the calculation in an excel worksheet, setting up the same iterative procedure and I got the correct answers. Then I went to the debugger and discovered that the variable A was wrong!

I did not check which compiler switch is responsible of the problem (if any) since I use the same for years (at least in versions prior and after OneAPI was released).

mecej4 · ‎02-25-2022

The bug is present in the current OneAPI ifort and ifx compilers, 32 and 64 bit. The only compiler option that we need to trigger the bug is /fpconstant . The first line of the output shows A as exactly zero, instead of -3.654E+00 .

Here is a reproducer.

program tst
   implicit none
   integer, parameter :: dp = kind(0d0)
!
   real(dp) tin(10) , tout(10)
   integer n , i
   n = 10
   tin = (/(520.0d0+i*5d0,i=1,n)/)
   call astmd86_corrected_to_astmd86(tin,n,tout)
   print '(1x,i2,2x,2f8.2)' , (i,tin(i),tout(i),i=1,n)

contains

   subroutine astmd86_corrected_to_astmd86       &
&     (astmd86corrected, n, astmd86)
   implicit none

   integer , intent(in)  ::  n
   real(dp) , intent(in)  ::  astmd86corrected(n)
   real(dp) , intent(out)  ::  astmd86(n)

   integer i , it

   integer  ::  itmax = 50

   real(dp) x , t , f , dfdx

   real(dp)  ::  eps = 1.d-5
   real(dp)  ::  a = -1.587d0  *log(10.d0)
   real(dp)  ::  b =  0.00473d0*log(10.d0)

   astmd86(:) = astmd86corrected(:)
   print '(1x,a,2es10.3)' , 'a, b = ' , a , b
   do i = 1 , n
      t = (astmd86corrected(i)-273.15d0)*9.d0/5.d0 + 32.d0
      if ( t > 475.d0 ) then
         x = t
         f = x - t + exp(a+b*x)
         it = 1
         do while ( abs(f) > eps .and. it <= itmax )
            dfdx = 1.d0 + b*exp(a+b*x)
            x = x - f/dfdx
            f = x - t + exp(a+b*x)
            it = it + 1
         enddo
         astmd86(i) = (x-32.d0)*5.d0/9.d0 + 273.15d0
      endif
   enddo

   end subroutine astmd86_corrected_to_astmd86
end program tst

JohnNichols · ‎02-25-2022

You have all of the variables set to double precision, I wonder at the internal mistake that makes /fpconstant that makes the analysis perform in double precision be incorrect.

netphilou31 · ‎02-25-2022

Hi Mece,

Thanks for taking the time to reproduce the bug and many thanks for the information.

I used the /fpconstant switch for long to make sure (at least this was I thought) single precision constants are correctly assigned to double precision variables, and I hoped to get more reliability and accuracy. I have seen in the help that it is not complying to fortran 2003 standard. Do you think I should remove it?

Best regards,

mecej4 · ‎02-25-2022

Do you need the /fpconstant option at all? As JohnNichols observed, the test program did not even need that option, but ended up paying a price.

In general, I avoid adding a bunch of compiler options just to get a non-standard Fortran code to work. These options may interact, introduce their own bugs (as we just saw). Their use hinders the use of other compilers that do not have similar options.

Each such option amounts to a band-aid-fix for one issue, and after the project sports a dozen or so such fixes, when things go wrong, what can we do? How many combinations should we drop to isolate the one that causes the problem?

mecej4 · ‎02-27-2022

What follows is an observation unrelated to your bug report, but it may be of interest.

Are you aware that you could, within reasonable intervals of input temperature, be able to replace the Newton-Raphson iterations in your program with explicit solutions? The IF...END IF block in the program may be replaced by:

       if (T > 475.) then
           u = B*exp(A+B*T)
           v = u*((17*u+114)*u+60)/((101*u+174)*u+60)
           X  = T-v/B
           ASTMD86(i) = (X-32.)*5./9. + 273.15
       end if

The rational function for v in terms of u is a Padé approximation to the solution of v exp(v) = u; this approximation may be sufficiently accurate unless T is over 650 F.

netphilou31 · ‎02-27-2022

Hi mecej4,

Thanks for the trick. That could be of interest for other procedures (and I will try to keep this in mind) but in mi case it's likely that temperatures end above 650°C very often, so I prefer to keep the NR method which converges very quickly in a few iterations. Furthermore, this procedure is only used before or after more complex calculations and is not part of a more complex algorithm.

Best regards,

mecej4 · ‎02-25-2022

This boils down to an error in the calculations done by the compiler (not the client program). Without /fpconstant, the relevant line in the assembler file (which can be produced using the option /Fa) is:

2il0floatpacket.4	DD	08aeac850H,0c00d3bceH

whereas, with /fpconstant, the two dword constants are both zero.

That constant, 0c00d3bce8aeac850H = -3.565... is the value of the constant A = -1.587D0*log(10.D0) in the program.

JohnNichols · ‎02-25-2022

I do not use switches unless the compiler complains about something, most often it is the heap size or the stack size. And then I rely on this forum as a check. It happens once every few years.

I almost always use debug mode as luckily I only share with a small group and I am constantly amazed when @mecej4 trots out all of these switches and I shake my head and ask myself I wonder what they do.

ASTM 68 is a fuel code doing some math calcs, I am not going to pay 100 for it or thereabouts, so what was the problem equation do in the code, I understand the math?

I swear Steve speaks a special version of Klingon only a few people understand.

Steve_Lionel · ‎02-25-2022

I don't see anything in this code that /fpconstant should affect, so it's very puzzling that it does. What it is supposed to do is keep constants that don't have a kind specified (D0 as a suffix is a kind in this case) in the highest supported precision before use. For example:

double precision pi = 3.1415926535897

and have all the digits preserved. Many pre-Fortran 90 compilers (including VAX FORTRAN) would do that, but F90 specified that the constant I show is "default real" (or single precision) and you'd lose digits. The example here puts D0 on all of the real constants, so /fpconstant should have nothing to do.

Ron_Green · ‎02-25-2022

bug ID is CMPLRLLVM-35577

Ron_Green · ‎02-25-2022

it's really quite simple to reproduce:

program tst
   implicit none
   integer, parameter :: dp = kind(0d0)
   real(dp)  ::  a = -1.587d0  *log(10.d0)
   real(dp)  ::  b =  0.00473d0*log(10.d0)
   print '(1x,a,2es10.3)' , 'a should be -3.654E+00  a is:  ' , a
   print '(1x,a,2es10.3)' , 'b should be 1.089E-02   b is:  ' , b 
end program tst

and yes, for now, do an assignment for 'a' instead of an initialization expression.

real(dp) :: a
...
a = -1.587d0 *log(10.d0)

Ron_Green · ‎02-25-2022

Our engineer is triaging this one. It's certainly curious. Interestingly, she found this work around:

real(dp) :: a = (-1.587d0) *log(10.d0)

It affects negative constants, which was pretty obvious from 'b's initialization

netphilou31 · ‎02-28-2022

Hi,

Just to come back to your workaround, it seems that the use of an intrinsic function as nothing to do with the bug, it's more likely the use of an expression involving a negative constant.

Indeed, I tried:

    real(8), parameter :: LN_10 = log(10.D0)
    real(8) :: A = -1.587D0*LN_10
    real(8) :: B = 0.00473D0*LN_10

and it doesn't work either.

However,

    real(8), parameter :: LN_10 = log(10.D0)
    real(8) :: A = (-1.587D0)*LN_10
    real(8) :: B = 0.00473D0*LN_10

worked and even without /assume:protect_parens

Best regards,

FortranFan · ‎02-25-2022

@Ron_Green ,

Any insight or feedback you can provide as to why the problem does not occur with named constants?

   integer, parameter :: dp = kind(0d0)
   real(dp), parameter :: a = -1.587_dp  *log(10.0_dp)
   real(dp), parameter :: b =  0.00473_dp*log(10.0_dp)
   print '(1x,a,2es10.3)' , 'a should be -3.654E+00  a is:  ' , a
   print '(1x,a,2es10.3)' , 'b should be 1.089E-02   b is:  ' , b 
end

C:\temp>ifort /standard-semantics /fpconstant p.f90
Intel(R) Fortran Intel(R) 64 Compiler Classic for applications running on Intel(R) 64, Version 2021.5.0 Build 20211109_000000
Copyright (C) 1985-2021 Intel Corporation.  All rights reserved.

Microsoft (R) Incremental Linker Version 14.27.29112.0
Copyright (C) Microsoft Corporation.  All rights reserved.

-out:p.exe
-subsystem:console
p.obj

C:\temp>p.exe
 a should be -3.654E+00  a is:  -3.654E+00
 b should be 1.089E-02   b is:   1.089E-02

Ron_Green · ‎02-27-2022

named constants are parsed differently and use a different code path.

FortranFan · ‎02-25-2022

@netphilou31 , please note per the Fortran language standard your use of initialization in the declaration statements also imparts the SAVE attribute implicitly to the variables 'A' and 'B'. This can have adverse effect on thread safety and possibly hinder greatly the ability of such code to be executed in any parallel mode.

You will know well the PARAMETER attribute in Fortran as parts of named constants facility in the language. Consider employing it as much as possible in your code.

jimdempseyatthecove · ‎02-25-2022

>>please note per the Fortran language standard your use of initialization in the declaration statements also imparts the SAVE attribute implicitly to the variables 'A' and 'B'

and as such, the initialization only occurs once at compile time, and not every time you enter the procedure.

IOW should A or B get modified, the next entry to the procedure will observe the last modified value(s).

Jim Dempsey

netphilou31 · ‎02-26-2022

Hi,

to @FortranFan and @jimdempseyatthecove , I know that initializing variables this way turn them as SAVED. However, they are not supposed to change at run time, so I guess that's not a problem even for thread safety (am I wrong?). I also agree that they could be declared as named constants but sometimes I prefer to use this way since it's easier to manipulate them in debug mode (for example to test other values without the need to recompile every time).

The workaround (adding parenthesis around the negative value) is interesting in case there is no other possibilities; I will keep this in mind.

To come back to @Steve_Lionel comment, this was exactly why I used the /fpconstant switch, because some code contains a lot of arrays initialized this way (coming sometimes from students or university partners that don't always take care or know about these things but that's not their job afterall).

I guess I will have to browse our source code to see if such problems are present elsewhere.

Last question in case you already have the answer (if no I could check that using @Ron_Green test program easily): does the problem exist because of the use of the intrinsic function log() (or any other) in the initialization or can it be also present for simple negative values (which might be a disaster in this case)?

Best regards,