Different results in debug and release - Page 2

Martin_B_2 · ‎06-24-2015

Running my program in IVF release mode gives different results from IVF debug mode, CVF debug mode and CVF release mode (those three all agree with each other). After a long search I found that the compiler option /check:all makes the difference.

As I have read in several other topics, just about the only reason for different results is a coding error. Although my program is really long, I would like to try to get these errors out. But compiling and linking it gives me no warnings and no error messages at all. Is there an option to get more feedback?

Is there a way to tell which results are the right ones?

andrew_4619 · ‎06-25-2015

Errm, all the results in #3 lie between -1 and 1; none are 'greater than 1'. The first and last have E-02 and E-015 at the end, which may have been overlooked!

Martin_B_2 · ‎06-26-2015

Why does the debug mode accept "call getlog()" and "call System()", but the release mode doesn't?

TimP · ‎06-26-2015

Martin B. wrote:

Why does the debug mode accept "call getlog()" and "call System()", but the release mode doesn't?

These functions need explicit interfaces. CALL SYSTEM isn't supported, but it may sometimes appear to work without USE IFPORT.
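A minimal sketch of what that could look like, assuming an Intel Fortran compiler that provides the IFPORT module. In IFPORT, GETLOG is a subroutine and SYSTEM is a function, so SYSTEM is invoked through its result rather than with CALL. The program name, the variable names and the echo command are illustrative only.

```fortran
program ifport_demo
    use ifport      ! Intel portability module: supplies the explicit interfaces
    implicit none
    character(len=64) :: user   ! hypothetical buffer for the login name
    integer :: status           ! value returned by SYSTEM

    call getlog(user)               ! GETLOG is a subroutine in IFPORT
    status = system('echo hello')   ! SYSTEM is a function, not a subroutine
    write(*,*) 'user = ', trim(user), ', status = ', status
end program ifport_demo
```

With the interfaces visible, a mismatched call such as CALL SYSTEM(...) should be diagnosed at compile time instead of appearing to work in one configuration only.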

Martin_B_2 · ‎06-26-2015

I have just checked the two results. Sorry, but I have to say that the debug results really do seem to be better. The value of my variable zw has to be greater than 1 in the last loop, as calculated in debug mode, but it is not in release mode.

mecej4 · ‎06-26-2015

For what it is worth, I point out that the last values of ZW-1 shown in #3 are as follows (ε_m is machine epsilon):

1. Release mode: ZW-1= -2.653433028854124E-014 = -119.5 ε_m;

2. Debug mode: ZW-1= 2.220446049250313E-015 = 10 ε_m.

To see this, run the following program:

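    program pv
        implicit none
        double precision x, zwm1R, zwm1D
        x = epsilon(0d0)                  ! machine epsilon for double precision
        zwm1R = -2.653433028854124D-014   ! last ZW-1 value, release mode
        zwm1D =  2.220446049250313D-015   ! last ZW-1 value, debug mode
        write(*,*) zwm1R/x, zwm1D/x       ! both values in units of epsilon
    end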

Thus, Arjen and Luigi were correct in their conjectures that the discrepancies are related to machine epsilon. As Luigi hinted, it is not at all unusual for the results of a calculation to be "off" by 1, 10 or even 100 times ε_m. Floating point arithmetic is only an approximation to real arithmetic. Sometimes, the calculations do not obey the usual rules such as a + b + c = c + b + a. You should note this and recognize that these issues are quite distinct from the effects of compiler optimization and are very unlikely to be caused by undefined variables.
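The remark about a + b + c not always equalling c + b + a can be tried out with a small sketch like the one below. The values of a, b and c are chosen here purely for illustration: b and c are each smaller than half of machine epsilon, so adding them to 1.0 one at a time can lose them, while adding them to each other first may not. Whether the two sums actually differ depends on the compiler and its floating-point options (some optimizers are allowed to ignore the parentheses), so no particular output is claimed.

```fortran
program assoc_demo
    implicit none
    double precision :: a, b, c, left, right

    a = 1.0d0
    b = 1.0d-16      ! illustrative: smaller than epsilon(1d0)/2
    c = 1.0d-16

    left  = (a + b) + c   ! small terms are added to 1.0 one at a time
    right = a + (b + c)   ! small terms are combined first

    write(*,*) 'sums equal?        ', left == right
    write(*,*) 'difference / eps = ', (right - left) / epsilon(1.0d0)
end program assoc_demo
```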

jimdempseyatthecove · ‎06-26-2015

The difference in the numerical result (last one) may not be significant depending on what ZW-1 represents. In looking at the last value, the sign differs in addition to the magnitude.

If your code is correct, and it may very well be, the difference you observe could possibly be explained by a convergence routine exiting after a number of iterations that differs by an odd count between one run and the other, .AND. ZW-1 being what you use to represent an error estimate. Even if ZW-1 is not an error estimate, I suspect the difference is due to a convergence loop issue. Typically these things happen when convergence threshold values are too small, that is, smaller than the number of error bits in the mantissa allows.

Note that although the error of any single arithmetic operation could be as much as 1/2 the least significant bit, the error from one operation can (will) carry into the next. This means iterative processes, like convergence routines, must account for the potential for the bits in error to grow at each step. A rough estimate is LOG2(numberOfIterations * PotentialNumberOfBitsInErrorPerIteration). If, for example, an iteration could have an estimated error of 1/2 lsb and it takes 512 iterations, you could have the least significant 8 bits in error.
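As a rough sketch of how that estimate might be turned into a floor for a relative convergence tolerance: the program below evaluates LOG2(iterations * error per iteration) for the 512-iteration case and scales machine epsilon by the resulting number of bits. The names n_iter, err_per_iter, bits and tol are made up for this illustration, and the 1/2 lsb per iteration figure is the assumption from the text above, not a measured value.

```fortran
program error_bits_estimate
    implicit none
    integer, parameter :: dp = kind(1.0d0)
    real(dp), parameter :: err_per_iter = 0.5_dp   ! assumed: 1/2 lsb per iteration
    integer :: n_iter
    real(dp) :: bits, tol

    n_iter = 512
    ! LOG2(n_iter * err_per_iter), written with natural logarithms
    bits = log(real(n_iter, dp) * err_per_iter) / log(2.0_dp)
    ! A relative tolerance tighter than this is likely to be chasing noise
    tol = 2.0_dp**bits * epsilon(1.0_dp)

    write(*,*) 'estimated bits in error :', bits
    write(*,*) 'relative tolerance floor:', tol
end program error_bits_estimate
```

A convergence test would then compare the relative change between iterates against a threshold no smaller than tol.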

Martin B. wrote:

As I wrote, my program is very big, and I think this one difference is only the tip of the problem.

At this point the program is in a loop and it calculates the variable zw. This variable is then passed to the function dasin, so it must not be greater than 1.

In release mode it has these values:

ZW-1= -1.787382367634882E-002
ZW-1= -0.828064706180674
ZW-1= -0.828064706162865
ZW-1= -0.999999999999615
ZW-1= -1.00000000000000
ZW-1= -1.00000000000000
ZW-1= -0.999999999994462
ZW-1= -0.820298874933507
ZW-1= -0.820298874878552
ZW-1= -0.637936726897861
ZW-1= -0.676805781054647
ZW-1= -2.653433028854124E-014

So everything is just fine. But in CVF and in the debug mode of IVF, two values are different (the first and the last one), and the last one is even greater than 1:

ZW-1= -1.787382367637202E-002
ZW-1= -0.828064706180674
ZW-1= -0.828064706162865
ZW-1= -0.999999999999615
ZW-1= -1.00000000000000
ZW-1= -1.00000000000000
ZW-1= -0.999999999994462
ZW-1= -0.820298874933507
ZW-1= -0.820298874878552
ZW-1= -0.637936726897860
ZW-1= -0.676805781054647
ZW-1= 2.220446049250313E-015
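Since zw is passed to dasin and the debug-mode value overshoots 1 by only a few multiples of machine epsilon, one common guard is to clamp the argument to [-1, 1] before the call. The sketch below is only an illustration under that assumption, not the poster's code: zw_safe is a hypothetical name, and the generic asin stands in for dasin.

```fortran
program clamp_asin
    implicit none
    double precision :: zw, zw_safe

    zw = 1.0d0 + 2.220446049250313d-15      ! debug-mode value quoted above
    zw_safe = max(-1.0d0, min(1.0d0, zw))   ! clamp into the domain of asin

    write(*,*) 'zw - 1         = ', zw - 1.0d0
    write(*,*) 'asin(zw_safe)  = ', asin(zw_safe)
end program clamp_asin
```

Clamping is only appropriate when the overshoot is known to be rounding noise; a value well outside [-1, 1] would still point to a real problem upstream.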

andrew_4619 · ‎06-27-2015

As has been intimated by others, the difference in the release and debug values quoted is 5/16th of 3/8th of diddly squat, judging by the deltas in the table above after a cut and paste into Excel. If those differences are significant, then the methods used in your code need to be looked at quite hard!

mecej4 · ‎06-27-2015

app4619 wrote:

... the difference in the release and debug values quoted is 5/16th of 3/8th of diddly squat

I thought the UK did not use medieval units anymore, except for the use of stones to quote the weights of people; perhaps there are other exceptions? I remember squinting years ago at a 6 inch stainless steel workshop pocket ruler, marked "Made in England", graduated in 1/128-ths of an inch!

jimdempseyatthecove · ‎06-28-2015

Here is an example:

    subroutine DoWork(argInOut, argIn)
    implicit none
        real(8), intent(inout) :: argInOut
        real(8), intent(in) :: argIn
        argInOut = argInOut * argIn !do work
    end subroutine DoWork
        
    program ErrorInHalfLsb
    implicit none
    real(8), parameter :: factor = 1.0_8 / 10.0_8   ! has error in lsb
    real(8) :: var
    integer :: I
    var = 1.0D+128
    do I=1,128
        call DoWork(var, factor)
        if(var .lt. 0.0_8) print *,'Not going to happen'
    end do
    write(*,'(G24.18,"  ",Z20)') var, var
    end program ErrorInHalfLsb

output

 1.00000000000000688          3FF000000000001F

And the estimated error is log2(128 * 0.5) = 6

The observed error is 5 bits (the 1F at the tail end), or you could optionally say the observed error is 5.5 bits because there may be just under 1/2 bit additional error past the 1F.

Jim Dempsey

andrew_4619 · ‎06-29-2015

LOL The only 'imperial' unit that I use is miles when driving as that is rather enforced by the speed dial in the car and the road signs. I often smile when working with clients in the US as they quite often want to work in 'English units' which seems quite ironic to me. A more accurate description might be "English units immediately prior to 1824".

mecej4 · ‎06-29-2015

App4619: Years ago, I gave a student at a US university a reference to a paper in JFM (Journal of Fluid Mechanics) and casually mentioned, "that's a British journal", and the student protested, quite seriously, "But I can only read English -- I never learned any British in school!"