Different results when running in debug and optimized (O3) mode

Chris_G_2 · 09-22-2014

I have some code that calculates the eigenvalues and eigenvectors of a symmetric positive-definite matrix, followed by some simple matrix algebra. The code is embedded in a much larger program. It uses 4-byte word lengths. I get different results when I use the release and debug versions of the program. The differences are very small when the input matrix is small (say 10x10) but become huge as the matrix gets bigger (up to 400x400). How common is this problem, and does anyone have advice on how to proceed? Will conversion to 8-byte reals help?

Jauch · 09-22-2014

Hello Chris,

Differences between "debug" and "optimized" code are common, due to the finite capacity of the processor to store "digits" when performing calculations, and the fact that optimization usually changes the order in which the calculations happen.

I think (not sure) that there are some compilation options that allow you to avoid those changes in the order of calculations (with performance penalties), and some others related specifically to floating-point operations.
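
A minimal sketch of why the order of calculations matters: floating-point addition is not associative, so the same three numbers added in a different grouping can give different results. The values below are chosen purely for illustration and are not taken from Chris's program. With Intel Fortran, the options alluded to above are most likely those in the -fp-model family (/fp: on Windows), which restrict value-changing reorderings.

```fortran
program Reorder
    implicit none
    ! volatile discourages the compiler from folding the sums at compile time
    real(4), volatile :: a, b, c
    real(4) :: left, right

    a =  1.0e8
    b = -1.0e8
    c =  1.0

    left  = (a + b) + c   ! a and b cancel exactly first, then c is added
    right = a + (b + c)   ! c is lost when added to the much larger b

    print *, '(a + b) + c = ', left
    print *, 'a + (b + c) = ', right
end program Reorder
```

Under strict IEEE single-precision evaluation the first line would be expected to print 1.0 and the second 0.0, because the spacing between adjacent real(4) values near 1.0e8 is 8. An optimizer that regroups a long sum (for example when vectorizing) changes the result in the same way, only less visibly.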

On the other hand, 4-byte real numbers (with 7 significant digits, I think) will lead to differences that grow over time, due to round-off and the same reason as above. So using 8-byte real numbers usually increases precision. But this depends on the kind of calculations you are doing and the values you are using.

Another possibility is related to parallelization. If there are errors in the parallel parts of the code that occur only when running in parallel (usually disabled in Debug mode), you can see big differences when comparing the two.

Bigger differences in bigger matrices than in smaller ones usually occur when you are performing calculations that combine multiple values instead of single values (like determinants, etc.).

Hope this helps you understand where the differences can come from.

Cheers,

Eduardo

Anthony_Richards · 09-22-2014

4 bytes is just not enough to even attempt to maintain accuracy in the inversion algorithm, IMO. Rounding errors will build up, especially with large matrices, so you will do best to take the computational hit involved in going from 4-byte lengths (single precision) to 8 bytes (double precision) as the price for maintaining accuracy. After all, it's no good being fast if the results are wrong.

You appear not to question the matrix inversion/eigenvalue algorithm you are using. Is the code you use from one of the tried and much tested standard software packages offered with IVF or is it your own?

Les_Neilson · 09-22-2014

If you do move to using double precision then remember to change any real literals (constants) also. You also need to check whether any constants are passed as actual arguments; the corresponding dummy arguments need to be the same precision. When you compile in debug, make sure you have all of the check options on. This will help identify problems in passing data around.
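
A minimal sketch of the literal problem, assuming no compiler option that promotes default reals is in effect. The kind parameter dp and the variable names are illustrative only. A literal written without a kind suffix is a default (single-precision) real, so it is rounded to single precision before being stored in the double-precision variable.

```fortran
program LiteralKinds
    implicit none
    integer, parameter :: dp = kind(1.0d0)
    real(dp) :: from_single, from_double

    from_single = 0.1      ! default-real literal: rounded to single, then widened
    from_double = 0.1_dp   ! double-precision literal

    write(*,'(A,F22.18)') 'from 0.1    : ', from_single
    write(*,'(A,F22.18)') 'from 0.1_dp : ', from_double
    write(*,'(A,ES12.4)') 'difference  : ', from_single - from_double
end program LiteralKinds
```

The two variables should differ by roughly 1.5e-9, which is far larger than double-precision round-off and enough to undo much of the benefit of converting the variables themselves.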

Les

Chris_G_2 · 09-22-2014

Thank you all for this. I guess I need to try real*8.

Could vectorization be an issue?

Anthony: I am using the Jacobi routine from 'Numerical Recipes'.

ChrisG

jimdempseyatthecove · 09-22-2014

Try this program:

program RoundOff
    implicit none
    integer :: i
    real(4) :: A4
    real(4) :: sum4
    real(8) :: A8
    real(8) :: sum8
    A4=0.1   ! infinite repeating binary fraction
    A8=0.1_8
    sum4 = 0.0
    sum8 = 0.0_8
    do i=1,1000000
        ! accumulate the same value in single and in double precision
        sum4 = sum4 + A4
        sum8 = sum8 + A8
        ! every 10000 steps, compare each running sum with the direct product
        if(mod(i,10000) == 0) then
            write(*,'(I10,1X,4F40.20)') i,sum4,i*A4, sum8, i*A8
        endif
    end do
    print *, 'done'
end program RoundOff

Jim Dempsey

Anthony_Richards · 09-23-2014

You say you get differences between debug and release versions. Are you saying that you know one is correct but the other diverges from the correct answer? Which one gives the 'correct' evaluation? Or is it possible both are incorrect but their different incorrect values increasingly diverge with array size? If only rounding is the problem, it should be the same in both debug and release versions, so if you get differences between the two, this may be due to your not initialising the arrays fully. In debug mode allocated arrays may be initialised to zero (or some other value) whereas in release mode the contents (other than the elements you explicitly assign values to) can be anything. You should always initialise allocated (or any other) arrays to zero if you are not going to assign a value yourself to every array element in the array.
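
A minimal sketch of the initialisation advice, with an illustrative array name and size that are not taken from Chris's program. Only the diagonal is assigned explicitly, so the whole-array assignment is what gives the remaining elements a defined value.

```fortran
program InitArrays
    implicit none
    integer, parameter :: n = 4
    real(8), allocatable :: a(:,:)
    integer :: i

    allocate(a(n,n))
    a = 0.0d0            ! without this line the off-diagonal contents are undefined

    do i = 1, n
        a(i,i) = 1.0d0   ! only the diagonal is assigned explicitly
    end do

    print *, 'sum of all elements = ', sum(a)
end program InitArrays
```

If the whole-array assignment is removed, the printed sum may still look right in one build and not in another, which is the symptom described above.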

P.S. You say the code is part of a much larger program. Are you talking about divergences involving just the eigenvalue-eigenvector code, or divergences in the results from the 'much larger program'?

Chris_G_2 · 09-25-2014

Interestingly, I have rewritten the code to work with real*8 instead of real*4 and I get the same answers in debug and release mode for matrices up to 400x400; then the results start to diverge. So I don't think there is an issue due to not initializing variables, but just rounding errors.

jimdempseyatthecove · 09-25-2014

While there may be little you can do about the accumulation of rounding errors with multiply and divide, there are some simple things you can do about the accumulation of rounding errors with add and subtract (at the expense of additional computation).

The technique is to add and subtract numbers of similar magnitude. For summation, you could create an array of 256 elements for real(4) or 2048 for real(8), one per possible binary exponent. Then, using integer functions on the element to be summed, extract only the exponent of the number and use it as the index into the array for performing the summation.

SummationArray(ExtractExponent(X)) = SummationArray(ExtractExponent(X)) + X

integer function ExtractExponent(X)
    real, intent(in) :: X
    ExtractExponent = EXPONENT(X) + EXPONENT(HUGE(X)) ! bias to produce positive indices
end function ExtractExponent
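
A self-contained sketch of the exponent-binned summation described above, under a few assumptions. The input data (terms 1/i) are purely illustrative. Instead of biasing the exponent, the bin array is declared with bounds that span every possible exponent, which Fortran allows. The input is assumed to contain no Inf or NaN values. The names bins, naive, binned and reference are hypothetical.

```fortran
program BinnedSum
    implicit none
    integer, parameter :: n = 1000000
    ! Bounds cover every binary exponent a finite default real can have,
    ! including subnormals; EXPONENT(0.0) is 0, which also falls inside.
    integer, parameter :: lo = minexponent(1.0) - digits(1.0)
    integer, parameter :: hi = maxexponent(1.0)
    real :: bins(lo:hi)
    real :: x, naive, binned
    real(kind(1.0d0)) :: reference
    integer :: i, e

    bins = 0.0
    naive = 0.0
    reference = 0.0d0

    do i = 1, n
        x = 1.0 / real(i)          ! illustrative data: terms of widely varying size
        naive = naive + x          ! plain running sum
        e = exponent(x)            ! assumes x is finite (no Inf or NaN)
        bins(e) = bins(e) + x      ! accumulate with values of similar magnitude
        reference = reference + real(x, kind(1.0d0))
    end do

    ! Combine the bins from the smallest magnitude upwards.
    binned = 0.0
    do e = lo, hi
        binned = binned + bins(e)
    end do

    print *, 'naive  single sum: ', naive
    print *, 'binned single sum: ', binned
    print *, 'double reference : ', reference
end program BinnedSum
```

Comparing the two single-precision sums with the double-precision reference shows how much of the error comes from adding small terms to a large running total. The binning does not remove rounding error, it only reduces it, at the cost of the extra array and the final pass.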

Jim Dempsey

Anthony_Richards · 09-25-2014

Please can you define what you mean by 'the same'? Do you mean identical to all visible decimal places, or the same up to the n'th decimal place, after which the answers diverge? You are in a difficult place, because with two answers to the same problem (debug, release), how do you decide which is closest to the correct answer?
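
One check that does not need a reference answer is the residual of each computed eigenpair: for an exact eigenvalue lambda and eigenvector v of A, the vector A*v - lambda*v is zero, so the build with the smaller residual is the closer one. The sketch below uses a small test matrix with known eigenpairs purely for illustration. In practice a, lambda and v (hypothetical names) would hold the input matrix and the output of the Jacobi routine.

```fortran
program ResidualCheck
    implicit none
    integer, parameter :: dp = kind(1.0d0)
    integer, parameter :: n = 2
    real(dp) :: a(n,n), v(n,n), lambda(n), r(n)
    integer :: j

    ! Small symmetric test matrix with known eigenpairs (illustrative only).
    a = reshape([2.0_dp, 1.0_dp, 1.0_dp, 2.0_dp], [n, n])
    lambda = [3.0_dp, 1.0_dp]
    v(:,1) = [1.0_dp,  1.0_dp] / sqrt(2.0_dp)
    v(:,2) = [1.0_dp, -1.0_dp] / sqrt(2.0_dp)

    do j = 1, n
        r = matmul(a, v(:,j)) - lambda(j) * v(:,j)   ! residual A*v - lambda*v
        print *, 'eigenpair', j, ' max residual = ', maxval(abs(r))
    end do
end program ResidualCheck
```

For this test matrix the residuals should be at or near double-precision round-off. Running the same check on the debug and release results for the 400x400 case would indicate whether either build has actually lost accuracy.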

Not being conversant with the possible differences that might occur in arithmetic used by debug and release versions, I leave it to others more knowledgeable to advise on that subject.

Chris_G_2 · 09-25-2014

The software I am writing and adapting produces 3-D plots. When I say the 'same', I mean that the plots are identical to within 4 decimal places. Given the nature of the underlying theory that I am using and the type and quality of the data I am working with, even 3 decimal places is satisfactory.

Jim Dempsey's point is interesting and I will investigate.