Different results that shouldn't differ

tmcole · 10-31-2003

I am debugging an updated version of a code by comparing it against an older version that should be "correct", but I'm doing this on two different machines. One is a P4 and the other is an Athlon, but I have compiled both with the PIII compiler and enabled the floating-point consistency options.

I have tracked the first difference in results between the two codes to the following:

DECAY(K,I) = EXP(DFC*DEPTHB(K,I))

The values of DECAY, DFC, and DEPTHB (all double precision) in hex are as follows:

P4 machine
DECAY = #3F5821D2C9AA147F
DFC = #C026E1068820B712
DEPTHB = #3FE23D70A3D70A3E

Athlon machine
DECAY = #3F5821D2C9AA1481
DFC = #C026E1068820B712
DEPTHB = #3FE23D70A3D70A3E

Any thoughts as to why DECAY is not the same between the two platforms? My guess is because of the different processors, but I don't know how this could happen since the code is compiled for a PIII CPU on both machines.

In the same vein, does anyone else think that it would be a worthwhile addition to be able to, at any time in the execution, dump out all the variables in a code and then have a postprocessor compare two different results and point to all values that are different between the two results?

This capability would quite literally save me days during code development when upgrading existing code or rewriting existing code to be smaller and more efficient. I would think that a large number of other developers of number-crunching codes would find this beneficial as well.

Steven_L_Intel1 · 10-31-2003

Interesting.

The two results differ by two in the least significant bits. I have no ready explanation as to why they might differ unless there is some LSB difference in one of the processors' implementation of some arithmetic function (such as SQRT, LOG, etc.)
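One way to check the size of the gap is to treat the two posted bit patterns as 64-bit integers and subtract them. The sketch below is only an illustration: the program and variable names are made up, and it uses INT with a hexadecimal (BOZ) argument, which is a Fortran 2003 feature and may need adjusting for older compilers.

```fortran
program ulp_gap
  implicit none
  integer, parameter :: i8 = selected_int_kind(18)   ! 64-bit integer kind
  integer, parameter :: dp = kind(1.0d0)              ! double precision kind
  integer(i8) :: bits_p4, bits_athlon
  real(dp)    :: decay_p4, decay_athlon

  ! Bit patterns of DECAY as posted. Both are positive, so they fit
  ! in a signed 64-bit integer without touching the sign bit.
  bits_p4     = int(z'3F5821D2C9AA147F', i8)
  bits_athlon = int(z'3F5821D2C9AA1481', i8)

  ! Reinterpret the same bits as double-precision reals.
  decay_p4     = transfer(bits_p4, decay_p4)
  decay_athlon = transfer(bits_athlon, decay_athlon)

  print '(a, es24.16)', ' DECAY (P4)     = ', decay_p4
  print '(a, es24.16)', ' DECAY (Athlon) = ', decay_athlon

  ! For two positive doubles, the integer difference of the bit patterns
  ! is the distance in units in the last place (ULP).
  print '(a, i0)',      ' gap in ULP     = ', bits_athlon - bits_p4
  print '(a, es10.3)',  ' relative gap   = ', (decay_athlon - decay_p4)/decay_p4
end program ulp_gap
```

The integer subtraction works here only because both values are positive and finite; comparing values of opposite sign this way would need extra care.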

Are you using the SAME executable on the two systems? Is it linked statically or against DLLs?

The "dump" feature you want is easily available to you - it's called NAMELIST. The disadvantage is that you get no control over the format used to display the values, and it might not show a difference between your two DECAY values, but it's quite easy to implement. Note that putting variables in a NAMELIST can affect optimization, as the variables have to be written back to memory before each NAMELIST WRITE.
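A minimal sketch of a NAMELIST dump follows. The group name `state`, the file name `state_dump.nml`, the unit number, and the numeric values are all assumptions chosen for illustration; they are not taken from the code under discussion.

```fortran
program namelist_dump
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  real(dp) :: dfc, depthb, decay

  ! Every variable to be dumped has to be listed in the group by name.
  namelist /state/ dfc, depthb, decay

  ! Illustrative values only, roughly the magnitudes quoted in the thread.
  dfc    = -11.4395d0
  depthb = 0.57d0
  decay  = exp(dfc*depthb)

  ! One WRITE emits all members of the group as name = value pairs.
  open (unit=10, file='state_dump.nml', status='replace', action='write')
  write (10, nml=state)
  close (10)
end program namelist_dump
```

Two such files can then be compared with any text diff tool, keeping in mind that the decimal output may hide differences in the last bits.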

Steve

tmcole · 10-31-2003

Steve,

It gets even more interesting. I have the drive on my P4 machine mapped onto my Athlon, so I opened the P4 machine's workspace on my Athlon machine and ran it there. I get the exact same results (or rather differences) on the Athlon, so my initial surmise that the difference was caused by the processors was incorrect. Somehow the compiler is generating the differences.

The codes and therefore the executables are not the same at all, but both are statically linked. The one on the Athlon is a new version that I am checking against the older version on the P4 machine.

This behavior is disconcerting at best and a heck of a problem at worst. I need to be able to fix the new code so that it gives the exact same answers as the old code, but this quirk is making it impossible.

As far as the dump request, I am very familiar with the NAMELIST feature as I have worked on a number of DEC machines over the years, but this is not really much of a solution as I would have to type in all the variable names which number in the thousands, plus, as you noted, I cannot output them in hexadecimal. So, the feature that I would love to have is not really "easily available" in CVF. I guess I'll just have to reconcile myself to it potentially taking a long time to debug the program when I make changes.
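For the hexadecimal part of the request, one possible workaround is a small helper that writes the raw bit pattern of each array element. This is only a sketch: `dump_hex` is a hypothetical routine, the array contents are illustrative, and it still has to be called once per array, so it does not remove the need to name the variables.

```fortran
program hex_dump_demo
  implicit none
  integer, parameter :: i8 = selected_int_kind(18)   ! 64-bit integer kind
  integer, parameter :: dp = kind(1.0d0)              ! double precision kind
  real(dp) :: decay(3)

  ! Illustrative data only.
  decay = (/ exp(-1.0d0), exp(-2.0d0), exp(-3.0d0) /)
  call dump_hex('DECAY', decay)

contains

  ! Hypothetical helper: print each element of a double-precision array
  ! as 16 hexadecimal digits, in the "#..." style used in this thread.
  subroutine dump_hex(name, values)
    character(len=*), intent(in) :: name
    real(dp), intent(in)         :: values(:)
    integer :: i
    do i = 1, size(values)
      ! TRANSFER copies the bits into a 64-bit integer so that the
      ! Z edit descriptor can be applied to an integer argument.
      write (*, '(a, "(", i0, ") = #", z16.16)') name, i, transfer(values(i), 0_i8)
    end do
  end subroutine dump_hex

end program hex_dump_demo
```

Because the output is plain text with one value per line, the dumps from two versions of a code can be compared with an ordinary diff tool.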

Steven_L_Intel1 · 10-31-2003

The codes are not the same? Then why do you expect the results to be the same?

Any code difference can change the compiler's decisions about optimizing, and even if you specify /fltconsistency, you will still get some use of double-precision registers. If the compiler decides to do operations in a different order, that can cause LSB differences in results.
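The effect of operation order can be seen with a generic example that is unrelated to the code in question. The sketch below adds the same three numbers in two different orders and prints the bit patterns; whether the two results actually differ depends on the compiler, its options, and the hardware.

```fortran
program order_demo
  implicit none
  integer, parameter :: i8 = selected_int_kind(18)   ! 64-bit integer kind
  integer, parameter :: dp = kind(1.0d0)              ! double precision kind
  real(dp) :: a, b, c, left, right

  a = 0.1d0
  b = 0.2d0
  c = 0.3d0

  ! Mathematically identical sums, evaluated in two different orders.
  left  = (a + b) + c
  right = a + (b + c)

  ! Print the raw bit patterns so that any last-bit difference is visible.
  print '(a, z16.16)', ' (a + b) + c = #', transfer(left,  0_i8)
  print '(a, z16.16)', ' a + (b + c) = #', transfer(right, 0_i8)
  print '(a, l1)',     ' bitwise identical: ', &
        transfer(left, 0_i8) == transfer(right, 0_i8)
end program order_demo
```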

Unless you're running the same code, you have to expect tiny differences such as this. It's the reality of fixed-precision floating point.

Steve

tmcole · 10-31-2003

Steve,

When I say the codes are different, I mean that I have either added additional functionality, reduced the code size by using cleaner logic, or other code changes that should not have an impact on the solution of the equations of continuity, x-momentum, water surface elevations, velocities, and the equation of state. The solution of these equations should not change regardless of what other changes I make to the code, unless I have introduced a bug, which is the whole point of what I am attempting to do.

What is happening here is that in one of the codes I am getting different answers from the EXP function. I would think this would not be a function of the optimization level. Regardless, I am compiling with the debug options so optimizations should be turned off. If you look at the example I provided, the order of floating point operations is not of concern. What is of concern is that the result of an intrinsic function is different between the two codes, even though in this case this section of the code is exactly the same between the two codes. As in my original post, DFC and DEPTHB have the exact same value, but DECAY, the result of an exponentiation of DFC*DEPTHB, differs in the last two bytes between the two codes. Again, I am at a loss as to how this could occur.

tmcole · 10-31-2003

A followup. I rewrote both codes as:

AAA = DFC*DEPTHB(K,I)
DECAY(K,I) = EXP(AAA)

New code

AAA = #C01A15025381748B
DECAY = #3F5821D2C9AA1481

Old code

AAA = #C01A15025381748B
DECAY = #3F5821D2C9AA147F

The results are exactly the same as before I introduced the intermediate variable AAA. The only thing I can think of is that this is a "stored in register versus stored in memory" issue, but I really don't know.

Alternatively, are there any suggestions as to how to force the same answer?

Steven_L_Intel1 · 11-01-2003

Any change you make to the source file can potentially change the instructions, even if you didn't make changes in a particular code sequence. You would have to step through the assembly code in each version and see what was different.

I doubt the EXP is actually returning different values. And even without optimization, the registers are used.

Steve

tmcole · 11-03-2003

I must be missing something fundamental here. In the previous example I stored the value to be exponentiated in the AAA variable in both versions of the code, and then passed AAA to the EXP function. The result of this exponentiation differs in the last two bytes, although the debugger shows that the value passed is exactly the same. This would strongly indicate that this is a "register versus memory" issue, but I'm at a loss since AAA is the same, at least when I evaluate it in the debugger.

Does anyone know how to force a value to be stored in memory as opposed to in a register in order to see if this is indeed the problem I am encountering? This is a very important issue for me to resolve in order to ensure that I have not introduced bugs in the code as it is continuously undergoing changes. Thanks.

Steven_L_Intel1 · 11-03-2003

The values differ in the last two bits, not bytes.

You can try naming the variable in a VOLATILE statement. That should force it to memory at every occasion. Note that some use of registers is unavoidable - that's the way it works in IA-32.
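A minimal sketch of the VOLATILE approach, using the intermediate variable AAA from the earlier post. The numeric values are illustrative, and the hexadecimal printing through TRANSFER is an addition for inspection, not part of the original code.

```fortran
program volatile_demo
  implicit none
  integer, parameter :: i8 = selected_int_kind(18)   ! 64-bit integer kind
  integer, parameter :: dp = kind(1.0d0)              ! double precision kind
  real(dp) :: dfc, depthb, aaa, decay

  ! Ask the compiler to store AAA to memory and reload it on every use,
  ! rather than keeping it in a register between statements.
  volatile aaa

  ! Illustrative values only, roughly the magnitudes quoted in the thread.
  dfc    = -11.4395d0
  depthb = 0.57d0

  aaa   = dfc*depthb     ! product is rounded to double when stored
  decay = exp(aaa)       ! EXP receives the value reloaded from memory

  print '(a, z16.16)', ' AAA   = #', transfer(aaa,   0_i8)
  print '(a, z16.16)', ' DECAY = #', transfer(decay, 0_i8)
end program volatile_demo
```

This only controls how AAA itself is handled; what happens inside EXP and in other temporaries is still up to the compiler and the run-time library.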

As I suggested earlier, stepping through the relevant code at the assembly level in the debugger would be instructive - you can see the contents of memory and registers and see what's different between the two systems.

Steve

isn-removed200637 · 11-03-2003

Why not take LOG(DECAY) for both values and see what values you get?
regards
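A sketch of that check, using the bit patterns posted earlier in the thread. The program name and layout are made up for illustration, and INT with a hexadecimal (BOZ) argument is a Fortran 2003 feature. Since the change in LOG(x) is roughly the relative change in x, the program also prints the relative gap between the two DECAY values next to the spacing of doubles near AAA.

```fortran
program log_check
  implicit none
  integer, parameter :: i8 = selected_int_kind(18)   ! 64-bit integer kind
  integer, parameter :: dp = kind(1.0d0)              ! double precision kind
  real(dp) :: aaa, decay_old, decay_new

  ! AAA was posted as #C01A15025381748B, a negative number. The sign bit
  ! is cleared in the literal (C0... becomes 40...) so that it fits in a
  ! signed 64-bit integer, and the sign is restored by the negation.
  aaa       = -transfer(int(z'401A15025381748B', i8), aaa)
  decay_old =  transfer(int(z'3F5821D2C9AA147F', i8), decay_old)
  decay_new =  transfer(int(z'3F5821D2C9AA1481', i8), decay_new)

  print '(a, es24.16)', ' AAA             = ', aaa
  print '(a, es24.16)', ' LOG(DECAY), old = ', log(decay_old)
  print '(a, es24.16)', ' LOG(DECAY), new = ', log(decay_new)

  ! Compare the size of the DECAY gap with the resolution available near AAA.
  print '(a, es10.3)',  ' relative gap in DECAY = ', (decay_new - decay_old)/decay_old
  print '(a, es10.3)',  ' spacing near AAA      = ', spacing(aaa)
end program log_check
```

By rough arithmetic, the relative gap between the two DECAY values is about 3e-16, while doubles near 6.5 are spaced about 9e-16 apart, so the two logarithms may well round to the same or adjacent values. The check may therefore not be able to say which DECAY is the more accurate one.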