Intel® Fortran Compiler

Same code, different results on different Windows computers

Brian_Murphy
New Contributor II

I have an Intel Fortran executable whose only dependencies are two Windows files: kernel32.DLL and imagehlp.DLL.

If I run this on different computers with exactly the same input, should I get exactly the same output?

I'm getting different results from Windows 7 and Windows 10 computers.  Should they match?

mecej4
Honored Contributor III

Not only can the outputs on different computers be different, but even on a single computer successive runs may give different results, as you can see by compiling and running the code in this recent thread.

The chances of such inconsistencies are diminished when your code conforms to the rules of the language and the calculation is deterministic.

JohnNichols
Valued Contributor III

Or you can look at it this way: the world lies in a mostly Gaussian universe, and sometimes you get the right answer.

JohnNichols
Valued Contributor III

Your comments got me thinking. 

Many years ago, Scientific American ran a great column on recreational mathematics. One of its articles covered genetic algorithms. The author (it was not Gardner) presented a sample algorithm in BASIC and a simple result.

The problem I had in the late 1980s, when I was playing with this code, was that the BASIC interpreter was a better performer on the Flib problem than Microsoft Fortran.

Anyway, getting repeatable numbers from a random number generator is a real issue for Fortran.  It was a pain at the time.

Thanks for the memory; this is a great go-to-sleep-at-bedtime problem.  You might like the ads.

Brian_Murphy
New Contributor II

Thank you very much for the reply (don't we love working on Sunday).  After some time in the debugger, I have identified a place early in the run where results begin to differ, and I suspect there are more places after it.  The code uses real*4 variables in an iterative calculation.  The Visual Studio debugger displays 7 or 8 digits for the real*4 data.  For the first several iterations the relevant variables match on the two computers, but then small differences begin to emerge in the last few digits.  It appears that the code running on the two computers is doing slightly different things in the digits beyond the advertised precision of real*4.  I should mention that optimization is off (/Od).

I see this as an issue to be understood for what it is, rather than a problem that needs solving.  I also suspect that single precision arithmetic is more prone to such subtleties than double precision.  Modifying this old code to use double precision variables is tempting, but the differences don't warrant that.

mecej4
Honored Contributor III

What were the CPU make/model numbers of the two computers?

andrew_4619
Honored Contributor II

I long ago stopped doing any calculations in real(4) and always use only real(8).  Hardware is so much faster, and we have so much more memory, that for *most* purposes it has no discernible penalty, and it kicks the issue of accuracy a lot further down the road.  This gives me more time to worry about more profitable problems.  These days all my code is parameterised with respect to real kind, so it is easy enough to flip kinds if you need to; see the sketch below.
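For instance, a minimal sketch of that sort of parameterisation (the module and constant names here are just illustrative, not from my actual code):

module kinds
   implicit none
   integer, parameter :: sp = selected_real_kind(6, 37)    ! single precision
   integer, parameter :: dp = selected_real_kind(15, 307)  ! double precision
   integer, parameter :: wp = dp   ! working precision; edit this one line to flip kinds
end module kinds

program demo
   use kinds, only: wp
   implicit none
   real(wp) :: x
   x = 1.0_wp / 3.0_wp   ! literals carry the kind parameter too
   print *, x
end program demo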

jimdempseyatthecove
Honored Contributor III

mecej4's question about the CPU makes/models may be pertinent.

Also, are you compiling the same source on the different machines .OR. running the same binary?

Do the compile-time options instruct the compiler to generate runtime selection of the instruction set architecture?

Is the floating point rounding mode the same on both systems?

https://software.intel.com/content/www/us/en/develop/documentation/fortran-compiler-oneapi-dev-guide-and-reference/top/language-reference/a-to-z-reference/h-to-i/ieee-set-rounding-mode.html
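As a quick check, here is a minimal sketch using the standard ieee_arithmetic module to print the current rounding mode, for comparison between the two systems:

program show_rounding
   use, intrinsic :: ieee_arithmetic
   implicit none
   type(ieee_round_type) :: mode
   call ieee_get_rounding_mode(mode)   ! query the current mode
   if (mode == ieee_nearest) then
      print *, 'rounding mode: nearest'
   else if (mode == ieee_to_zero) then
      print *, 'rounding mode: to zero'
   else if (mode == ieee_up) then
      print *, 'rounding mode: up'
   else if (mode == ieee_down) then
      print *, 'rounding mode: down'
   end if
end program show_rounding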

Also, look for uninitialized data (or data initialized to other than expected).

And, if multi-threading is involved, e.g. an MKL-threaded library, a particular function may not have deterministic rounding behavior (though there may be an option to request consistency on a given host). For example, a threaded function supporting such a consistency option may give consistent results using n threads, yet different (though equally consistent) results using m threads (n /= m).

Jim Dempsey

Brian_Murphy
New Contributor II

HP ProBook 650 G5 with Intel(R) Core(TM) i7-8565U CPU @ 1.80GHz, 1992 MHz, 4 cores, 8 logical processors.

The host OS is Windows 10.  I do Windows 7 testing in VMware virtual machines on the same physical hardware.

Using the same binary, I get different results from Windows 10 and 7.

Rounding mode is something I have never dealt with, so it is whatever the default is.  MKL is not used.

I have run it with the check for uninitialized variables turned on (/check:uninit), and none turned up.  I intentionally planted one to check that it would be caught, and it was.

Some years ago I tried to make a double precision version, and was unsuccessful.  I don't recall whether it did not build at all, or built but would not run.

The differences are not huge, but they are enough to get my attention and make me wonder why, so I've spent a lot of time on it.  I think I'm close, but I don't think I'm going to get all the way there.  I suspect hidden digits beyond the precision of real*4 are the source of the differences.  This is a complicated and finicky code, and it doesn't take much to cause differences.

Compiling the same source with IVF 13.0 and 19.1 yields binaries that produce different results even when run on the same OS, but that does not surprise me.

Even though it is single precision, I build x86 and x64 versions of the code because it is a DLL called from Excel.  The x86 and x64 versions also produce slightly different results.

Anyhow, I've got the answer to my original question: results from the same binary on different computers can be different.

jimdempseyatthecove
Honored Contributor III

VMware can hide some of the CPU features.  I cannot say whether that is involved in your problem.

You can call the C/C++ intrinsic _may_i_use_cpu_feature bit by bit to build a bitmask of the allowed features, then print out the hex value.

I do not know why there isn't an __int64 _get_cpu_features() to return the bitmask of allowed features.

Jim Dempsey

jimdempseyatthecove
Honored Contributor III

 

extern "C" __int64 _get_cpu_features()
{
  __int64 allowed = 0;
  for(__int64 bit=1; bit; bit <<= 1)
    if(_may_i_use_cpu_feature(bit)) allowed |= bit;
  return allowed;
}

 

Jim Dempsey

Steve_Lionel
Honored Contributor III

I'll point you at a presentation I made on the topic of numerical reproducibility. I will also note that the math library does CPU-type dispatching, and if it chooses a different path on the two PCs you could get slightly different results. There is an option /Qimf-arch-consistency:true that should eliminate that difference.
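For example (the source file name is just a placeholder):

ifort /Qimf-arch-consistency:true mycode.f90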

Brian_Murphy
New Contributor II

Good news!  I tried again at double precision, and this time I got the code to build and run, and the results are quite satisfactory.  The shenanigans between different OS versions, as well as different compiler versions, have all gone away.  In addition, Debug without optimizations and Release with /O2 now also produce results that match very nicely.  To make the change from single to double precision, I used the compiler option for the default real kind, i.e. /real-size:64.  That is what I tried ten years ago without success, but since then I have converted loads of COMMON blocks to modules, which probably helped.
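For reference, it is just the one option added to the build line (the file name and other options here are placeholders):

ifort /real-size:64 /O2 mycode.f90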

Anyhow, at this point, I'm a very happy camper.

jimdempseyatthecove
Honored Contributor III

>> I used the compiler option for default real kind, i.e. /real-size:64

Caution: that option makes the variables double precision, but each literal (constant, DATA initialization) is generated at the kind its form identifies (double only if written as ..._8 or ...D0) and is then converted up on storage. This may leave initializations less precise than you expect.
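A minimal sketch of the sort of surprise this can cause (illustrative values, with explicit kinds so no compiler option is needed):

program literal_kind
   implicit none
   real(4) :: s
   real(8) :: d
   s = 0.1              ! literal stored at single precision
   d = s                ! widened on assignment; trailing bits differ from 0.1d0
   print *, d - 0.1d0   ! prints a small nonzero difference
end program literal_kind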

Jim Dempsey
