Solved: Different result between Microsoft PowerStation and Intel Fortran Compiler

siyuliben · ‎02-24-2021

Hi,

I am compiling an old model written in Fortran 77 using the oneAPI HPC Toolkit (Intel Fortran Compiler).

The source code was written more than 10 years ago and was compiled with Microsoft PowerStation. The exe file built with Microsoft PowerStation runs the example without any problem. However, the exe file built with the Intel Fortran Compiler produces 'NaN' results in the output, and the iterations reported in the output differ from those in the output of the Microsoft PowerStation build.

Does anyone know the differences between Microsoft PowerStation and the Intel Fortran Compiler? Can anyone help me with this problem? I want to compile the source code using the Intel Fortran Compiler.

Thank you very much.

I have attached the source code, the exe files (PS: Microsoft PowerStation, IFC: Intel Fortran Compiler), and the example. To run the model, copy an exe file into the example folder and run it there.

I downloaded the source code and example from the following website:

https://www.pc-progress.com/en/Default.aspx?downloads

Under: SWMS_3D - code for simulating water flow and solute transport in three-dimensional variably saturated media. Download program or manual (2.7 Mb).

Steve_Lionel · ‎02-24-2021

Can you describe the differences between a wind-up Timex watch and an Apple Watch?

Microsoft Fortran PowerStation went off the market in 1997 - a full twenty-four years ago. It was succeeded by Digital Visual Fortran, Compaq Visual Fortran and now Intel Visual Fortran. The differences are so many they cannot be elaborated.

There are also so many reasons for numerical differences that these cannot be elaborated. Any time you change compilers, or processors, or compiler versions, there is the potential for floating point differences. In particular, Microsoft Fortran PowerStation was from the Pentium/486 era and it used "x87" floating point instructions. Intel Fortran requires a Pentium IV at a minimum, and uses the SSE FP instructions that can have subtly different results due to more regular treatment of precision.

I built and ran the program, but didn't spot any NaN results and am not sure what I would be looking for. It is a large and complex program that I do not understand. I did note that even with the setting to stop on floating point errors, I saw none.

mecej4 · ‎02-24-2021

It seems to me that the SWMS_3D program has bugs of such a nature that it is not even reasonable to compare the outputs from two runs with the same inputs, let alone the results obtained with two different computers. Consider the DO 20 loop in INPUT3.FOR. Inside that loop, there are several statements similar to:

if(abs(i-j).gt.MB) MB=abs(i-j)

However, the variable MB was never initialized before entering the loop, so the behavior of the program and the output are suspicious at best, and could be nonsense after a certain stage. Did the authors rely on MB being initialized to zero? Would that initialization lead to correct behavior? Who knows?

In Fortran PowerStation, all variables were given the SAVE attribute by default. Most modern Fortran compilers do not do so, and you have to explicitly specify the SAVE attribute as needed, or use a compiler option to the same effect.
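
To make both points concrete, here is a minimal sketch. It is not taken from SWMS_3D: the array KX, the routine COUNTR and the sample data are made up for illustration. It sets MB to zero before the loop instead of relying on whatever the compiler leaves in memory, and it gives a local counter the SAVE attribute so that its value survives between calls. The layout is chosen so that the file should compile as either fixed-form or free-form source.

```fortran
      program initmb
      implicit none
      integer :: kx(2,3), mb, n, i, j, k
! Hypothetical element connectivity: node pairs (1,2), (2,5), (4,5)
      data kx / 1, 2, 2, 5, 4, 5 /
! Explicit initialization; do not rely on /Qzero or on luck
      mb = 0
      do n = 1, 3
         i = kx(1,n)
         j = kx(2,n)
         if (abs(i-j) .gt. mb) mb = abs(i-j)
      end do
      print *, 'MB = ', mb
      do n = 1, 3
         call countr(k)
      end do
      print *, 'calls = ', k
      end program initmb

      subroutine countr(k)
      implicit none
      integer, intent(out) :: k
! SAVE keeps NCALL alive between calls; DATA sets it once at start
      integer :: ncall
      save ncall
      data ncall / 0 /
      ncall = ncall + 1
      k = ncall
      end subroutine countr
```

With this sample data the largest node-number difference is 3, and the counter reports 3 calls. Code written this way does not depend on options such as /Qzero or /Qsave.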

siyuliben · ‎02-24-2021

Hi Mecej4,

Thank you very much for your answer.

So you think that relying on the default values of variables might be the reason why the results are different.

Could you please show me how to use the compiler option to initialize those variables, once I find them? I think MB is one of them.

mecej4 · ‎02-24-2021

I do not know what the correct results are supposed to be. Before using such old programs, you need to take the trouble to verify if the results that FPS4 gave are repeatable and correct. Only after that can you ask how to use options such as /Qzero so as to make Ifort give you similar results. As Steve pointed out, even then the results can differ slightly because the FPUs are different.

If you have access to the authors of SWMS_3D, you can ask them what options they used with FPS4. I do not know enough about the application area to tell you what to do -- I can however warn you that there are some things wrong with the program.

P.S.(added 25 Feb 2021) :

By using the compiler options /Qzero /Qsave, I obtained results with the oneAPI compilers (Ifort and Ifx, on Windows 10) that closely matched the results given by the FPS4-compiled EXE. CAUTION: agreement does not imply correctness!

As has been pointed out in the past, the FPS-4 compiler had several bugs. Here is one more, which I stumbled on while using that ancient compiler on the SWMS_3D code.

Compile and run this short program using FPS-4:

      program tst
      write(*,100) 1.0,9,7,-0.056,-0.053,0.1,0.8,-149.9
  100 format(f14.4,i3,i6,1x,3es11.3,3f7.0)
      end

Each run can give you different outputs and even error messages. I am posting this code to show that the results produced by FPS4 should not be assumed to be correct!

siyuliben · ‎02-25-2021

Thank you very much for the answer. It is really helpful.

siyuliben · ‎02-25-2021

Hi Mecej4,

Can you please tell me how to use /Qzero and /Qsave to build the executable file?

Is what I type below in the command line correct?

ifort /Qzero /Qsave -o exefilename.exe *.FOR

The executable built this way still gives me NaN output. Since you said you obtained results with oneAPI that are similar to those of the FPS4-compiled EXE, I think my command must be different from yours.

Thank you.

mecej4 · ‎02-25-2021

I don't see anything amiss, but how do you run the EXE that is produced by running the command that you showed? What is the current directory when you run that EXE?

siyuliben · ‎02-25-2021

Please see the files attached.

I created the EXE file in the source code folder and moved it to the EXAMPLE folder.

'EXAMPLE_oneAPI_Qzero_Qsave' is the folder containing the EXE file built by oneAPI with /Qzero /Qsave.

'EXAMPLE_oneAPI_no_Qzero' is the folder containing the EXE file built by oneAPI without /Qzero /Qsave.

'EXAMPLE_FPS4' is the folder containing the EXE file built by FPS4.

I double-clicked each EXE file and compared the 'H.OUT' files in the 'SWMS_3D.OUT' folders.

The input folders called 'SWMS_3D.IN' are the same.

Thank you.

mecej4 · ‎02-25-2021

When I compared the output from FPS4 and Ifort, I was using only Example1 from the Zip file that I downloaded from the PC-Progress site. My comparisons and statements regarding results obtained with /Qsave /Qzero were for that example only.

I'll need some more time to compare results for the "EXAMPLE_oneAPI_Qzero_Qsave" test case.

siyuliben · ‎02-26-2021

Thank you very much.

NaNs appear in the output of Example 2 and Example 4.

mecej4 · ‎02-26-2021

Siyuliben,

I found several other errors in the sources, most of them involving the usage of undefined variables, variables not saved when required, etc., and fixed them. After testing the program with several compilers, and different FPU options, I concluded that the program will not produce satisfactory results if 32-bit reals are used, except if the x87 FPU is used, and the compiled code keeps results in x87 registers as long as possible. The NaNs that we saw are caused by rapid overflow, underflow and divisions by zero when solving simultaneous linear equations thousands of times using the insufficient 32-bit floating point arithmetic.

To overcome this problem, I converted all real variables to double precision. The attached Zip file contains the modified source files. I have done some spot checks of the results after compiling with Intel Fortran and Gfortran. Please try the code on as many test problems as you wish and let me know what you find. If things come out OK, we could inform the maintainers of the program.
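
The overflow-to-NaN mechanism can be reproduced in a few lines. The sketch below is not taken from SWMS_3D; the variable names and the value 1.0e30 are made up. It squares a large number in 32-bit and in 64-bit reals. In 32-bit SSE arithmetic the product overflows to Infinity, and subtracting Infinity from Infinity then gives NaN, while the 64-bit version stays finite. It assumes IEEE arithmetic with the default non-trapping behavior, that is, no option that stops the program on floating-point exceptions.

```fortran
      program ovrflw
      implicit none
      real :: a4, b4, c4
      double precision :: a8, b8, c8
      a4 = 1.0e30
      a8 = 1.0d30
! 1e60 exceeds the largest 32-bit real (about 3.4e38): overflow
      b4 = a4 * a4
! Infinity minus Infinity is NaN
      c4 = b4 - b4
! The same steps fit comfortably in 64-bit reals
      b8 = a8 * a8
      c8 = b8 - b8
      print *, 'single: ', b4, c4
      print *, 'double: ', b8, c8
      end program ovrflw
```

If the compiler generates x87 code and keeps intermediate values in registers, the single-precision lines may behave differently, which is consistent with the observation above that x87 builds do not show the problem.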

siyuliben · ‎02-26-2021

Hi Mecej4,

I really appreciate your help in resolving the problem.

I will try to run the program several times with different examples. I might need some time to test the code. I will reply here if things come out OK. Thank you very much.

mecej4 · ‎03-04-2021

By chance, I tried the compiler option /arch:IA32 on the original code, and found that the resulting 32-bit EXE produces results that match the FPS4 results quite closely.

This compiler option is not available when developing 64-bit EXEs, or using the new IFX compiler, and this option may be removed in a future release of the compiler. Therefore, it is useful only for the present, as an easy way to obtain FPS4-like floating-point results without making the more extensive code changes to double precision (as in my modified sources, which I attached to my earlier post in this thread).

siyuliben · ‎03-04-2021

Thank you very much.

I tried your modified source code using several examples, and all of them have FPS4-like floating-point results.

I also used /arch:IA32, and it worked as well.

I really appreciate your help.

I just have a new question.

Do you know any program or website that can convert Fortran 77 to Fortran 90? I need it for other source code.

I used https://fortran.uk/plusfortonline.php before, but it has not been working recently.

mecej4 · ‎03-04-2021

I have further narrowed down the part of the code where changing a few local scalar variables from single to double precision reals is sufficient to fix the problem.

In Subroutine RESET, source file WATFLOW3.FOR, add the following declaration:

      double precision det,ve,QN,caxx,cayy,cazz,caxy,caxz,cayz,
     + Amul,Bmul,Fmul,VolR,conE,BetaE,SinkE,hNewE

The SWMS_3D code turned out to be a noteworthy example of a widely distributed engineering analysis code in Fortran 77 that fails when the compiler generates SSE2 instructions rather than X87 instructions. Furthermore, finding a fix for the problem was a non-trivial task.
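
A minimal sketch of why promoting a few local scalars can be enough follows. This is not the RESET code; the names and numbers are made up. A determinant-like expression that subtracts two nearly equal products loses most of its significant digits when evaluated in 32-bit arithmetic. Note that declaring only the result as double precision does not help if every operand is a 32-bit real, because Fortran evaluates the expression in the precision of its operands and converts afterwards. At least one operand of each product has to be double precision, which is what happens once the scalars feeding later expressions are themselves promoted.

```fortran
      program promot
      implicit none
      real :: a, b, c, d, det4
      double precision :: det8a, det8b
      a = 1.0001
      b = 1.0002
      c = 1.0
      d = 1.0001
! All-single evaluation: the products are rounded to 32 bits
      det4 = a*d - b*c
! Result is double, but the right-hand side is still single
      det8a = a*d - b*c
! Operands promoted first: products and difference in 64 bits
      det8b = dble(a)*dble(d) - dble(b)*dble(c)
      print *, 'single result          : ', det4
      print *, 'double result only     : ', det8a
      print *, 'double operands as well: ', det8b
      end program promot
```

With SSE arithmetic the first two results are expected to agree with each other and to differ from the third. With x87 code that keeps intermediates in registers, the first two may come out closer to the third, which would explain why the original code appeared to work with FPS4.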

---

I do not know of any Fortran 77 to Fortran 90 converter that is self-contained, but there are tools (for example, Metcalf's convert.f90) to do parts of the conversion, after which one has to complete the conversion manually.

mecej4 · ‎03-05-2021

So far, the discussion in this thread has concerned the failure of the USGS SWMS_3D software when applied to certain problems for which single-precision SSE2 computations turn out to be inadequate. The telltale symptom was the printout of NaNs in place of reasonable numbers (such as the 1.499 for Example.2 below), which is bound to catch the attention of the user, as attested to by this very thread.

There is an older sibling of SWMS_3D, called SWMS_2D, which also models transport of water and solute in porous media. Having found a cure for the ailments of SWMS_3D, I attempted to perform the same service to SWMS_2D, and my findings are more startling. With the 2-D model simulations, there were no NaNs to serve as warnings. Instead, the results were simply off by as much as 17 percent. Instead of the correct 1.499 for Example.2, one compiler gave 1.241, and another gave 1.566, neither of which may strike the user as incorrect.

Again, there is a simple (but time-consuming to find) fix: in file WATFLOW2.F, in Subroutine RESET, promote a few variables to double precision:

      double precision ae,qn,amul,bmul,fmul,xmul,cone,Bi,Ci

Intel Fortran still provides support for hunting down and fixing such problems, with /arch:IA32, but this option is probably not going to survive after a couple of years. My concern is that this difficulty with inadequate precision of intermediate results with SSE2 vs x87 is going to create headaches for future users of older simulation codes when that option is no longer available.

Comments, please!

Steve_Lionel · ‎03-05-2021

@siyuliben : "Do you know any program or website that can convert Fortran 77 to Fortran 90? I need it for other source code."

No conversion is needed - with few exceptions, valid FORTRAN 77 code is valid Fortran 90 code.

Were you instead asking about converting fixed-form source to free-form?

siyuliben · ‎02-24-2021

Hi Steve,

Thank you very much for your answer. I really appreciate it.

I understand this is a very complex program and hard to understand. I built the source code with the Intel Fortran Compiler; the executable is called 'SWMS_3D_IFC'. After running this program, the output file 'H.OUT' in the folder 'SWMS_3D.out' contains NaN after 'Time *** 151.0000 ***' (this 'H.OUT' with NaN is attached in the folder 'SWMS_3D.out').

If I use PowerStation to build the program ('SWMS_3D_PS') and run it in the 'EXAMPLE' folder, 'H.OUT' does not contain NaN.

I have also attached a screenshot of the error that appears when I run the program.

I understand this is a very complex program. I just hope I can get some idea about why this error occurs, so maybe I can fix the source code.

Thank you very much.