New version of ifort:
$ ifort -v
ifort version 18.0.3

Old version of ifort:
$ ifort -v
ifort version 14.0.3
I'm running a numerical model of tsunamis. On the same computer, with the same run configuration, the code runs correctly with the old compiler, but with the new compiler it aborts partway through the simulation:
T1= 17.44000 (min)
PROGRAM is terminated
instability occurs in 1st grid at (j1,k1)=( 1023 2 )
The ifort FFLAGS are:
FFLAGS = -O2 -mavx -ipo -pad -unroll -w -WB -shared-intel -mcmodel=large
Any ideas?
P.S.: I have compiled the code in several different ways, and it always becomes unstable at the same time.
Does the program run to completion with the expected results at optimization level -O0?
If so, that is indicative of a floating-point optimization in which some bits of precision are traded for speed. Then try -O2 again, but also include one or more floating-point options that favor precision over performance, such as -fp-model precise or -fp-model source.
Rodrigo, the error message is from the user code, not from the compiler. Therefore, your question needs to be directed to the persons who wrote the code, and it may be useful to look up any background write-ups on the algorithms used in the code.
It is even possible that slightly different results from one compiler version to the next are being misinterpreted as an indication of numerical instability. Perhaps a variable is being used before it has been initialized with a value? A lot depends on what your code is doing, and on what criterion it uses to decide that it ran into an "instability". To show why providing context is important, here is a test program that generates the same message as yours without doing any calculation at all:
program unstable
  implicit none
  print *,'T1= 17.44000 (min)'
  print *
  print *,'PROGRAM is terminated'
  print *,' instability occurs in 1st grid at (j1,k1)=( 1023 2 )'
end program
It's usual in such applications to require care with parentheses to specify the association of operations, and to set -standard-semantics or -assume protect_parens so that the compiler observes them in accordance with the standard. In an expression such as x(i+1)-x(i-1) + y(i+1)-y(i-1), in the usual case you want (x(i+1)-x(i-1)) + (y(i+1)-y(i-1)) so as to avoid partial cancellation of accuracy. Even if the magnitudes of x and y are the same, re-association of the operations can lose a bit of accuracy. In finite element analysis such situations are common but not so obvious to a non-expert (expert here meaning someone who knows something about both FE and numerical analysis). Setting -no-ftz would avoid loss of accuracy when operating on small-magnitude numbers; it would help even if the preceding recommendations are violated.
Some people who use C++ with code generated by a pre-processor want the compiler to perform algebraic simplification of expressions such as (a+b)-(b+c), at least when options are set that permit SIMD optimization of sum reductions like SUM(a). For that reason, gcc -ffast-math includes such transformations, while gfortran does not, although its -ffast-math does enable the SUM optimization. Technically speaking, the language standards don't permit the latter optimization (it is disabled by -fp-model source), although it is traditional in Fortran. Further, there are nearly safe optimizations, such as replacing (a/2.+b/2.) by (a+b)*.5 (technically not permitted in C or C++, and exact only away from the overflow and subnormal ranges). The Intel compiler writers decided not to perform these without aggressive options set, so you need to write your programs so that such rewrites aren't needed. You might find it useful to review the optimizations that are permitted by the Fortran standard but not by the C or C++ standards, which have the potential of losing accuracy for numbers in the range ABS(x) < TINY(x)/EPSILON(x). Intel compilers don't distinguish those optimizations from the ones that shouldn't be enabled by default.
gfortran is more careful than ifort about which optimizations are tied to the defaults and to -ffast-math, except that -ffast-math implies -fcx-limited-range, a simplification that Intel compilers perform only when -fp-model fast=2 or -complex-limited-range is set. There are situations where -no-ftz could be sufficient even with aggressive optimization, or might be needed regardless of how much care is taken. gfortran has no option like ifort's -fast-transcendentals, so with ifort you might try -no-fast-transcendentals to test for situations where the fast transcendentals lose a little accuracy (typically with exp() and **).
CPUs that support AVX have hardware in which addition and subtraction are efficient even for subnormal operands, regardless of the ftz setting. Intel had to do this in order to perform well with gfortran and gcc, so it makes sense to set -no-ftz along with -mavx.
Sorry my spiel is so long; it could be a little shorter if ifort were more consistent with gfortran. Still, I think these subjects need more discussion than simply referring you to Dr. Martyn Corden's excellent paper, which I would recommend you read next: https://software.intel.com/en-us/articles/consistency-of-floating-point-results-using-the-intel-comp...
Thanks for your comments...
I compiled the code with the -O0 option (instead of -O2), and the model ran normally.
The instability comes from a numerical criterion related to the Courant-Friedrichs-Lewy (CFL) condition for convergence when solving partial differential equations, and it occurs after 89 time steps. There isn't any issue related to the vectorization flag: I also ran the model with -O1, and the instability occurs at the same time, at the same grid point. Something similar happens when I use an atmospheric model (WRF: https://www.mmm.ucar.edu/weather-research-and-forecasting-model). The tsunami model is "NeoWave" from the University of Hawaii.
It looks like it is not a code issue to me...
Rodrigo, the only thing you can do is compare the numerical output of this critical routine between the two versions and see at which step they start to deviate.
The additional information that Rodrigo gave in #6 reminds me that in such circumstances one could try one of the compiler flags (such as -no-ansi-alias) that tell the compiler to assume that two or more subprogram arguments may be aliased.