Intel® Fortran Compiler
Build applications that can scale for the future with optimized code designed for Intel® Xeon® and compatible processors.
Announcements
FPGA community forums and blogs on community.intel.com are migrating to the new Altera Community and are read-only. For urgent support needs during this transition, please visit the FPGA Design Resources page or contact an Altera Authorized Distributor.
29285 Discussions

Different returning values from the same subroutine

Sung_H_
Beginner
1,047 Views

Hello all,

I found some abnormal behaviors of my LBM program. The snippet is as follows, and i paste the code here:http://pastebin.com/mD3TFVhc

[fortran]

    do j = 1,ly
        do i = 1,lx
            call collision(i,j,omega,eul_f,p, f,     fnew)
            call collision(i,j,omega,eul_f,p, n_hlp, node)
        end do
    end do

[/fortran]

Here, four arrays 'f,fnew,n_hlp,node' have the same dimensions. 'f' and 'h_hlp' are initialized to be the same and 'fnew' and 'node' store the returning values. If compiling the code with the following command

[bash]

ifort snip.f90

./a.out

[/bash]

the data file shows difference between 'fnew' and 'n_hlp', about the order of 1e-17. But uncommenting the four lines 87-90 or adding '-g' to the compiler option will results the same 'fnew' and 'n_hlp'. Is this reasonable?

Thanks in advance.

Sung Hui

0 Kudos
5 Replies
Casey
Beginner
1,047 Views

Try using the compile flag "-fp-model source" and see if that difference goes away. In my experience small magnitude error is due to optimizing floating point math and vecorization. Whether that is reasonable depends on your requirements. 

0 Kudos
TimP
Honored Contributor III
1,047 Views

As Casey pointed out, you can disable optimizations which are expected to cause variations from IEEE standard and with vector alignment.  -g causes a default to -O0 which will disable those and many other optimizations.

In your code, the frequent division of vector by scalar would be optimized under the default -fp-model fast by inverting the divisor outside the loop and multiplying within the loop. With ifort, this will also prevent totally safe optimizations such as replacing /2.0 by *0.5.

Several of your expressions could be grouped to produce more consistent accuracy; normally, lack of care can be excused by the use of double precision.  Activating the WRITE statements in that loop may have a similar effect to compiling at -O0 or even putting !dir$ novector ahead of the loop.

0 Kudos
TimP
Honored Contributor III
1,047 Views

For example, (ef -uf) would be more accurately calculated as

efmuf = (e(k,1)- u(1)) * eul_f(i,j,1) + (e(k,2)- u(2)) * eul_f(i,j,2)

an alternative evaluation explicitly encouraged by Fortran rules, but unlikely to happen consistently in your context.

Then (efmuf + (eu/Cs2)*ef)/Cs2 also encouraged by Fortran rules would be a small further improvement in accuracy, as well as eliminating one of the divisions in a better way than the compiler is likely to try without -fp-model source.

eliminating those +1 terms, e.g. (w(k)-.5*w(k)*omega) could be another accuracy improvement, although not permitted as a standard alternate.

then (w(k)*rho + (w(k)*rho)* ((eu/Cs2) +.5*(eu/Cs2)**2)) to force the compiler to repeat the same expressions (so not so many repeated divides required) as well as improving accuracy.

These could be more important to accuracy than forcing IEEE divide (-prec-div, implied by -fp-model source).

In your case, I wonder if you are getting inconsistent partial in-lining of your collision function, possibly avoidable by -fno-inline-functions.  What did -opt-report show?

0 Kudos
Sung_H_
Beginner
1,047 Views

opt-report is here: http://pastebin.com/z7WZ1pmC, i'm not quite understand it. Thanks TimP for your experienced skills, i will change my codes according to your advice.

0 Kudos
TimP
Honored Contributor III
1,047 Views

I guess the report shows complete in-lining of the internal subroutine, which makes it difficult to follow, as well as opening the possibility that the 2 in-lined copies aren't identical.

Normally, in-line expansion of internal subroutines is desirable, but it seems not when investigating your questions.

With in-lining, there are no opt-report references to the original source line numbers in the internal procedures.  As I mentioned, I would try -fno-inline-functions to try to avoid this with ifort (or gfortran).

0 Kudos
Reply