I am confused about what the correct behavior of a Fortran program should be when one passes overlapping pointer arrays as actual arguments to a subroutine that takes non-pointer arrays as dummy arguments. Please take a look at the code below:
```fortran
! my_axpy.f90
module my_axpy_m

  implicit none

contains

  subroutine my_axpy(N, alpha, x, y)
    integer, intent(in) :: N
    double precision, intent(in) :: alpha
    ! Incorrect results happen if x/y are either assumed-shape or
    ! assumed-size arrays, but not if they are pointer dummy arguments.
    double precision, intent(inout) :: x(:)
    double precision, intent(inout) :: y(:)

    integer :: ii

    do ii = 1, N
      x(ii) = alpha*x(ii) + y(ii)
    enddo

  end subroutine my_axpy

end module my_axpy_m
```
```fortran
! main.f90
program main

  use my_axpy_m
  implicit none

  ! Need N>=4 to trigger the wrong result.
  integer, parameter :: N=4, OFFSET=1, M=N+OFFSET
  double precision, pointer :: x(:), y(:)
  integer :: ii

  allocate(y(M))
  x => y(OFFSET+1:M)

  do ii = 1, M
    y(ii) = ii
  enddo

  print *, 'Input x and y:'
  do ii = 1, N
    print *, x(ii), y(ii)
  enddo

  call my_axpy(N, 1d0, x, y)

  print *
  print *, 'Output x:'
  do ii = 1, N
    print *, x(ii)
  enddo

end program main
```
```make
# Makefile
QOPT = -qopt-report1 -qopt-report-file=stderr -qopt-report-phase=vec

# Correct result
F90 = ifort -O3 $(QOPT)
# Incorrect result on an Intel(R) Xeon(R) CPU E5-2698 v3 @ 2.30GHz
#F90 = ifort -xHost $(QOPT)

default: main

clean:
	rm -f *.o *.mod main

.PHONY: clean

main.o: my_axpy.o

%.o: %.f90
	$(F90) -c $<

main: main.o my_axpy.o
	$(F90) -o main $^
```
Depending on the optimization level (see the Makefile) I get different results. With ifort -O3 (or simply ifort), I get
But with ifort -xHost, I get
The first result clearly takes into account that the arrays point to overlapping memory regions, while the second one does not (presumably the loop is vectorized in a way that does not capture the overlap).
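To make the two interpretations concrete, here is a small standalone sketch that does the same arithmetic on a plain array, once element by element on the overlapping storage and once reading all of y(1:N) before any store. The program name `overlap_demo` and the array `ysnap` are made up for this illustration, and the second loop is only an assumption about what a vectorized loop might effectively do, not a description of the code ifort actually generates.

```fortran
! overlap_demo.f90 (hypothetical file, not part of the original test case)
program overlap_demo

  implicit none

  integer, parameter :: N = 4, OFFSET = 1, M = N + OFFSET
  double precision :: y(M), ysnap(N)
  integer :: ii

  ! Case 1: sequential update on overlapping storage.
  ! x(ii) is y(ii+OFFSET), so each store is seen by the next iteration.
  y = [(dble(ii), ii = 1, M)]
  do ii = 1, N
    y(ii+OFFSET) = 1d0*y(ii+OFFSET) + y(ii)
  enddo
  print *, 'sequential, overlap honoured:', y(OFFSET+1:M)

  ! Case 2: all of y(1:N) is read before anything is stored
  ! (assumed model of a loop vectorized as if x and y did not overlap).
  y = [(dble(ii), ii = 1, M)]
  ysnap = y(1:N)
  do ii = 1, N
    y(ii+OFFSET) = 1d0*y(ii+OFFSET) + ysnap(ii)
  enddo
  print *, 'all loads before stores:     ', y(OFFSET+1:M)

end program overlap_demo
```

With N=4 and alpha=1, the first case yields the values 3, 6, 10, 15 and the second 3, 5, 7, 9.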
At any rate, what is the correct behavior of a Fortran program in this case? I quickly went over the Fortran 2003 standard, but couldn't find anything concrete about a case like this. Am I missing something?
Since the dummy arguments in the `my_axpy` subroutine are non-pointer arrays, I assume that the correct behavior is to vectorize and assume that x and y point to non-overlapping memory regions inside the subroutine. Does that mean that the compiler should make a copy of the arrays x and y before calling the subroutine, which is not happening with ifort -xHost?
I see similar behaviour with gfortran. However, nagfor seems to print 3, 6, 10, 15 in all cases. It looks like the compiler, with -xHost for ifort or -O4/-Ofast for gfortran, optimizes the do loop inside the subroutine without taking into account that the x and y actual arguments share memory. You can avoid this by giving the two dummy arguments the VOLATILE attribute; then neither ifort nor gfortran performs this optimization, probably because they allow for the possibility that the actual arguments might be changed elsewhere (e.g. by means of pointer assignment).
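A minimal sketch of the VOLATILE workaround described above, assuming the only change is the extra attribute on x and y. The names `my_axpy_volatile_m` and `my_axpy_volatile` are invented here so the variant can sit next to the original module. This only shows how the attribute would be declared; it says nothing about whether the aliased call itself is allowed by the standard.

```fortran
! my_axpy_volatile.f90 (hypothetical variant of my_axpy.f90)
module my_axpy_volatile_m

  implicit none

contains

  subroutine my_axpy_volatile(N, alpha, x, y)
    integer, intent(in) :: N
    double precision, intent(in) :: alpha
    ! VOLATILE tells the compiler the values may change by means it
    ! cannot see, which tends to suppress reordering of loads and stores.
    double precision, intent(inout), volatile :: x(:)
    double precision, intent(inout), volatile :: y(:)

    integer :: ii

    do ii = 1, N
      x(ii) = alpha*x(ii) + y(ii)
    enddo

  end subroutine my_axpy_volatile

end module my_axpy_volatile_m
```

Since VOLATILE inhibits optimization of every access to x and y, it is likely to cost performance compared with the vectorized loop.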
Your program is not legal Fortran because of the aliasing. If you add the TARGET attribute to the dummy arguments, then it will be ok.
Section 12.5.2.13 of the F2008 standard says:
Action that affects the value of the entity or any subobject of it shall be taken only through the dummy argument unless
(a) the dummy argument has the POINTER attribute or
(b) the dummy argument has the TARGET attribute, the dummy argument does not have INTENT (IN), the dummy argument is a scalar object or an assumed-shape array without the CONTIGUOUS attribute, and the actual argument is a target other than an array section with a vector subscript.
If the value of the entity or any subobject of it is affected through the dummy argument, then at any time during the invocation and execution of the procedure, either before or after the definition, it may be referenced only through that dummy argument unless
(a) the dummy argument has the POINTER attribute or
(b) the dummy argument has the TARGET attribute, the dummy argument does not have INTENT (IN), the dummy argument is a scalar object or an assumed-shape array without the CONTIGUOUS attribute, and the actual argument is a target other than an array section with a vector subscript.
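As a sketch of the TARGET suggestion, the subroutine from the original post could be declared as follows. The module and subroutine names are changed here (`my_axpy_target_m`, `my_axpy_target`) purely for illustration. The dummies are assumed-shape, not INTENT(IN), and not CONTIGUOUS, and the actual arguments in main.f90 are pointers to allocated storage, so the situation is meant to match exception (b) in the quoted text.

```fortran
! my_axpy_target.f90 (hypothetical variant of my_axpy.f90)
module my_axpy_target_m

  implicit none

contains

  subroutine my_axpy_target(N, alpha, x, y)
    integer, intent(in) :: N
    double precision, intent(in) :: alpha
    ! TARGET on assumed-shape, non-CONTIGUOUS, non-INTENT(IN) dummies:
    ! the case covered by exception (b) of the quoted rules.
    double precision, intent(inout), target :: x(:)
    double precision, intent(inout), target :: y(:)

    integer :: ii

    do ii = 1, N
      x(ii) = alpha*x(ii) + y(ii)
    enddo

  end subroutine my_axpy_target

end module my_axpy_target_m
```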
Thanks for your answers.
In particular, Steve, thanks for pointing me to the standard. I also see that Note 12.34 is very clear about this:
If there is a partial or complete overlap between the effective arguments of two different dummy arguments of the same procedure and the dummy arguments have neither the POINTER nor TARGET attribute, the overlapped portions shall not be defined, redefined, or become undefined during the execution of the procedure.
However, even after I added the TARGET attribute to the dummy arguments x and y in the my_axpy subroutine, I still get the same inconsistent results. I only get consistent results (which take aliasing into account) when I add the POINTER attribute. Shouldn't TARGET be enough?
By the way, I always get the correct answer (i.e., the one that takes aliasing into account) when I use gfortran or the Cray compiler. I am surprised that they give the "right" answer even without the TARGET attribute.
FELIPE L. wrote:
I tried intel 188.8.131.52, 184.108.40.206, and 220.127.116.11, and they all give the wrong results with -xHost and without the POINTER attribute.
If I add -assume dummy_aliases, I get the right result – though I get the right result even without the POINTER or TARGET attributes.
That should not be a cause for much surprise. If you have code that satisfies the anti-aliasing rules, optimization options should have no effect on the functioning of the code and you should get the correct results.
If the code does not satisfy the anti-aliasing rules, you may see incorrect results whenever the compiler applies optimizations that rely on the absence of aliasing. Setting the optimization level low or specifying the POINTER attribute may inhibit the application of such unsafe optimizations.
As long as your code does not satisfy the anti-aliasing rules, you should not assume that a certain set of optimization options will continue to give you correct results just because the results were correct at one time with the same options.
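One way to make the call satisfy the anti-aliasing rules, sketched under the assumption that the intended result is x plus the original values of y, is to have the caller pass a separate copy of the overlapping part. The array `ycopy` and the program name `main_copy` are hypothetical additions; `my_axpy` is unchanged from the original post.

```fortran
! main_copy.f90 (hypothetical variant of main.f90, with my_axpy included)
module my_axpy_m
  implicit none
contains
  subroutine my_axpy(N, alpha, x, y)
    integer, intent(in) :: N
    double precision, intent(in) :: alpha
    double precision, intent(inout) :: x(:)
    double precision, intent(inout) :: y(:)
    integer :: ii
    do ii = 1, N
      x(ii) = alpha*x(ii) + y(ii)
    enddo
  end subroutine my_axpy
end module my_axpy_m

program main_copy
  use my_axpy_m
  implicit none
  integer, parameter :: N = 4, OFFSET = 1, M = N + OFFSET
  double precision, pointer :: x(:), y(:)
  double precision, allocatable :: ycopy(:)
  integer :: ii

  allocate(y(M))
  x => y(OFFSET+1:M)
  do ii = 1, M
    y(ii) = ii
  enddo

  ! ycopy has its own storage, so the two actual arguments
  ! passed to my_axpy no longer overlap.
  allocate(ycopy(N))
  ycopy = y(1:N)
  call my_axpy(N, 1d0, x, ycopy)

  print *, 'Output x:', x(1:N)

  deallocate(ycopy)
  deallocate(y)
end program main_copy
```

Because the arguments are distinct, the result should not depend on the optimization options: with these inputs x ends up holding 3, 5, 7, 9. If the sequential, overlap-aware result is what is wanted instead, this copy is not the right fix.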