Array Bounds Checking Error

jimdempseyatthecove · ‎05-31-2012

w_fcompxe_2011.10.35
Windows 7 x64
VS 2010
(comments below jpg)

Note, Debugger shows second subscript of T with lower bound of 0.
This bound is correct (all bounds are correct)
The generated bounds checking code is in error (see black screen at bottom)

I believe this, or a similar problem, The above should be a simple reproducer.

[fortran]
module MOD_AVX
    ! SSE two-up double vector
    type TypeXMM
        SEQUENCE
        real(8) :: v(0:1)
    end type TypeXMM
    ! SSE two-up double vector triplet
    type TypeXMMxyz
        SEQUENCE
        real(8) :: vX(0:1)
        real(8) :: vY(0:1)
        real(8) :: vZ(0:1)
    end type TypeXMMxyz
    ! AVX four-up double vector
    type TypeYMM
        SEQUENCE
        real(8) :: v(0:3)
    end type TypeYMM
    ! AVX four-up double vector triplet
    type TypeYMMxyz
        SEQUENCE
        real(8) :: vX(0:3)
        real(8) :: vY(0:3)
        real(8) :: vZ(0:3)
    end type TypeYMMxyz
end module MOD_AVX
...
subroutine CopyToYMM_2D(f, t, s)
    use MOD_AVX
    real, pointer :: f(:,:)
    type(TypeYMM), target :: t(:,:)
    integer :: s
    integer :: i,j
    real, pointer :: slice(:,:)
    do j=LBOUND(f, DIM=2),UBOUND(f, DIM=2)
        do i=LBOUND(f, DIM=1),UBOUND(f, DIM=1)
            t(i,j).v(s) = f(i,j)
        end do
    end do
    slice(LBOUND(f, DIM=1):UBOUND(f, DIM=1), LBOUND(f, DIM=2):UBOUND(f, DIM=2)) => t(LBOUND(f, DIM=1), LBOUND(f, DIM=2))%v(s::4)
    deallocate(f)
    f => slice
end subroutine CopyToYMM_2D
[/fortran]
Call it with arrays (via pointers) allocated as follows:

real(8), pointer :: F(:,:)
type(TypeYMM), pointer :: T(:,:)
...
allocate(F(1:3,0:10+1))
allocate(T(1:3,0:10+1))
call CopyToYMM_2D(F, T, 0)

You will have to add an interface for CopyToYMM_2D

Jim Dempsey

IanH · ‎05-31-2012

t is an assumed shape dummy argument (default lower bound of 1), not deferred shape (bounds are those of the associated actual argument).
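A minimal sketch of that distinction, with illustrative names (`bounds_demo`, `show_bounds`, `show_bounds_shifted`) that are not from the original code, and plain `real(8)` arrays standing in for the derived type. It also shows one possible alternative, an assumed-shape dummy declared with an explicit lower bound:

```fortran
! Sketch only: module and routine names are hypothetical.
module bounds_demo
    implicit none
contains
    ! Assumed-shape dummy: lower bounds default to 1,
    ! whatever the bounds of the actual argument are.
    subroutine show_bounds(t)
        real(8), intent(in) :: t(:,:)
        write(*,*) 'assumed shape t(:,:)  lbound/ubound:', lbound(t), ubound(t)
    end subroutine show_bounds

    ! Assumed-shape dummy with an explicit lower bound of 0 on dimension 2.
    subroutine show_bounds_shifted(t)
        real(8), intent(in) :: t(:,0:)
        write(*,*) 'assumed shape t(:,0:) lbound/ubound:', lbound(t), ubound(t)
    end subroutine show_bounds_shifted
end module bounds_demo

program demo
    use bounds_demo
    implicit none
    real(8), pointer :: T(:,:)

    allocate(T(1:3,0:11))
    T = 0.0d0
    write(*,*) 'caller                lbound/ubound:', lbound(T), ubound(T)
    call show_bounds(T)          ! sees bounds 1:3, 1:12
    call show_bounds_shifted(T)  ! sees bounds 1:3, 0:11
    deallocate(T)
end program demo
```

Inside `show_bounds`, element `t(i,j)` corresponds to the caller's `T(i,j-1)`, so indexing the dummy with the caller's `0:11` range on the second dimension goes out of bounds at `j=0`.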

jimdempseyatthecove · ‎05-31-2012

Additional info...

In a Debug build, with bounds checking disabled for the above file, the deallocate(f) would (after some deallocations) report a corrupted heap.

Whether this problem is related to the bounds check error, I cannot say.

Note, by NOT deallocating the memory, the application runs correctly.

An explanation of what is going on with the code may be in order.

The application is somewhat like a finite element simulation program. The simulation is of tethers and objects. Tethers can be viewed as a 2D object in 3D space and represented as a collection of segments connecting beads. Each segment has many properties and states as do the beads.

There are ~40 arrays, some rank-2 and some rank-1, with a mix of XYZ vectors and scalars.

The current code is multi-threaded (OpenMP) and distributes the workload on a tether-by-tether basis. Load distribution is relatively good.

At issue is that many of the loops will vectorize to some extent. However, in the case of the XYZ vectors (or, properly said, vectors of XYZ vectors) only part of the calculation is vectorizable; the remainder is scalar and cross-lane w/rt SSE/AVX. My system has AVX.

To improve (maximize) vectorization on AVX, I am reallocating the ~40 arrays, mapped in a manner such that tethers are now 4-up (filling an AVX small vector)...

!!! yet, because Fortran has pointers with LowerBound:UpperBound:Stride I can remap the former allocations to the newer allocation format.

*** with all the old code remaining untouched ***

The solution has over 700 files (~700,000 lines of code).

Now the critical compute loops can have each thread working on 4 tethers at a time, with a significantly higher degree of vectorization.

My conversion is not complete, so I do not have performance information. I hope to have the conversion done in a few weeks and then do a write-up.

Jim Dempsey

jimdempseyatthecove · ‎05-31-2012

The dummy argument is declared with t(:) or t(:,:) and has interface declarations for the caller. Therefore they are deferred shape.

Further LBOUND(t, DIM=2) returns the correct (external) lower bound of 0.
And the debugger is able to obtain the correct bounds

Jim

IanH · ‎05-31-2012

Deferred shape requires the pointer or allocatable attribute. The dummy argument t has neither. See 5.3.8.3 and 5.3.8.4 of F2008 (similar words in previous standards).

If you are seeing an lbound of zero inside the procedure (after you remove code that's potentially trampling outside the array bounds - that could be obliterating part of the array descriptor for the dummy argument and confusing the issue) then that's a compiler bug.

If the debugger is not complaining about a lower bound of zero, then that's a debugger bug (potentially exacerbated by t appearing in multiple scopes). The debugger has led me up the garden path a few times previously in similar situations - to the extent that it is no longer on my Christmas card list.

jimdempseyatthecove · ‎06-01-2012

IanH,

>>Deferred shape requires the pointer or allocatable attribute

Thank you, this was my error.

>>If the debugger is not complaining about a lower bound of zero, then that's a debugger bug (potentially exacerbated by t appearing in multiple scopes).

Inserting:

write(*,*) LBOUND(f, DIM=2), LBOUND(t, DIM=2)

shows:
0 1

Clearly the compiler is generating, and the runtime is seeing, the 0-based bound converted to 1-based.

The pointer f(:,:) is receiving the pointer to the 0-based (DIM=2) array descriptor.
The dummy (target) t(:,:) is receiving a new, 1-based (DIM=2) array descriptor.

*** The debugger is showing the 0-based array descriptor for t(:,:) ???

By changing:
type(TypeYMM), target :: t(:,:)
to
type(TypeYMM), pointer :: t(:,:)

Now I get the deferred-shape (0-based DIM=2) value, and t is consistent with the debugger
*** and more importantly, I am not addressing outside of bounds

(Writing to t(1,0) before apparently overwrote "something", and occasionally that something was an array descriptor, which some time later showed up as a corrupted heap.)
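A minimal sketch of the corrected declaration, assuming a cut-down routine (`fill_lane`, a hypothetical name) that only does the copy loop and omits the remapping and deallocate steps. With both dummies declared as pointers, the bounds inside the routine are those of the caller's pointers:

```fortran
! Sketch only: ptr_bounds_demo and fill_lane are hypothetical names.
module ptr_bounds_demo
    implicit none
    type TypeYMM
        sequence
        real(8) :: v(0:3)
    end type TypeYMM
contains
    subroutine fill_lane(f, t, s)
        real(8), pointer :: f(:,:)
        type(TypeYMM), pointer :: t(:,:)   ! pointer dummy: deferred shape
        integer, intent(in) :: s
        integer :: i, j

        ! Both dummies carry the caller's bounds, so they must agree here.
        if (any(lbound(t) /= lbound(f))) error stop 'bounds differ'

        do j = lbound(f, dim=2), ubound(f, dim=2)
            do i = lbound(f, dim=1), ubound(f, dim=1)
                t(i,j)%v(s) = f(i,j)       ! in bounds for the caller's 0:11 range
            end do
        end do
    end subroutine fill_lane
end module ptr_bounds_demo

program demo
    use ptr_bounds_demo
    implicit none
    real(8), pointer :: F(:,:)
    type(TypeYMM), pointer :: T(:,:)

    allocate(F(1:3,0:11))
    allocate(T(1:3,0:11))
    F = 1.0d0
    call fill_lane(F, T, 0)
    write(*,*) 'lane 0 at both corners:', T(1,0)%v(0), T(3,11)%v(0)
    deallocate(F)
    deallocate(T)
end program demo
```

A pointer dummy needs an explicit interface in the caller, which placing the routine in a module provides automatically.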

Now IanH, I have a question.

You will note at the end of the subroutine I have

[fortran]
slice(LBOUND(f, DIM=1):UBOUND(f, DIM=1), LBOUND(f, DIM=2):UBOUND(f, DIM=2)) &
    & => t(LBOUND(f, DIM=1), LBOUND(f, DIM=2))%v(s::4)
deallocate(f)
f => slice
[/fortran]

This is where the magic comes in: remapping the old array pointer, with stride 1, to the newly remapped format requiring a stride of 4 on DIM=2. Also note that the origin pointer indexes by slice number, v(s::4).
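For comparison, here is a minimal sketch of a related way to get a strided rank-2 view of one lane. It is not the remapping statement above: it uses a component reference on the whole array, `T(:,:)%v(s)`, with a bounds specification on the pointer to restore the original lower bounds. The program name and test values are made up for illustration:

```fortran
! Sketch only: a swapped-in technique, not the author's remapping statement.
program lane_view_demo
    implicit none
    type TypeYMM
        sequence
        real(8) :: v(0:3)
    end type TypeYMM
    type(TypeYMM), pointer :: T(:,:)
    real(8), pointer :: lane(:,:)
    integer :: i, j, k, s

    allocate(T(1:3,0:11))
    do j = 0, 11
        do i = 1, 3
            do k = 0, 3
                T(i,j)%v(k) = 1000*k + 10*j + i   ! encode (lane, j, i) in the value
            end do
        end do
    end do

    s = 2
    ! lane is a view of component v(s) of every element of T.
    ! Consecutive elements of lane are four real(8) values apart in memory.
    ! The bounds specification keeps the lower bounds of T (1 and 0).
    lane(lbound(T,1):, lbound(T,2):) => T(:,:)%v(s)

    write(*,*) 'lbound/ubound of lane:', lbound(lane), ubound(lane)
    write(*,*) 'lane(2,5), T(2,5)%v(s):', lane(2,5), T(2,5)%v(s)

    lane(2,5) = -1.0d0                 ! writes through to T
    write(*,*) 'T(2,5)%v(s) after write:', T(2,5)%v(s)

    deallocate(T)
end program lane_view_demo
```

Code that indexes `lane(i,j)` with the old bounds then reads and writes lane `s` of the packed storage without being aware of the new layout.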

Now then to the question I have.

As long as the old code uses the new array descriptor (as pointed to by the converted pointer) the code works fine. Also, in the few places where f(:,n) is used I see in Debug build "Array temporary created". This is good, as it makes the code work and indicates locations where I have yet to convert the code. The question now becomes:

In a new subroutine, if I use something equivalent to

real(8) :: foo(:,:)

And the caller passes in something like the f(:,:)

IOW the base is converted from 0-base to 1-base, what happens to the stride?
Will this require an array temporary and conversion to stride-1?
I suppose I could mock-up a test. I am not concerned about new code I write, it is the old code that concerns me.

BTW, thanks for standing up to me to show me the error of my ways. I am wrong from time to time.

Prior to this, I have hardly used the stride feature of Fortran. This feature is quite neat, and in particular for this conversion it eliminates having to alter 10's-100's of source files due to the re-arrangement of the data. I will have to add a few 10's of same-functionality routines that use the new AVX (or SSE) packed data.

I think you might call this polymorphic data placement (or some better phraseology).

Jim Dempsey

IanH · ‎06-02-2012

Guessing, but I'd expect that the relevant bits of the descriptor for the actual argument coming in with lower bound zero and stride four would be copied to a descriptor for the dummy argument with lower bound one and stride (still) four. But you'd be more familiar with workings at that level than me.
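One way to mock up that test is sketched below, with hypothetical names (`stride_pass_demo`, `inspect`). It passes a strided pointer to an assumed-shape dummy and prints what the callee sees. `is_contiguous` is a Fortran 2008 intrinsic, so this assumes a compiler that supports it, and whether a temporary is created is up to the compiler, so no particular result is claimed for that line:

```fortran
! Sketch only: module and routine names are hypothetical.
module stride_pass_demo
    implicit none
    type TypeYMM
        sequence
        real(8) :: v(0:3)
    end type TypeYMM
contains
    subroutine inspect(foo)
        real(8), intent(inout) :: foo(:,:)   ! assumed shape, no CONTIGUOUS attribute
        write(*,*) 'lbound(foo)        :', lbound(foo)
        write(*,*) 'is_contiguous(foo) :', is_contiguous(foo)
        foo(1,1) = 42.0d0                    ! first element of the dummy
    end subroutine inspect
end module stride_pass_demo

program stride_pass
    use stride_pass_demo
    implicit none
    type(TypeYMM), pointer :: T(:,:)
    real(8), pointer :: f(:,:)
    integer :: i, j

    allocate(T(1:3,0:11))
    do j = 0, 11
        do i = 1, 3
            T(i,j)%v = 0.0d0
        end do
    end do

    ! Strided view of lane 1 with lower bounds 1 and 0.
    f(1:,0:) => T(:,:)%v(1)

    call inspect(f)

    ! foo(1,1) in the callee is f(1,0) here, i.e. T(1,0)%v(1).
    write(*,*) 'T(1,0)%v(1) after call:', T(1,0)%v(1)
    deallocate(T)
end program stride_pass
```

The lower bounds reported inside `inspect` are 1 in both dimensions, and the write reaches `T(1,0)%v(1)` whether the compiler passes a strided descriptor or copies in and out. If `is_contiguous(foo)` prints F, the dummy is still strided and no contiguous temporary was made.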