Solved: intrinsic function "ptr1(:,:,:) = ptr2(:,:,:)" segfaults -- ifort bug

Jens_Henrik_Goebbert · ‎10-07-2008

Short Description:

=> Loop over i,j,k ("ptr1(i,j,k) = ptr2(i,j,k)") works fine, but "ptr1(:,:,:) = ptr2(:,:,:)" segfaults

The attached program segfaults when it uses the intrinsic function ptr1(:,:,:)=ptr2(:,:,:) to copy data from one pointer array to another. The number of array elements must exceed a certain threshold: in my case it segfaults with 129*128*256 elements but works fine with 65*64*128.

I cannot see any Fortran style violation, which is why I am sure it is a bug in the Intel Fortran Compiler 10.1.011.

Long Description:

The attached program is doing the following:

allocate one big chunk of memory - "allocate(mem3d(msize(1),msize(2),msize(3),2), stat=ierr)"
associating two 3d-pointer with parts of allocated memory:
    ptr1(msize(1),msize(2),msize(3)) => mem3d(msize(1),msize(2),msize(3),1)
    ptr2(msize(1),msize(2),msize(3)) => mem3d(msize(1),msize(2),msize(3),2)
fill ptr1 and ptr2 with some data (no segfault)
    ptr1(:,:,:) = 1
    ptr2(:,:,:) = 2
loop over i,j,k (no segfault)
    ptr1(i,j,k) = ptr2(i,j,k)
use intrinsic function to copy data (SEGFAULT)
    ptr1(:,:,:) = ptr2(:,:,:)

Observations:

It does not seem to be a direct problem of the memory size in bytes, but of the number of elements. If I use complex type instead of real, the segfault occurs with the same number of elements, even though complex takes twice the memory.

Assumption:

I noticed that the memory addresses of mem3d and the pointers are marked as "Sparse" when I debug with TotalView. I wonder if the intrinsic function accesses memory that is not yet allocated because of the compressed (Sparse) data format in memory.

System:

Dell PowerEdge 1950, 16 GByte RAM, 4x Intel Xeon CPU 5160 @ 3.00GHz
Fedora 6 (2.6.22.9-61.fc6 #1 SMP Thu Sep 27 18:07:59 EDT 2007 x86_64 x86_64 x86_64 GNU/Linux)
Intel Fortran Compiler 10.1.011 and 9.1 tested
no optimisation -g -check all -traceback -fp-stack-check
mem(129,128,256) => segfault
mem( 65, 64,128) => NO segfault

==============================================================================

Source:

program ptrbug

implicit none

integer :: mem_id, msize(3)
parameter(msize = (/129,128,256/)) !=> segfault
! parameter(msize = (/65,64,128/)) !=> no segfault

real(kind=8),dimension(:,:,:,:), allocatable, target :: mem3d
real(kind=8), dimension(:,:,:), pointer :: ptr1, ptr2

! complex(kind=8),dimension(:,:,:,:), allocatable, save, target :: mem3d
! complex(kind=8), dimension(:,:,:), pointer :: ptr1, ptr2

integer :: i,j,k,ierr
ierr = 0

! ---------------------------
! allocate dynamic memory and set pointer
! ---------------------------
write(*,*) 'allocate memory'

allocate(mem3d(msize(1),msize(2),msize(3),2), stat=ierr)
if(ierr.ne.0) write(*,*) 'cannot allocate mem for mem3d'
mem3d(:,:,:,:) = 0.d0

do mem_id=1, 2
  call set_ptr( mem3d(1,1,1,mem_id), mem_id)
enddo

! ---------------------------
! print infos
! ---------------------------
write(*,*) 'ptr1-dims: '
write(*,*) ' lbound: ', lbound(ptr1,1), lbound(ptr1,2), lbound(ptr1,3)
write(*,*) ' ubound: ', ubound(ptr1,1), ubound(ptr1,2), ubound(ptr1,3)
write(*,*) ' asize: ', size(ptr1,1), size(ptr1,2), size(ptr1,3)

write(*,*) 'ptr2-dims: '
write(*,*) ' lbound: ', lbound(ptr2,1), lbound(ptr2,2), lbound(ptr2,3)
write(*,*) ' ubound: ', ubound(ptr2,1), ubound(ptr2,2), ubound(ptr2,3)
write(*,*) ' asize: ', size(ptr2,1), size(ptr2,2), size(ptr2,3)

! ---------------------------
! tests
! ---------------------------
write(*,*) 'test 1 (fill ptr1)'
ptr1(:,:,:) = 1

write(*,*) 'test 2 (fill ptr2)'
ptr2(:,:,:) = 2

write(*,*) 'test 3 (copy data using do-loops)'
do k=1,msize(3)
  do j=1,msize(2)
    do i=1,msize(1)
      ptr1(i,j,k) = ptr2(i,j,k)
    enddo
  enddo
enddo

write(*,*) 'test 4 (copy data using intrinsic function)'
ptr1(:,:,:) = ptr2(:,:,:)

stop

contains

!========================================
!
! assign pointer to allocated dynamic 3d memory
!
!========================================
subroutine set_ptr(ref_mem3d, mem_id)
  implicit none

  ! function args
  real(kind=8),dimension( msize(1),msize(2),msize(3)), target :: ref_mem3d
  ! complex(kind=8), dimension( msize(1),msize(2),msize(3)), target :: ref_mem3d
  integer, intent(in) :: mem_id

  if(mem_id .eq. 1) then
    ptr1 => ref_mem3d
  else if(mem_id .eq. 2) then
    ptr2 => ref_mem3d
  endif

end subroutine set_ptr

end

Kevin_D_Intel · ‎10-07-2008

The compiler uses stack temporaries to accomplish the assignment of the form:

ptr1(:,:,:) = ptr2(:,:,:)

The program runs successfully when compiled with: -heap-arrays

Or if one increases the shell stack limit via:

For Bash/sh/ksh use: ulimit -s unlimited

For Csh use: limit stacksize unlimited

I do not believe this represents a bug with the compiler, but if it does I will post again.

Steven_L_Intel1 · ‎10-14-2008

I will agree that the language's lack of a way to declare an array of arrays (of any kind) can create awkwardness, though for some code the new F2003 ASSOCIATE construct may provide a way to "neaten up" the source code. I don't see that you have proposed anything that will help there.

An array that can be contiguous or null is spelled ALLOCATABLE. I have not seen anything in this thread that requires POINTER, nor do I see value in enhancing POINTER along these lines. You would have to create a new keyword and rules for when it was permissible or not permissible to pass a "contiguous" pointer to an ordinary pointer.
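
A minimal sketch of what the ALLOCATABLE plus ASSOCIATE variant of the test case might look like. The program name, the parameters n1, n2, n3 and the associate names a and b are made up for illustration, and it needs a compiler that supports the F2003 ASSOCIATE construct. Nothing in this thread establishes whether a given compiler then skips the array temporary.

```fortran
program alloc_associate_sketch
  implicit none
  ! hypothetical names; sizes are the small case from the original post
  integer, parameter :: n1 = 65, n2 = 64, n3 = 128
  real(kind=8), allocatable :: mem3d(:,:,:,:)
  integer :: ierr

  allocate(mem3d(n1,n2,n3,2), stat=ierr)
  if (ierr /= 0) stop 'cannot allocate mem3d'

  ! a and b are names for the two 3d slabs; no POINTER is involved
  associate (a => mem3d(:,:,:,1), b => mem3d(:,:,:,2))
    a = 1.d0
    b = 2.d0
    a = b        ! both sides are sections of the same allocatable array
  end associate

  write(*,*) 'all copied: ', all(mem3d(:,:,:,1) == 2.d0)
  deallocate(mem3d)
end program alloc_associate_sketch
```

Because both sides of the assignment are sections of one named array with different constant last subscripts, the compiler can in principle see at compile time that they do not overlap.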

jimdempseyatthecove · ‎10-14-2008

The original post was using POINTERS

>>

The attached program is doing the following:

allocate one big chunk of memory - "allocate(mem3d(msize(1),msize(2),msize(3),2), stat=ierr)"
associating two 3d-pointer with parts of allocated memory:
    ptr1(msize(1),msize(2),msize(3)) => mem3d(msize(1),msize(2),msize(3),1)
    ptr2(msize(1),msize(2),msize(3)) => mem3d(msize(1),msize(2),msize(3),2)
fill ptr1 and ptr2 with some data (no segfault)
    ptr1(:,:,:) = 1
    ptr2(:,:,:) = 2
loop over i,j,k (no segfault)
    ptr1(i,j,k) = ptr2(i,j,k)
use intrinsic function to copy data (SEGFAULT)
    ptr1(:,:,:) = ptr2(:,:,:)

<<

Personally, I think the user should "bite the bullet" and make a user-defined type containing an allocatable array (with TARGET). Then point his pointers at that/those.

Jim Dempsey
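
One way to read this suggestion, as a rough sketch only. The type name field_t, the component v, the array fields and the sizes are hypothetical. It assumes a compiler with allocatable components, and TARGET goes on the variable of the derived type, since a component cannot carry that attribute itself.

```fortran
program field_type_sketch
  implicit none

  ! hypothetical type wrapping one 3d field
  type field_t
     real(kind=8), allocatable :: v(:,:,:)
  end type field_t

  integer, parameter :: n1 = 65, n2 = 64, n3 = 128   ! small case from the original post
  type(field_t), target :: fields(2)                 ! TARGET on the containing object
  real(kind=8), pointer :: ptr1(:,:,:), ptr2(:,:,:)
  integer :: m, ierr

  do m = 1, 2
     allocate(fields(m)%v(n1,n2,n3), stat=ierr)
     if (ierr /= 0) stop 'cannot allocate field'
  end do

  ! the pointers are still available where existing code expects them
  ptr1 => fields(1)%v
  ptr2 => fields(2)%v
  ptr1 = 1.d0
  ptr2 = 2.d0

  ! the bulk copy goes through the allocatable components, not the pointers
  fields(1)%v = fields(2)%v

  write(*,*) 'all copied: ', all(ptr1 == 2.d0)
end program field_type_sketch
```

The idea is that the two fields are separately allocated objects, so an assignment written in terms of the components cannot involve overlapping storage.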

Steven_L_Intel1 · ‎10-15-2008

Yes, I can see that the ptr1(:,:,:) = ptr2(:,:,:) assignment would create a temp which is causing the segfault or stack overflow. It may or may not be better if written ptr1=ptr2. A complex run-time check for lack of overlap (taking strides into account) would be needed to avoid the temp.

Steve
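
To illustrate why overlap matters here, a small sketch with made-up names (a, b, p, q): when two pointers refer to overlapping sections, an array assignment has to behave as if the whole right-hand side were evaluated before anything is stored, which is what the temporary guarantees. A naive element-by-element loop gives a different answer.

```fortran
program overlap_demo
  implicit none
  integer, target  :: a(5), b(5)
  integer, pointer :: p(:), q(:)
  integer :: i

  a = (/1,2,3,4,5/)
  b = a

  ! array assignment through overlapping pointers:
  ! the right-hand side is taken as a whole before storing
  p => a(2:5)
  q => a(1:4)
  p(:) = q(:)

  ! forward element loop over the same overlap:
  ! each store is seen by the next read
  p => b(2:5)
  q => b(1:4)
  do i = 1, 4
     p(i) = q(i)
  end do

  write(*,*) 'array assignment: ', a   ! 1 1 2 3 4
  write(*,*) 'element loop:     ', b   ! 1 1 1 1 1
end program overlap_demo
```

In the original test case the two pointers do not overlap, so the loop and the array assignment give the same result; the compiler just cannot easily prove that from the POINTER declarations alone.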

jimdempseyatthecove · ‎10-15-2008

Quoting - Steve Lionel (Intel)

Yes, I can see that the ptr1(:,:,:) = ptr2(:,:,:) assignment would create a temp which is causing the segfault or stack overflow. It may or may not be better if written ptr1=ptr2. A complex run-time check for lack of overlap (taking strides into account) would be needed to avoid the temp.

Steve

Steve,

And my argument is that the complex runtime check can be eliminated if there were an attribute or attributes the programmer could place on the pointer that specify a) the memory pointed to will always be contiguous and optionally b) the memory pointed to will never overlap. An alternative would be to add a compiler directive that accomplishes the same thing as the proposed pointer attributes, placed either at the pointer declaration or at the appropriate code statement. You could also add a runtime diagnostic option, similar to index-out-of-bounds tests, that asserts the requirements of the attribute upon execution of => pointer assignments.

I am merely making suggestions for optimization techniques that give the IVF compiler an advantage over the competitors' products.

Jim
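
For reference, Fortran 2008 did standardise a CONTIGUOUS attribute that covers point (a). A minimal sketch of the declaration follows, with hypothetical sizes and program name; it assumes a compiler with F2008 support (so not the 10.1 compiler discussed here). Whether it removes the temporary for this assignment is not established in this thread, and it says nothing about overlap, which is point (b).

```fortran
program contiguous_sketch
  implicit none
  integer, parameter :: n1 = 65, n2 = 64, n3 = 128   ! small case from the original post
  real(kind=8), allocatable, target :: mem3d(:,:,:,:)
  ! CONTIGUOUS (Fortran 2008): these pointers may only be associated
  ! with contiguous targets
  real(kind=8), pointer, contiguous :: ptr1(:,:,:), ptr2(:,:,:)
  integer :: ierr

  allocate(mem3d(n1,n2,n3,2), stat=ierr)
  if (ierr /= 0) stop 'cannot allocate mem3d'

  ! each 3d slab of mem3d is a contiguous section
  ptr1 => mem3d(:,:,:,1)
  ptr2 => mem3d(:,:,:,2)
  ptr1 = 1.d0
  ptr2 = 2.d0
  ptr1 = ptr2

  write(*,*) 'all copied: ', all(ptr1 == 2.d0)
  write(*,*) 'contiguous: ', is_contiguous(ptr1)
end program contiguous_sketch
```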

Steven_L_Intel1 · ‎10-15-2008

Well, there is a -fno-alias switch which might help. I see your point.

TimP · ‎10-15-2008

Quoting - Steve Lionel (Intel)

Well, there is a -fno-alias switch which might help. I see your point.

I thought that switch was on by default in Fortran, and that it asserts that code complies with the Fortran standard where subroutine arguments don't alias each other.

Steven_L_Intel1 · ‎10-15-2008

Quoting - tim18

I thought that switch was on by default in Fortran, and that it asserts that code complies with the Fortran standard where subroutine arguments don't alias each other.

No, it isn't on by default. The one you're thinking of is -assume dummy_aliases which is indeed off by default.

-fno-alias has to do with pointers. It's really a C-ism, but I have seen it have some effect on Fortran.

Jens_Henrik_Goebbert · ‎10-30-2008

Quoting - Kevin Davis (Intel)

I directed the test case and this discussion to our High-level optimization development team for their analysis and opinion on removing the stack temp creation and use. I will follow-up as I learn more. (Internal ref. CQ-50310)

Will there be any more comments from the high-level optimization development team on this? There were a lot of comments on this topic, but all of them are just workarounds and no real fix.

Kevin_D_Intel · ‎10-30-2008

They have not provided any feedback at this time. I will pass along updates as I learn them.

I am not sure I would classify this as a bug with the Intel compiler. Yes, some discussions suggest ways the compiler might avoid using array temps but nothing suggests the compiler was wrong to do so given the potential for overlap and the difficulty in determining such when using POINTER.

While you note that xlf works, we do not know if it also uses array temps for the context in question. It, like other compilers, may default to placing array temps on the heap which avoids exhausting the stack and the ensuing segmentation fault, similar to how ifort avoids this when the -heap-arrays option is used.

Again, I will pass along any updates.