Short Description:
=> A loop over i,j,k ("ptr1(i,j,k) = ptr2(i,j,k)") works fine, but "ptr1(:,:,:) = ptr2(:,:,:)" segfaults.
The attached program segfaults if it uses the intrinsic array assignment ptr1(:,:,:) = ptr2(:,:,:) to copy the data of one pointer array to another pointer array. The number of array elements must exceed a certain threshold: in my case it segfaults with 129*128*256 but works fine with 65*64*128.
I cannot see any violation of the Fortran standard, which is why I am sure it is a bug in the Intel Fortran Compiler 10.1.011.
Long Description:
The attached program does the following:
- allocate one big chunk of memory: "allocate(mem3d(msize(1),msize(2),msize(3),2), stat=ierr)"
- associate two 3d pointers with parts of the allocated memory:
ptr1(msize(1),msize(2),msize(3)) => mem3d(msize(1),msize(2),msize(3),1)
ptr2(msize(1),msize(2),msize(3)) => mem3d(msize(1),msize(2),msize(3),2)
- fill ptr1 and ptr2 with some data (no segfault):
ptr1(:,:,:) = 1
ptr2(:,:,:) = 2
- loop over i,j,k (no segfault):
ptr1(i,j,k) = ptr2(i,j,k)
- use intrinsic array assignment to copy data (SEGFAULT):
ptr1(:,:,:) = ptr2(:,:,:)
Observations:
It does not seem to depend directly on the size of the memory in bytes, but on the number of elements. If I use complex type instead of real, the segfault occurs with the same number of elements even though complex takes twice the memory.
Assumption:
I noticed that the memory addresses of mem3d and the pointers are marked "Sparse" when I debug with TotalView. I wonder whether the array assignment accesses memory which is not yet allocated because of the compressed data format (Sparse) in memory.
System:
- Dell PowerEdge 1950, 16 GByte RAM, 4x Intel Xeon CPU 5160 @ 3.00GHz
- Fedora 6 (2.6.22.9-61.fc6 #1 SMP Thu Sep 27 18:07:59 EDT 2007 x86_64 x86_64 x86_64 GNU/Linux)
- Intel Fortran Compiler 10.1.011 and 9.1 tested
- Compiler flags (no optimisation): -g -check all -traceback -fp-stack-check
- mem(129,128,256) => segfault
- mem( 65, 64,128) => NO segfault
==============================================================================
Source:
program ptrbug
  implicit none
  integer :: mem_id, msize(3)
  parameter(msize = (/129,128,256/)) !=> segfault
  ! parameter(msize = (/65,64,128/)) !=> no segfault
  real(kind=8),dimension(:,:,:,:), allocatable, target :: mem3d
  real(kind=8), dimension(:,:,:), pointer :: ptr1, ptr2
  ! complex(kind=8),dimension(:,:,:,:), allocatable, save, target :: mem3d
  ! complex(kind=8), dimension(:,:,:), pointer :: ptr1, ptr2
  integer :: i,j,k,ierr
  ierr = 0
  ! ---------------------------
  ! allocate dynamic memory and set pointer
  ! ---------------------------
  write(*,*) 'allocate memory'
  allocate(mem3d(msize(1),msize(2),msize(3),2), stat=ierr)
  if(ierr.ne.0) write(*,*) 'cannot allocate mem for mem3d'
  mem3d(:,:,:,:) = 0.d0
  do mem_id=1, 2
    call set_ptr( mem3d(1,1,1,mem_id), mem_id)
  enddo
  ! ---------------------------
  ! print infos
  ! ---------------------------
  write(*,*) 'ptr1-dims: '
  write(*,*) ' lbound: ', lbound(ptr1,1), lbound(ptr1,2), lbound(ptr1,3)
  write(*,*) ' ubound: ', ubound(ptr1,1), ubound(ptr1,2), ubound(ptr1,3)
  write(*,*) ' asize: ', size(ptr1,1), size(ptr1,2), size(ptr1,3)
  write(*,*) 'ptr2-dims: '
  write(*,*) ' lbound: ', lbound(ptr2,1), lbound(ptr2,2), lbound(ptr2,3)
  write(*,*) ' ubound: ', ubound(ptr2,1), ubound(ptr2,2), ubound(ptr2,3)
  write(*,*) ' asize: ', size(ptr2,1), size(ptr2,2), size(ptr2,3)
  ! ---------------------------
  ! tests
  ! ---------------------------
  write(*,*) 'test 1 (fill ptr1)'
  ptr1(:,:,:) = 1
  write(*,*) 'test 2 (fill ptr2)'
  ptr2(:,:,:) = 2
  write(*,*) 'test 3 (copy data using do-loops)'
  do k=1,msize(3)
    do j=1,msize(2)
      do i=1,msize(1)
        ptr1(i,j,k) = ptr2(i,j,k)
      enddo
    enddo
  enddo
  write(*,*) 'test 4 (copy data using intrinsic function)'
  ptr1(:,:,:) = ptr2(:,:,:)
  stop
contains
  !========================================
  !
  ! assign pointer to allocated dynamic 3d memory
  !
  !========================================
  subroutine set_ptr(ref_mem3d, mem_id)
    implicit none
    ! function args
    real(kind=8),dimension( msize(1),msize(2),msize(3)), target :: ref_mem3d
    ! complex(kind=8), dimension( msize(1),msize(2),msize(3)), target :: ref_mem3d
    integer, intent(in) :: mem_id
    if(mem_id .eq. 1) then
      ptr1 => ref_mem3d
    else if(mem_id .eq. 2) then
      ptr2 => ref_mem3d
    endif
  end subroutine set_ptr
end
The compiler uses stack temporaries to accomplish the assignment of the form:
ptr1(:,:,:) = ptr2(:,:,:)
The program runs successfully when compiled with -heap-arrays, or when the shell stack limit is raised:
For Bash/sh/ksh use: ulimit -s unlimited
For Csh use: limit stacksize unlimited
I do not believe this represents a bug with the compiler, but if it does I will post again.
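For anyone who would rather sidestep the temporary in the source than change compiler or shell settings, here is a minimal sketch. It assumes ptr1 and ptr2 never overlap (true in the test program, where they point at different slabs of mem3d). The module name copy_helpers and the routine copy_3d are made up for illustration and are not from the thread. Because dst and src are ordinary dummy arguments without POINTER or TARGET, the compiler is allowed to assume they do not alias, and the explicit loops mirror test 3, which ran without a segfault. Whether a given ifort version still creates a temporary here has not been checked.

```fortran
module copy_helpers
  implicit none
  private
  public :: copy_3d
contains
  ! Hypothetical helper (not from the thread).
  ! dst and src carry neither POINTER nor TARGET, so the standard
  ! lets the compiler assume that they do not overlap.
  ! The caller must pass two arrays of the same shape.
  subroutine copy_3d(dst, src)
    real(kind=8), intent(out) :: dst(:,:,:)
    real(kind=8), intent(in)  :: src(:,:,:)
    integer :: i, j, k
    ! Element-by-element copy, innermost loop over the first index
    do k = 1, size(src, 3)
      do j = 1, size(src, 2)
        do i = 1, size(src, 1)
          dst(i,j,k) = src(i,j,k)
        end do
      end do
    end do
  end subroutine copy_3d
end module copy_helpers
```

In the test program this would replace the failing statement with "call copy_3d(ptr1, ptr2)" after adding "use copy_helpers" at the top.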
I will agree that the language's lack of a way to declare an array of arrays (of any kind) can create awkwardness, though for some code the new F2003 ASSOCIATE construct may provide a way to "neaten up" the source code. I don't see that you have proposed anything that will help there.
An array that can be contiguous or null is spelled ALLOCATABLE. I have not seen anything in this thread that requires POINTER, nor do I see value in enhancing POINTER along these lines. You would have to create a new keyword and rules for when it was permissible or not permissible to pass a "contiguous" pointer to an ordinary pointer.
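The ALLOCATABLE alternative could look roughly like this. It is only a sketch: the names field_storage, field1, field2, init_fields and copy_fields are invented here, and it gives up the single mem3d block in favour of two separate arrays. Since two distinct allocatable arrays without TARGET cannot overlap, the compiler has no aliasing reason to buffer the copy. Whether a particular compiler version then actually skips the temporary is an assumption, not something shown in this thread.

```fortran
module field_storage
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  ! Two separate allocatable arrays instead of two pointers into mem3d
  ! (hypothetical names, not from the thread)
  real(dp), allocatable, save :: field1(:,:,:), field2(:,:,:)
contains
  ! Allocate both arrays with the same shape and fill them with 1 and 2
  subroutine init_fields(n1, n2, n3, ierr)
    integer, intent(in)  :: n1, n2, n3
    integer, intent(out) :: ierr
    if (allocated(field1)) deallocate(field1)
    if (allocated(field2)) deallocate(field2)
    allocate(field1(n1,n2,n3), field2(n1,n2,n3), stat=ierr)
    if (ierr /= 0) return
    field1(:,:,:) = 1.0_dp
    field2(:,:,:) = 2.0_dp
  end subroutine init_fields

  ! Same statement as test 4, but on allocatable arrays
  subroutine copy_fields()
    field1(:,:,:) = field2(:,:,:)
  end subroutine copy_fields
end module field_storage
```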
The original post was using POINTERS.
>>
The attached program does the following:
- allocate one big chunk of memory: "allocate(mem3d(msize(1),msize(2),msize(3),2), stat=ierr)"
- associate two 3d pointers with parts of the allocated memory:
ptr1(msize(1),msize(2),msize(3)) => mem3d(msize(1),msize(2),msize(3),1)
ptr2(msize(1),msize(2),msize(3)) => mem3d(msize(1),msize(2),msize(3),2)
- fill ptr1 and ptr2 with some data (no segfault):
ptr1(:,:,:) = 1
ptr2(:,:,:) = 2
- loop over i,j,k (no segfault):
ptr1(i,j,k) = ptr2(i,j,k)
- use intrinsic array assignment to copy data (SEGFAULT):
ptr1(:,:,:) = ptr2(:,:,:)
<<
Personally, I think the user should "bite the bullet" and make a user-defined type containing an allocatable array (with TARGET), then point his pointers at that/those.
Jim Dempsey
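A rough sketch of that layout, with invented names (block_storage, block3d, blocks, alloc_blocks, block_ptr). Note that TARGET cannot be put on a component, so it goes on the variable of the derived type; its allocatable components can then be pointed at. This only shows the structure. The thread does not establish that it removes the temporary from ptr1(:,:,:) = ptr2(:,:,:).

```fortran
module block_storage
  implicit none
  integer, parameter :: dp = kind(1.0d0)

  ! Hypothetical container type: one allocatable 3d array per block
  type :: block3d
    real(dp), allocatable :: a(:,:,:)
  end type block3d

  ! TARGET on the variable makes blocks(m)%a a valid pointer target
  type(block3d), target, save :: blocks(2)
contains
  ! Allocate every block with the same shape and zero it
  subroutine alloc_blocks(n1, n2, n3, ierr)
    integer, intent(in)  :: n1, n2, n3
    integer, intent(out) :: ierr
    integer :: m
    ierr = 0
    do m = 1, size(blocks)
      if (allocated(blocks(m)%a)) deallocate(blocks(m)%a)
      allocate(blocks(m)%a(n1,n2,n3), stat=ierr)
      if (ierr /= 0) return
      blocks(m)%a = 0.0_dp
    end do
  end subroutine alloc_blocks

  ! Return a pointer to block m (m must be 1 or 2, already allocated)
  function block_ptr(m) result(p)
    integer, intent(in) :: m
    real(dp), pointer :: p(:,:,:)
    p => blocks(m)%a
  end function block_ptr
end module block_storage
```

The pointers from the original program would then be set with "ptr1 => block_ptr(1)" and "ptr2 => block_ptr(2)" instead of going through set_ptr.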
Yes, I can see that the ptr1(:,:,:) = ptr2(:,:,:) assignment would create a temp which is causing the segfault or stack overflow. It may or may not be better if written ptr1=ptr2. A complex run-time check for lack of overlap (taking strides into account) would be needed to avoid the temp.
Steve
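To see why the compiler cannot simply drop the temporary, here is a small sketch (the names overlap_demo, shift_array_syntax and shift_naive_loop are invented) in which the two pointers really do overlap. Array assignment has to behave as if the whole right-hand side were evaluated before anything is stored, so it gives a different answer from an element-by-element loop.

```fortran
module overlap_demo
  implicit none
contains
  ! Shift the contents of a up by one element using array syntax.
  ! dst and src overlap, so the right-hand side must be read in full
  ! before any element is stored (this is where a temporary comes in).
  subroutine shift_array_syntax(a)
    integer, target, intent(inout) :: a(:)
    integer, pointer :: dst(:), src(:)
    integer :: n
    n = size(a)
    dst => a(2:n)
    src => a(1:n-1)
    dst(:) = src(:)
  end subroutine shift_array_syntax

  ! Same pointers, but copied one element at a time in ascending order.
  ! Each store is visible to the next read, so the first value smears
  ! across the whole array.
  subroutine shift_naive_loop(a)
    integer, target, intent(inout) :: a(:)
    integer, pointer :: dst(:), src(:)
    integer :: i, n
    n = size(a)
    dst => a(2:n)
    src => a(1:n-1)
    do i = 1, n - 1
      dst(i) = src(i)
    end do
  end subroutine shift_naive_loop
end module overlap_demo
```

Starting from (/1,2,3,4,5/), shift_array_syntax leaves (/1,1,2,3,4/) while shift_naive_loop leaves (/1,1,1,1,1/). In the original program the two pointers never overlap, so both forms agree there, but the compiler would have to prove that (or check it at run time) before skipping the temporary.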
Steve,
And my argument is that the complex run-time check can be eliminated if there were an attribute or attributes the programmer could place on the pointer that specify a) that the memory pointed to will always be contiguous and, optionally, b) that the memory pointed to will never overlap. An alternative would be to add a compiler directive that accomplishes the same thing as the proposed pointer attributes, placed either at the pointer declaration or at the appropriate code statement. You could also add a run-time diagnostic option, similar to index-out-of-bounds tests, that asserts the requirements of the attribute upon execution of => pointer assignments.
I am merely making suggestions for optimization techniques that would give the IVF compiler an advantage over competitors' products.
Jim
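Part (a) of this later became standard: Fortran 2008, which is newer than the compilers discussed in this thread, added a CONTIGUOUS attribute for pointers and an IS_CONTIGUOUS intrinsic. A minimal sketch, assuming a compiler with those Fortran 2008 features; the names contiguous_demo, bind_pointers and layout_ok are invented. It does not cover part (b), the no-overlap promise.

```fortran
module contiguous_demo
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  real(dp), allocatable, target, save :: mem3d(:,:,:,:)
  ! CONTIGUOUS (Fortran 2008) promises that these pointers are only
  ! ever associated with contiguous memory
  real(dp), pointer, contiguous :: ptr1(:,:,:) => null()
  real(dp), pointer, contiguous :: ptr2(:,:,:) => null()
contains
  ! Allocate one block and point ptr1/ptr2 at its two 3d slabs.
  ! Fixing only the last index keeps each slab contiguous.
  subroutine bind_pointers(n1, n2, n3, ierr)
    integer, intent(in)  :: n1, n2, n3
    integer, intent(out) :: ierr
    if (allocated(mem3d)) deallocate(mem3d)
    allocate(mem3d(n1,n2,n3,2), stat=ierr)
    if (ierr /= 0) return
    mem3d = 0.0_dp
    ptr1 => mem3d(:,:,:,1)
    ptr2 => mem3d(:,:,:,2)
  end function_placeholder_guard: block
  end block function_placeholder_guard
  end subroutine bind_pointers

  ! Run-time diagnostic in the spirit of the suggestion above:
  ! true only if both pointers are associated and contiguous
  logical function layout_ok()
    layout_ok = associated(ptr1) .and. associated(ptr2)
    if (layout_ok) layout_ok = is_contiguous(ptr1) .and. is_contiguous(ptr2)
  end function layout_ok
end module contiguous_demo
```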
Well, there is a -fno-alias switch which might help. I see your point.
I thought that switch was on by default in Fortran, and that it asserts that code complies with the Fortran standard where subroutine arguments don't alias each other.
No, it isn't on by default. The one you're thinking of is -assume dummy_aliases which is indeed off by default.
-fno-alias has to do with pointers. It's really a C-ism, but I have seen it have some effect on Fortran.
I directed the test case and this discussion to our High-level optimization development team for their analysis and opinion on removing the stack temp creation and use. I will follow-up as I learn more. (Internal ref. CQ-50310)
Will there be any more comments from the high-level optimization development team on this? There were a lot of comments on this topic, but all of them are just workarounds and no real fix.
I am not sure I would classify this as a bug with the Intel compiler. Yes, some discussions suggest ways the compiler might avoid using array temps but nothing suggests the compiler was wrong to do so given the potential for overlap and the difficulty in determining such when using POINTER.
While you note that xlf works, we do not know if it also uses array temps for the context in question. It, like other compilers, may default to placing array temps on the heap which avoids exhausting the stack and the ensuing segmentation fault, similar to how ifort avoids this when the -heap-arrays option is used.
Again, I will pass along any updates.
