Hi,
I have a Fortran dll which is called from a Java program and am encountering an EXCEPTION_STACK_OVERFLOW condition when my arrays are 'large' (size = 80000). The problem occurs when I use array pointers in code like:
REAL, DIMENSION(:), POINTER :: a,b,c
a = b+c
However, the problem does not occur if I write an explicit DO-loop or use an allocatable array for temporary storage, with code like:
REAL, DIMENSION(:), POINTER :: a,b,c
REAL, DIMENSION(:), ALLOCATABLE :: tem
n = SIZE(a); ALLOCATE(tem(n))
tem = b+c; a=tem; DEALLOCATE(tem)
I can also work around the problem by increasing the stack size of java.exe using editbin, but that is not a general solution for this application.
Is there a general issue with the stack and pointers in array statements? And is there any way around it other than rewriting my code?
Doug Henn
2 Replies
Yes, there is a general issue with pointers here: without aliasing analysis, the compiler doesn't know whether a, b, and c point to overlapping sections of the same array or whether they are independent. The standard specifies that array assignment proceeds as if a temporary copy is made of the expression on the right-hand side of the equals sign and this copy is then assigned to whatever the pointer on the left-hand side points to, so absent this potentially complicated aliasing analysis the compiler really has to make a copy here. That temporary is presumably placed on the stack, which would explain the overflow once the arrays get large. The DO-loop solution can also be somewhat inefficient, because the compiler can't unroll the loop and carry out some of the assignments in parallel; it must generate completely sequential code. The most efficient thing you could do is to write an internal subroutine:
subroutine addem(a,b,c,n)
   integer, intent(in) :: n
   real, intent(in) :: b(n), c(n)
   real, intent(out) :: a(n)
   a = b+c
end subroutine addem
Since the arrays in the subroutine are not pointers or targets, the compiler knows that they aren't aliased, and since they aren't assumed-shape arrays it knows their stride is unity, so it will generate the most efficient code it can to add the arrays. I think the most recent versions of CVF won't make copies when the subroutine is invoked via call addem(a,b,c,size(a)), provided all the pointers actually point to contiguous arrays. If that is not the case, or if your arrays aren't always contiguous, you could make the dummy arguments assumed shape and omit the fourth argument, which specifies the size of the arrays. This would be a little less efficient, because the compiler doesn't know the stride at compile time: it would have to compute the index of each array element separately and couldn't do useful prefetching. But it would never make copies.
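For illustration, the assumed-shape variant described above might look like the following minimal sketch. The program name, the arrays x, y and z, and the strided sections are made up for the example and are not taken from the original code. It also assumes the caller guarantees that a does not overlap b or c, since the subroutine is compiled as if its arguments are not aliased.

```fortran
program addem_assumed_shape_demo
   implicit none
   ! Hypothetical data, for illustration only
   real, target  :: x(6), y(6), z(6)
   real, pointer :: a(:), b(:), c(:)

   x = 0.0
   y = 1.0
   z = 2.0

   ! Non-contiguous sections: every second element of each array
   a => x(1:6:2)
   b => y(1:6:2)
   c => z(1:6:2)

   ! No size argument: the extents travel with the arguments
   call addem(a, b, c)

   ! Elements 1, 3 and 5 of x are now 3.0; elements 2, 4 and 6 are still 0.0
   print *, x

contains

   ! Assumed-shape dummies are neither pointers nor targets, so the
   ! compiler may treat a, b and c as non-overlapping inside the routine.
   subroutine addem(a, b, c)
      real, intent(in)  :: b(:), c(:)
      real, intent(out) :: a(:)
      a = b + c
   end subroutine addem

end program addem_assumed_shape_demo
```

Because addem is an internal subroutine here, its interface is explicit, which assumed-shape dummy arguments require. Placing it in a module would work equally well.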
Thanks for the suggestion. This fixed the stack problem, and timing tests indicate it's around 40% faster than my original code.