- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Hi,
i get a segfault with ifort 13.0.1 when running the following program
program tmp
implicit none
integer, parameter:: n=5*10**6
real, pointer:: a(:),b(:)
allocate(a(n))
allocate(b(n))
a=b
end program
The cause seems to be the large size of the arrays. It does not happen for n=10**6.
Regards
Anton
Link Copied
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
You very likely have reach the limit of your stack size. Please set this to unlimited: ulimit -s unlimited.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Using "ulimit -s unlimited" solves the problem.
Can this setting have unexpected drawbacks?
And also, are pointer arrays allocated on the stack? I don't observe this problem with allocatable arrays. So they are allocated on the heap, correct?
Many thanks
Anton
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
No, the issue is that a temp had to be created because we can't reliably know that the target of the two pointers do not overlap.
With actual allocatables, we do know that they do not overlap, and so do not need the temp.
By default, however, temps are put on the stack, although this can be changed using the -heap-arrays command line switch.
--Lorri
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Thanks for the explantion. For very large arrays the creation of a temp array could be undersirable. Is there a way to prevent this? For example using a loop instead of whole array asignment?
Regards
Anton
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Use ALLOCATABLE instead of POINTER unless you will be using pointer assignment. With ALLOCATABLE the compiler should not use a temp.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Lorri, Steve,
In this case (a=b where a and b are pointers and memory is stride 1), there is no reason for a temp to be created. When both pointer's strides are 1 call memmove (which could call one of two flavors of __intel_fast_memcpy: front-to-back and back-to-front).
When the pointers are not stride 1, you could fall back to making an extra copy, but I would rather see more work performed to see if the extra copy is absolutely required. IOW only when overlap .AND. when overlap cannot be handled by choice of front-to-back or back-to-front. Such a case might be where you have an overlap and different strides.
Your compiler optimization team goes to extremes to get the most efficient vectorization, this issue would be relatively trivial to resolve. Considering that the current method performs a memory R+W+R+W as opposed to R+W, it would appear to be "low hanging fruit" as an optimization candidate.
Jim Dempsey
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Jim, in most cases use of ALLOCATABLE arrays is the better solution. The complex code needed to do the additional checking and various choices would tend to inhibit optimizations. Fortran 2008 adds the CONTIGUOUS attribute which could be used by the compiler to eliminate the temp. As it happens, the current version doesn't do that but a future version will. The programmer should choose the appropriate language syntax.
- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page