Hi there,
A small test program seems to segfault on the following line:
angles = pi*real( [(i,i=0,n-1)]/n, kind=dp)
for n larger than roughly 2000000.
ulimit is set to unlimited. I see this with ifort version 11.0 20081105 on Fedora 10 and with ifort version 10.1 20080312 on openSUSE 10.2. I've also tried using big integers, but it still segfaults. Replacing the above line with a 'classic loop' removes the segfault.
Before I open an issue about this segfault at premier.intel.com, I would like to know whether I'm doing something incredibly stupid here.
Greetings,
Wim
$ cat fortran_sin.f90
program fortran_sin
implicit none
integer,parameter :: dp = selected_real_kind(p=13,r=300)
integer,parameter :: n = 10000000
integer :: i
real(dp) , parameter :: pi = 3.1415926535897932385_dp
real(dp) :: angles(n),sins(n)
angles = pi*real( [(i,i=0,n-1)]/n , kind=dp ) !this line gives segfaults for big values of n
! do i=0,n-1
! angles(i+1) = pi*real(i/n,kind=dp)
! end do
sins = sin(angles)
end program fortran_sin
The problem is that you are generating a HUGE array temporary for the expression on the right-hand side. By default, this will be built on the stack. You can use
-heap-arrays
to get this to use the heap instead of the stack and avoid the issue. But it is informative to know that you are creating a huge array temporary, and perhaps this should be avoided.
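One way to avoid the temporary altogether is an element-by-element loop, as in the 'classic loop' from the original post. The following is a minimal sketch, not the poster's exact code: the program name `angles_loop` is made up, the arrays are made allocatable so that they live on the heap, and the division is done in real arithmetic, which is an assumption about what the original expression was meant to compute.

```fortran
program angles_loop
  implicit none
  integer, parameter :: dp = selected_real_kind(p=13, r=300)
  integer, parameter :: n = 10000000
  real(dp), parameter :: pi = 3.1415926535897932385_dp
  real(dp), allocatable :: angles(:), sins(:)
  integer :: i

  ! Allocatable arrays are placed on the heap, independent of the stack limit.
  allocate(angles(n), sins(n))

  ! Each element is computed and stored directly, so no array-sized
  ! temporary is needed for the right-hand side.
  do i = 0, n - 1
    ! Assumption: real division is intended (i/n in integer arithmetic is 0).
    angles(i+1) = pi * real(i, kind=dp) / real(n, kind=dp)
    sins(i+1) = sin(angles(i+1))
  end do

  ! Sanity checks: first angle is 0, midpoint angle is pi/2.
  if (angles(1) /= 0.0_dp) error stop "unexpected first angle"
  if (abs(sins(n/2 + 1) - 1.0_dp) > 1.0e-12_dp) error stop "unexpected midpoint sine"
  print *, "last angle:", angles(n)
end program angles_loop
```

Whether this is faster or slower than the array expression depends on the compiler and the optimization level; the sketch only shows that the large temporary is not required.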
Ah thanks, I knew I overlooked something simple.
It would indeed be informative to know when a program is creating a large array on the stack.
Thanks,
Wim
-check arg_temp_created
This one is useful since in many cases these argument temporaries can be removed by using proper INTERFACEs for the procedures. (-gen-interfaces is a really powerful feature that not enough users take advantage of.)
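To illustrate the point about interfaces, here is a small sketch under stated assumptions. The names `scale_mod` and `scale_in_place` are hypothetical. The routine takes an assumed-shape dummy argument and lives in a module, so the caller sees an explicit interface and the compiler can pass a descriptor for a strided array section rather than copying it into a contiguous temporary. Whether a given compiler actually avoids the copy is an implementation detail, so treat this as an illustration rather than a guarantee.

```fortran
module scale_mod
  implicit none
contains
  ! Assumed-shape dummy argument x(:). Because the routine is in a module,
  ! callers get an explicit interface automatically.
  subroutine scale_in_place(x, factor)
    real, intent(inout) :: x(:)
    real, intent(in)    :: factor
    x = factor * x
  end subroutine scale_in_place
end module scale_mod

program arg_temp_demo
  use scale_mod, only: scale_in_place
  implicit none
  real :: a(10)
  integer :: i

  a = [(real(i), i = 1, 10)]

  ! Non-contiguous actual argument: every second element of a.
  call scale_in_place(a(1:10:2), 2.0)

  ! Odd-indexed elements are doubled, even-indexed elements are untouched.
  if (any(a(1:10:2) /= [2.0, 6.0, 10.0, 14.0, 18.0])) error stop "odd elements not scaled"
  if (any(a(2:10:2) /= [2.0, 4.0, 6.0, 8.0, 10.0])) error stop "even elements changed"
  print *, a
end program arg_temp_demo
```

With an explicit-shape dummy such as `x(n)` and no visible interface, the same call would typically force a copy-in/copy-out temporary, which is the situation the run-time check reports.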
In general, there are cases where the compiler is fully justified in using temporaries: it is often the most efficient and straightforward way to perform the operation. For example, in your expression:
angles = pi*real( [(i,i=0,n-1)]/n , kind=dp )
One way of doing this without the temp is to take i=0, divide by n, convert to real, multiply by pi, and store to angles(1). Then do the same for i=1, then i=2, and so on. This is not a very efficient way to implement this expression. Modern processors are highly pipelined, and this sequence of operations does not take advantage of memory streaming or of pipelining in the FP hardware.
Now imagine you create the entire array of I values. You stream that through the FP units to do the DIV operation, then through the REAL conversion, then through the pi* operation, and as the pi*X results come out, you stream them to consecutive memory for ANGLES.
So array temporaries do have their place, and are often exactly what you want.
We provide -heap-arrays for cases where the user data is quite large and will not fit on the stack. The stack is nice to use, since memory management is fast and efficient (push/pop; nothing is simpler or more efficient than that). With the heap there is a little more overhead, as the runtime must manage the heap space to prevent fragmentation and exhaustion. We often debate whether it would be better to default to heap temporaries (as many other compilers do), so that users no longer see the error you and others encounter with large arrays, and to provide something like "-stack-arrays" as a non-default option for performance-critical applications. In general, the Intel philosophy is to default to speed and efficiency, and this is an ideal example of that design philosophy.
Again, we do have the request to consider an option to warn the user on all temporary creation, and we continue to debate the merits of heap temps as default. This is an interesting topic.
ron
Informative post, Ron.
While you are making improvements to array temporaries, I would like to see a /warn:array_temporaries option so that I can get a compile-time report. I find the run-time report ineffectual for program development. The run-time report may be suitable during profiling, but a compile-time report could nip the problem in the bud.
Jim Dempsey
angles = pi*real( [(i,i=0,n-1)]/n , kind=dp )
the code I would like to see generated would avoid both extremes: it would neither drop the array temporary entirely nor use a full-length array temporary like the code currently being generated. I would like to see it generate code roughly equivalent to
istep=1024 ! the optimal size of this step may be subject to debate
do istart=0,n-1,istep
istop=min(istart+istep,n)-1
angles(istart+1:istop+1) = pi*real( [(i,i=istart,istop)]/n , kind=dp )
end do
This would limit the size of the array temporary to something much less likely to cause a segfault or overflow the stack, while getting most of the benefit of using the highly-pipelined processor. This is the kind of loop "chunking" that is done for processors with array registers, so there's plenty of literature on doing this kind of code generation, but I have no idea whether the current Intel development team has any experience in this area.
-Kurt
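The chunking idea above can be written by hand as a complete program. This is only a sketch of what such source-level chunking might look like, not what any compiler generates. The program name `chunked_angles` is made up, the chunk size of 1024 is taken from the post, and the division is done in real arithmetic, which is an assumption about the intended result.

```fortran
program chunked_angles
  implicit none
  integer, parameter :: dp = selected_real_kind(p=13, r=300)
  integer, parameter :: n = 10000000
  integer, parameter :: istep = 1024   ! chunk size; the best value is debatable
  real(dp), parameter :: pi = 3.1415926535897932385_dp
  real(dp), allocatable :: angles(:)
  integer :: i, istart, istop

  allocate(angles(n))

  do istart = 0, n - 1, istep
    ! The last chunk may be shorter than istep.
    istop = min(istart + istep, n) - 1
    ! Any temporary for this right-hand side holds at most istep elements.
    ! Assumption: real division is intended, so convert before dividing.
    angles(istart+1:istop+1) = &
        pi * real([(i, i = istart, istop)], kind=dp) / real(n, kind=dp)
  end do

  ! Sanity checks on the first and last elements.
  if (angles(1) /= 0.0_dp) error stop "unexpected first angle"
  if (abs(angles(n) - pi * real(n - 1, kind=dp) / real(n, kind=dp)) > 1.0e-12_dp) &
      error stop "unexpected last angle"
  print *, "last angle:", angles(n)
end program chunked_angles
```

The temporary is bounded by the chunk size regardless of n, so the stack limit no longer depends on the problem size.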
angles = pi*real( [(i,i=0,n-1)]/n, kind=dp)
Simpler is:
angles = 0
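The remark above presumably refers to integer division: in the posted expression, `i/n` is evaluated in integer arithmetic before the conversion to real, so it truncates to 0 for every `0 <= i < n`. The sketch below uses a small `n` to show this, next to a variant that converts first and then divides. The program name `integer_division_demo` and the array names are made up, and the second form is an assumption about what was intended.

```fortran
program integer_division_demo
  implicit none
  integer, parameter :: dp = selected_real_kind(p=13, r=300)
  integer, parameter :: n = 8
  real(dp), parameter :: pi = 3.1415926535897932385_dp
  real(dp) :: as_posted(n), real_div(n)
  integer :: i

  ! As posted: integer division happens first and truncates to 0.
  as_posted = pi * real([(i, i = 0, n - 1)] / n, kind=dp)

  ! Assumed intent: convert to real first, then divide.
  real_div = pi * real([(i, i = 0, n - 1)], kind=dp) / real(n, kind=dp)

  ! Every element of the posted form is zero.
  if (any(as_posted /= 0.0_dp)) error stop "expected all zeros"
  ! Element n/2+1 corresponds to i = n/2, i.e. an angle of pi/2.
  if (abs(real_div(n/2 + 1) - pi / 2) > 1.0e-12_dp) error stop "expected pi/2"

  print *, "as posted:    ", as_posted
  print *, "real division:", real_div
end program integer_division_demo
```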