I am writing a Monte Carlo random walk simulation, and when I run it with a large number of particles (300,000), I get a stack overflow error. I am confused as to what that error may signify.
Here are the gory details:
The error occurs in the routine execute_simulation in the following module (I include the module header), during the first iteration of the loop in that routine. The routine execute_simulation is called from the main program.
module simulation_class
  use misc_math_utilities
  use particle_cloud_class
  implicit none
  type simulation
    integer :: file_unit
    REAL :: DT=1e-6
    integer :: cMax_part=10,cMax_steps=10,iStep,seed_arr(2)=(/1,1/)
    type (Particle_cloud) ::oParticle_cloud
    real, dimension(3)::vRinit,vVinit
    integer cPart
    logical fPrint_av_KE
  end type simulation
  interface create_obj
    module procedure create_simulation_obj
  end interface
  interface print_obj_def
    module procedure print_simulation_obj_def
  end interface
  ... other routines omitted ...
  subroutine execute_simulation(self)
    type (simulation) self
    integer step
    do step=1,self%cMax_steps
      call take_step(self%oParticle_cloud)
      call print_stats(self%oParticle_cloud,step)
    end do
  end subroutine execute_simulation
end module
The take_step routine is in the following module
MODULE particle_cloud_class
  use misc_math_utilities
  type particle_cloud
    integer Count
    real,allocatable::mStorage(:,:)
  end type particle_cloud
  integer iPart ! used for looping through particles
  integer::viCoords(3)=(/1,2,3/),viVel(3)=(/4,5,6/)
  interface create_obj
    module procedure create_particle_cloud_obj
  end interface
  interface print_obj_def
    module procedure print_particle_cloud_def
  end interface
  interface print_stats
    module procedure print_R_stats
  end interface
  ... stuff omitted
  subroutine take_step(self)
    type (particle_cloud)::self
    real::temp(3)
    self%mStorage(viCoords,:)=self%mStorage(viCoords,:)+self%mStorage(viVel,:)
    do iPart=1,self%Count
      temp=rn_point_on_sphere()
      self%mStorage(viVel,iPart)=temp
    end do
  end subroutine take_step
where rn_point_on_sphere() is a function that returns a 3-element vector of a random point on a sphere. I have tested that generator for up to 10^8 calls, and it seems to work ok.
I am not sure how to go about debugging this error. Thanks for any pointers and suggestions.
Mirko
Mirko,
Perhaps the compiler is creating an unnecessary temporary array to perform the addition in the integration step.
Try replacing the array assignment with an equivalent loop:
subroutine take_step(self)
  type (particle_cloud)::self
  real::temp(3)
  do iPart=1,self%Count
    self%mStorage(viCoords,iPart)=self%mStorage(viCoords,iPart)+self%mStorage(viVel,iPart)
  end do
  do iPart=1,self%Count
    temp=rn_point_on_sphere()
    self%mStorage(viVel,iPart)=temp
  end do
end subroutine take_step
Jim Dempsey
Additional Information:
If the loop replacement from the prior post corrects the stack overflow problem, then at some point in the future you may want to consider optimizing the Euler integration. Create a subroutine that redefines the array of XYZ vectors as a rank-1 array of reals, then perform the array increment on that rank-1 array. The compiler can then optimize the code for the size of a real and, if you choose, apply SSE3 optimizations.
Jim
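A minimal sketch of the rank-1 idea described above, under an assumption that differs from the original post: positions and velocities are kept in two separate 3 x N arrays rather than in the combined mStorage array. The names euler_demo, euler_step_flat, pos and vel are hypothetical. The sketch relies on Fortran sequence association, where a contiguous rank-2 actual argument may be passed to an explicit-shape rank-1 dummy, so the update becomes one flat loop over 3*N reals.

```fortran
module euler_demo
  implicit none
contains
  ! Add v to r element by element, treating both as flat rank-1 arrays.
  ! (Hypothetical helper, not part of the original code.)
  subroutine euler_step_flat(n, r, v)
    integer, intent(in)    :: n
    real,    intent(inout) :: r(n)
    real,    intent(in)    :: v(n)
    integer :: i
    do i = 1, n
      r(i) = r(i) + v(i)
    end do
  end subroutine euler_step_flat
end module euler_demo

program demo
  use euler_demo
  implicit none
  integer, parameter :: n_part = 4
  real, allocatable :: pos(:,:), vel(:,:)
  allocate(pos(3, n_part), vel(3, n_part))
  pos = 0.0
  vel = 0.5
  ! Whole contiguous arrays are passed; the dummies see 3*n_part reals.
  call euler_step_flat(size(pos), pos, vel)
  print *, pos(:, 1)
end program demo
```

Because whole allocatable arrays are contiguous, no array section is involved in the call, which is the situation where a compiler has no reason to build a temporary.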
I will play a bit with setting the stack size, and the suggestions of the two other posters.
Mirko
Your suggested modification did not help.
I should point out that the error is reported where the take_step routine is called.
The caller routine looks like this:
subroutine execute_simulation(self)
  type (simulation) self
  integer step
  do step=1,self%cMax_steps
    call take_step(self%oParticle_cloud)
    call print_stats(self%oParticle_cloud,step)
  end do
end subroutine execute_simulation
and the run-time stack overflow error points to the highlighted line, the call to take_step.
I have already encountered a similar problem in the same code. There was an error in a routine that calculates a random number (sqrt of a negative number). However, the run-time error pointed to the statement that calls this routine. I have sent the relevant code to Intel support.
Mirko
MADsblionel: I mentioned earlier that the compiler would soon have an option to allocate temporaries on the heap. The good news is that it does, as of 9.1.029. The bad news is that support for the new switch, /heap-arrays as described in the release notes (which none of you read, apparently...), was inadvertently left out of the command driver in this release. It will get fixed for the next one. In the meantime, you can add this to the command line options:
-Qoption,f,"-heap_arrays 0"
and it will enable the feature.
I tried both options: increased the stack size, and separately invoked the switch. Both worked. But that should not be news to you -- you expected it to work :-)
Thank you
Mirko
So the runtime _complaint_ went away but the underlying problem did not.
The problem is likely that unnecessary temporaries are being created. These not only consume stack space but also create excessive call overhead.
It would be nice if the compiler had an option to issue an information message when it creates a temporary array. The runtime system can report a warning on some calls. IMHO the warning needs to be issued at compile time too.
Jim Dempsey
"It would be nice if the compiler had an option to issue an information message when it creates a temporary array."
-check:arg_temp_created has been discussed in the Fortran forum, along with invitations to submit Premier reports of cases where the compiler should recognize the temporary is unnecessary. I've submitted several such, after checking the effect on performance. It's not always evident whether the temporary improves or degrades performance. Real cases from more customers ought to raise the priority on this.
Also, there are cases where the temporary is loop invariant, so it could be created and destroyed outside an inner loop, likely making it beneficial for performance.
Coding changes to recognize unnecessary temps may be a substantial effort. And the payback (to Intel) might not be worth the effort.
Adding a "-check:arg_temp_created" would be relatively easy to do. And the payback to an individual customer might be significant.
There are not only performance issues but some coding issues as well. Assume a temporary is created (when it need not be) and the address is passed on to a function or subroutine. Further assume the temporary was derived from an array which is shared in OpenMP. In this case you have the opportunity to have multiple and different instances of the same data.
It would be nice to have a compiler warning so I could be informed if temporaries are created. This would save me a lot of debugging effort.
Jim Dempsey
Eliminating temps in assignments is a big performance win, and it is something we are constantly improving. The analysis can sometimes be tricky.
It's nice to hear you're trying to eliminate unnecessary temps. I've given up using array sectoring on anything except arrays with known and small dimensions because of the stack overflows. Allocation on the heap rather than the stack won't help me: since I do what I can to optimize my own usage of the heap, that will just convert my stack overflows into stack-heap collisions. I've been writing utility subs as necessary to manipulate array sectors as whole arrays when possible and indexing them myself when not. I haven't had any problems since I began doing so.
Bruce
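One possible reading of the "utility subs" approach, as a hedged sketch rather than Bruce's actual code: a small routine that copies part of a column with an explicit loop instead of an array-section assignment. The names section_utils and copy_column_range are hypothetical, and the arrays in the driver program are made up for illustration.

```fortran
module section_utils
  implicit none
contains
  ! Copy b(kf:kl, indx) into a(1:kl-kf+1) one element at a time,
  ! so no array-section expression appears in the source.
  ! (Hypothetical helper for illustration only.)
  subroutine copy_column_range(a, b, kf, kl, indx)
    real,    intent(inout) :: a(:)
    real,    intent(in)    :: b(:,:)
    integer, intent(in)    :: kf, kl, indx
    integer :: k
    do k = kf, kl
      a(k - kf + 1) = b(k, indx)
    end do
  end subroutine copy_column_range
end module section_utils

program demo_copy
  use section_utils
  implicit none
  real :: b(5, 2), a(3)
  integer :: i, j
  ! Fill b so that b(i, j) = 10*j + i.
  do j = 1, 2
    do i = 1, 5
      b(i, j) = real(10*j + i)
    end do
  end do
  a = 0.0
  ! Equivalent in effect to a(:) = b(2:4, 2), written as a loop.
  call copy_column_range(a, b, 2, 4, 2)
  print *, a
end program demo_copy
```

The dummies here are assumed-shape, so the whole arrays are passed by descriptor and the element-wise loop does the indexing by hand.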
Steve:
/check:
check run-time conditions
keywords: all (same as /4Yb), none (same as /nocheck, /4Nb),
[no]arg_temp_created, [no]bounds,[no]format,
[no]output_conversion, [no]power, [no]uninit, [no]args
This is a run-time check.
It would be much nicer to have a compile-time check issue an informational message so that I can use the IDE to fix the source file(s).
Receiving:
forrtl: warning (402): fort: (1): In call to AVSETVIEWPOINT, an array temporary
was created for argument #2
This helps only if I have a console window and am looking at it, and that window may be flipping through a whole bunch of other stuff.
Looking at the above message, which of my 700+ source modules has the problem call?
Jim Dempsey
This is a run-time check because the compiled code does a run-time test to see if the argument is contiguous. If it is, then no temp is created and there is no warning. The accompanying traceback should identify the location.
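To illustrate the contiguity point, here is a small sketch, with hypothetical names (temp_demo, scale_vec), of two calls to a routine whose dummy argument is an explicit-shape array. A column of a rank-2 array is contiguous in memory, while a row is strided, so the second call is the kind for which a compiler typically copies the data into a temporary and back. Whether a temporary is actually created depends on the compiler and its options.

```fortran
program temp_demo
  implicit none
  real :: a(3, 4)
  a = 1.0
  ! a(:, 2) is a contiguous column: it can be passed in place.
  call scale_vec(a(:, 2), 3)
  ! a(2, :) is a strided row: a compiler will typically copy it into
  ! a contiguous temporary before the call and copy it back afterwards.
  call scale_vec(a(2, :), 4)
  print *, a
contains
  ! Explicit-shape dummy: the routine expects n contiguous reals.
  subroutine scale_vec(x, n)
    integer, intent(in)    :: n
    real,    intent(inout) :: x(n)
    x = 2.0 * x
  end subroutine scale_vec
end program temp_demo
```

With the arg_temp_created check discussed in this thread enabled, the second call is the sort of call one would expect to trigger the run-time warning.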
Steve:
I think I missed something important in the discussions over the past few months. Are you saying that
a(:) = b(kf:kl,indx)
will generate stack temps, but
p = Dot_Product(r(jf:jl,indr),s(kf:kl,inds))
will not?
Bruce
For the second, a lot might depend on what the values of indr and inds are and what the declarations of the variables are.
Steve:
I seem to have missed something big in the discussions about stack temps over the past few months. Are you saying that
a(:) = b(kf:kl,indx)
will generate stack temps, but
p = Dot_Product ( r(jf:jl,indr) , s(kf:kl,inds) )
will not?
Bruce
Did you intend to repost your earlier question?