Intel® Moderncode for Parallel Architectures
Support for developing parallel programming applications on Intel® Architecture.

Stack size is not big enough

Christoph_I_
Beginner
5,547 Views

Hello!
I have an issue regarding stack size and ifort (Parallel Studio XE Composer). For my computations I divide a huge array into small pieces, and each node in the run does the computations for one part. Below is a chunk of the code:

real         ::  p(200,200,400)
integer      ::  ib,ie,jb,je,kb,ke 
...
ib=1;ie=199
jb=2;je=198
kb=2;ke=398
call  SOLVE_POI_EQ(rank,p(ib:ie,jb:je,kb:ke),R)

The problem here is that when I reduce the number of nodes, the code crashes with a `Segmentation Fault` when I call `SOLVE_POI_EQ`. I use Linux, and when I set the stack size to unlimited with `ulimit -s unlimited`, it works.

I'm now worried that I overwrite parts of my OS (can that happen?)!

Is there a better way to address this issue?
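
A possible explanation, offered as a sketch rather than a diagnosis: if `SOLVE_POI_EQ` declares its array dummy with explicit shape or assumed size (or is called without an explicit interface), the non-contiguous section `p(ib:ie,jb:je,kb:ke)` has to be copied into a contiguous temporary, and ifort places such temporaries on the stack by default. Here that is 199 x 197 x 397 default reals, roughly 62 MB at 4 bytes each, far above the 8 MB stack limit that is common on Linux. The example below assumes the routine can be given an assumed-shape dummy inside a module. The names `poisson_sketch` and `solve_poi_eq_sketch`, and the routine body, are hypothetical stand-ins and not the original solver.

```fortran
module poisson_sketch
  implicit none
contains
  ! Hypothetical stand-in for SOLVE_POI_EQ. With an assumed-shape dummy
  ! p(:,:,:) the compiler can pass a descriptor for the section instead
  ! of copying it into a contiguous temporary.
  subroutine solve_poi_eq_sketch(rank, p)
    integer, intent(in)    :: rank
    real,    intent(inout) :: p(:,:,:)
    p = real(rank)          ! placeholder for the real computation
  end subroutine solve_poi_eq_sketch
end module poisson_sketch

program section_demo
  use poisson_sketch
  implicit none
  real, allocatable :: p(:,:,:)   ! allocatable, so the big array is on the heap
  integer :: ib, ie, jb, je, kb, ke

  allocate(p(200,200,400))
  p = 0.0
  ib = 1; ie = 199
  jb = 2; je = 198
  kb = 2; ke = 398

  ! The rank value 3 is an arbitrary example.
  call solve_poi_eq_sketch(3, p(ib:ie, jb:je, kb:ke))

  ! Only the section was modified; elements outside it stay zero.
  if (p(1,2,2) /= 3.0 .or. p(200,200,400) /= 0.0) error stop 'unexpected result'
  print *, 'elements in section:', size(p(ib:ie, jb:je, kb:ke))
end program section_demo
```

Inside the routine the section is indexed from 1 in every dimension. If the interface cannot be changed, ifort's `-heap-arrays` option is the other common route: it asks the compiler to put such temporaries on the heap instead of the stack.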

0 Kudos
24 Replies
SergeyKostrov
Valued Contributor II
1,097 Views
>>Intel advice from not too long ago recommended against using heap inside OpenMP parallel regions.

Tim, that looks very interesting and contradicts my experience of allocating heap-based memory and using it inside OpenMP parallel regions! I never had any issues or problems. Where did you read about it? Could you post some additional information about that advice?
0 Kudos
Christoph_I_
Beginner
1,097 Views

Hello,

First of all, I want to thank you, because my code now works perfectly (at least I thought so until now!).

I recently tried the debugging and profiling tool Allinea (http://www.allinea.com/), which shows me the potential bottlenecks of my MPI code. Its output says that my program uses 200 MB of memory. But when I write out most of my data at the end and sum up the file sizes, I get 3 GB. I store all my data in binary format. Is that possible? I fear that because of the `-heap-arrays` flag I am using hard-disk space instead of RAM.

I also have another question regarding my output routine. I see that the most time-consuming part is my output, especially the close statement. Is there a more efficient way with the Intel compiler?

Thank you in advance!
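
One thing that could be tried for the output, sketched here under assumptions rather than measured: writing each block with a single unformatted stream `write` instead of many small records. Time attributed to `close` often reflects buffered data being flushed to disk, so the total may not change much and will depend on the file system. The file name `p_block.bin` and the array size are hypothetical.

```fortran
program stream_write_sketch
  use, intrinsic :: iso_fortran_env, only: int64
  implicit none
  real, allocatable :: p(:,:,:)
  integer :: u, ios
  integer(int64) :: nbytes, expected

  allocate(p(50,50,50))          ! hypothetical block size for one rank
  p = 1.0

  ! Stream access: no record markers, the whole array goes out in one write.
  open(newunit=u, file='p_block.bin', access='stream', form='unformatted', &
       status='replace', action='write', iostat=ios)
  if (ios /= 0) error stop 'could not open p_block.bin'
  write(u) p
  close(u)

  ! Check that the file holds exactly the bytes of the array.
  inquire(file='p_block.bin', size=nbytes)
  expected = int(storage_size(p) / 8, int64) * size(p, kind=int64)
  if (nbytes /= expected) error stop 'unexpected file size'
  print *, 'bytes written:', nbytes
end program stream_write_sketch
```

In an MPI run each rank would need its own file name (or MPI-IO), which this sketch leaves out.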

0 Kudos
jimdempseyatthecove
Honored Contributor III
1,097 Views

The problem with enlarging stack size is exemplified by your second paragraph. When you set aside stack size, you do this for all threads. When only one, or possibly a few, threads require a large stack, this can potentially cause issues (out of memory, excessive page file/RAM footprint). Using the heap will add some overhead, but will (can) reduce the page file/RAM requirements.

IMHO (?was I ever humble?) it is best to take responsibility for placing the storage yourself: small items (arrays, user-defined types) go on the stack, while items that are, or can become, large should be explicitly allocated.

Jim Dempsey
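
The suggestion above can be sketched as follows. The names `work_demo` and `smooth_sketch`, the three-point average, and the array sizes are all hypothetical; the point is only the contrast between a small fixed-size local and a large `allocatable` work array, which is allocated explicitly and released automatically when the routine returns.

```fortran
module work_demo
  implicit none
contains
  subroutine smooth_sketch(a)
    real, intent(inout) :: a(:,:,:)
    real :: w(3)                       ! small item: fine as a stack local
    real, allocatable :: tmp(:,:,:)    ! large item: explicitly allocated
    integer :: i, stat

    w = [0.25, 0.5, 0.25]
    allocate(tmp(size(a,1), size(a,2), size(a,3)), stat=stat)
    if (stat /= 0) error stop 'allocation of tmp failed'

    tmp = a
    ! Three-point average along the first dimension, interior points only.
    do i = 2, size(a,1) - 1
      a(i,:,:) = w(1)*tmp(i-1,:,:) + w(2)*tmp(i,:,:) + w(3)*tmp(i+1,:,:)
    end do
    ! tmp is deallocated automatically on return.
  end subroutine smooth_sketch
end module work_demo

program work_demo_main
  use work_demo
  implicit none
  real, allocatable :: a(:,:,:)

  allocate(a(200,200,400))
  a = 1.0
  call smooth_sketch(a)
  ! Averaging a constant field must leave it unchanged.
  if (abs(a(100,100,200) - 1.0) > 1.0e-6) error stop 'unexpected result'
  print *, 'ok'
end program work_demo_main
```

Checking `stat` on the allocation gives a clear failure message when memory runs out, which is harder to get from a stack overflow.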

0 Kudos