Intel® Fortran Compiler

Stack overflow while running parallel Fortran code

Mohammadreza_S_
Beginner

Hi

I have a Fortran code that runs sequentially without problems (I compile it with /O3 for the x64 platform). I then added OpenMP directives to parallelize it. Now it gives me a "Stack overflow" message (even if I run it ). I increased the stack reserve size to about 1GB, but that did not help.

Here is the part of the code that I changed to make it parallel:

    call OMP_SET_NUM_THREADS(6);
    !$OMP PARALLEL DO DEFAULT(PRIVATE) SHARED(g_num,g_coord,nn,nels,anatyp)   &   
    !$OMP SHARED(coord_elm_center,loc_ele_cor,wix4,der4,fun4,Shear_Skeleton)  &
    !$OMP SHARED(EleMode,v,vu,Biot_Coef,c,dtim,permx,Kr,cT,gam,omg2,gcor8)    &
    !$OMP SHARED(wix8,eqn,counter1,counter2,counter3,counter4,counter5,Th_Exp)&
    !$OMP SHARED(lan,lan1,der8,fun8,gcor20,wix20,fun20,der20,gcor40,wix40)    &
    !$OMP SHARED(fun40,der40,gcor61,wix61,fun61,der61) SCHEDULE(DYNAMIC)      &
    !$OMP REDUCTION (+:Lhs,LhsSig,LhsU,A15,A25,A35,A45,A55_Heat)
    Main: do iel=1, nels
        ! ... do some calculation ...

    enddo Main;
    !$OMP END PARALLEL DO 

I appreciate any help.

 

6 Replies
Steven_L_Intel1
Employee

You need to set the environment variable KMP_STACKSIZE to a larger value (but not 1GB!) - this is the per-thread stack size. I suggest a somewhat lower stack reserve size - try 100000000 to begin with.
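
For anyone unsure how to apply this, here is a minimal sketch. The variable has to be set in the environment that launches the program, for example with `set KMP_STACKSIZE=16M` in a Windows command prompt (the value 16M is only an illustration, not a recommendation from this thread). The small program below, whose name `show_stacksize` is made up for this example, uses the standard `get_environment_variable` intrinsic to print what the process actually sees, which can help confirm that the setting reached the program.

```fortran
program show_stacksize
    implicit none
    character(len=64) :: val
    integer :: stat

    ! Set the variable before launching, e.g. in a Windows command prompt:
    !     set KMP_STACKSIZE=16M      (16M is an illustrative value only)
    call get_environment_variable("KMP_STACKSIZE", val, status=stat)
    call report("KMP_STACKSIZE", val, stat)

    ! OMP_STACKSIZE is the standard OpenMP name for the same setting
    call get_environment_variable("OMP_STACKSIZE", val, status=stat)
    call report("OMP_STACKSIZE", val, stat)

contains

    ! Print the value if it was read successfully (status 0)
    subroutine report(name, val, stat)
        character(len=*), intent(in) :: name, val
        integer, intent(in) :: stat
        if (stat == 0) then
            print '(3a)', name, " = ", trim(val)
        else
            print '(2a)', name, " is not set or could not be read"
        end if
    end subroutine report

end program show_stacksize
```

If the program reports the variable as not set, the OpenMP runtime will fall back to its default per-thread stack size.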

Mohammadreza_S_
Beginner

Hi Lionel

Can you explain how to do that?

 

Michael_Roberts
New Contributor I

Hi,

I had this issue when I had a large private array. On entering the parallel region, a local copy was created on each thread's stack, causing the overflow.

My solution was to allocate the array dynamically before the parallel section, with an additional dimension sized to the number of threads. The array can then be shared between all threads (i.e. no copy is created on the stack), and each thread accesses its own slice using the function OMP_GET_THREAD_NUM() + 1 (the '+1' is needed because this function is zero based, not one based).
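
A minimal sketch of this approach, assuming a single work array; the names `work`, `n` and `nels` and their sizes are hypothetical and stand in for whatever large private array the real loop uses. The array is allocated on the heap with one column per thread and declared shared, so no per-thread copy lands on the stack.

```fortran
program thread_slices
    use omp_lib
    use, intrinsic :: iso_fortran_env, only: real64
    implicit none
    integer, parameter :: n = 1000      ! hypothetical work-array length
    integer, parameter :: nels = 64     ! hypothetical number of elements
    real(real64), allocatable :: work(:,:)
    real(real64) :: total
    integer :: iel, tid, nthreads

    ! Upper bound on the team size of the next parallel region
    nthreads = omp_get_max_threads()

    ! One column per thread, allocated on the heap before the parallel region
    allocate(work(n, nthreads))

    total = 0.0_real64
    !$omp parallel do default(none) shared(work) private(iel, tid) &
    !$omp reduction(+:total)
    do iel = 1, nels
        tid = omp_get_thread_num() + 1         ! thread numbers are zero based
        work(:, tid) = real(iel, real64)       ! each thread uses only its own column
        total = total + sum(work(:, tid)) / n  ! placeholder for the real calculation
    end do
    !$omp end parallel do

    print *, "total =", total
    deallocate(work)
end program thread_slices
```

Putting the thread index in the last dimension keeps each thread's slice contiguous in memory, since Fortran stores arrays in column-major order.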

 

 

Chris_G_2
Beginner

I had exactly the same problem recently, and solved it in a similar way to Michael Roberts.

Chris G

Mohammadreza_S_
Beginner

Thank you Michael, Chris and Steve. I have increased KMP_STACKSIZE to 999M, but it does not solve the problem. I think the best way is the one that Michael describes. I will try it and let you know. Thanks.

TimP
Honored Contributor III

KMP_STACKSIZE (or, using the standard name, OMP_STACKSIZE) defaults to 4MB on Intel 64-bit targets. A typical setting, when the default isn't sufficient, is 9MB. When Steve said not to use 1GB, I doubt he meant 999MB. I haven't heard of any application where more than 40MB is required. You ought to be able to estimate how much space is required for your private arrays by multiplying the data size by the number of threads.

When you set KMP_STACKSIZE=999MB, you risk adding 1GB times the number of threads to the allowance you would require in /link /stack, which would put a low limit on the number of threads. I don't know the specifics for your platform, but I wouldn't count on being able to increase the effective stack reserve to as much as 16GB (note that Steve suggested a more modest value).
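
The estimate can be sketched as a few lines of arithmetic. The element count below is hypothetical and only illustrates the calculation; the thread count of 6 is taken from the original post. Substitute the total number of elements in the arrays that are private (or reduction variables) in your own loop.

```fortran
program estimate_stack
    use, intrinsic :: iso_fortran_env, only: int64, real64
    implicit none
    ! Hypothetical size, for illustration only
    integer(int64), parameter :: n_private = 500000_int64  ! private array elements per thread
    integer(int64), parameter :: nthreads  = 6_int64       ! thread count from the original post
    real(real64) :: sample                                  ! stands in for the element type
    integer(int64) :: bytes_per_thread, bytes_total

    ! storage_size returns the element size in bits
    bytes_per_thread = n_private * storage_size(sample) / 8
    bytes_total      = bytes_per_thread * nthreads

    print '(a,f10.2,a)', "private data per thread: ", &
        real(bytes_per_thread, real64) / 2.0_real64**20, " MiB"
    print '(a,f10.2,a)', "private data, all threads: ", &
        real(bytes_total, real64) / 2.0_real64**20, " MiB"
end program estimate_stack
```

The per-thread figure is the one to compare against KMP_STACKSIZE, with some headroom for the other locals and temporaries that also live on each thread's stack.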
 
