Hi
I have a Fortran code that runs sequentially without problems (I compile it with /O3 for the x64 platform). I then added OpenMP directives to parallelize the code. Now it fails with a "Stack overflow" message (even if I run it ). I increased the stack reserve size to about 1GB, but that did not help.
Here is the part of the code that I changed to make it parallel:
call OMP_SET_NUM_THREADS(6);
!$OMP PARALLEL DO DEFAULT(PRIVATE) SHARED(g_num,g_coord,nn,nels,anatyp) &
!$OMP SHARED(coord_elm_center,loc_ele_cor,wix4,der4,fun4,Shear_Skeleton) &
!$OMP SHARED(EleMode,v,vu,Biot_Coef,c,dtim,permx,Kr,cT,gam,omg2,gcor8) &
!$OMP SHARED(wix8,eqn,counter1,counter2,counter3,counter4,counter5,Th_Exp)&
!$OMP SHARED(lan,lan1,der8,fun8,gcor20,wix20,fun20,der20,gcor40,wix40) &
!$OMP SHARED(fun40,der40,gcor61,wix61,fun61,der61) SCHEDULE(DYNAMIC) &
!$OMP REDUCTION (+:Lhs,LhsSig,LhsU,A15,A25,A35,A45,A55_Heat)
Main: do iel=1, nels
! ... do some calculation ...
enddo Main;
!$OMP END PARALLEL DO
I appreciate any help.
You need to set the environment variable KMP_STACKSIZE to a larger value (but not 1GB!); this is the per-thread stack size. I also suggest a somewhat lower stack reserve size: try 100000000 to begin with.
Hi Lionel
Could you explain how to do that?
Hi,
I had this issue when I had a large private array. On entering the parallel region, a local copy was created on each thread's stack, causing the overflow.
My solution was to allocate the array dynamically before the parallel section, with an additional dimension sized to the number of threads. The array can then be shared between all threads (i.e. no copy is created on the stack), and each thread accesses its own slice using the index OMP_GET_THREAD_NUM() + 1 (the '+1' is needed because this function is zero-based, not one-based).
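The approach described above might look roughly like the following minimal sketch. The array name `work`, the sizes `n` and `nels`, and the dummy calculation are hypothetical and are not taken from the original code; the point is only that the scratch array lives on the heap and is shared, with one column per thread.

```fortran
program thread_slices
  use omp_lib
  implicit none
  integer, parameter :: n = 1000      ! hypothetical scratch size per thread
  integer, parameter :: nels = 100    ! hypothetical number of elements
  real(8), allocatable :: work(:,:)   ! hypothetical scratch array, one column per thread
  real(8) :: total
  integer :: iel, tid, nthreads

  ! Allocate on the heap before the parallel region, with the thread
  ! index as the last dimension so each thread's slice is contiguous.
  nthreads = omp_get_max_threads()
  allocate(work(n, nthreads))
  total = 0.0d0

  !$omp parallel do default(none) shared(work) private(iel, tid) &
  !$omp reduction(+:total) schedule(dynamic)
  do iel = 1, nels
     tid = omp_get_thread_num() + 1    ! thread numbers are zero-based
     work(:, tid) = real(iel, 8)       ! each thread writes only its own slice
     total = total + sum(work(:, tid)) / real(n, 8)
  end do
  !$omp end parallel do

  print '(a,f8.1)', 'total = ', total  ! value is 5050.0 for any thread count
  deallocate(work)
end program thread_slices
```

Because `work` is SHARED, no per-thread copy is placed on the stack; only the scalars `iel`, `tid`, and the reduction variable are private. Note that arrays listed in a REDUCTION clause are also privatized per thread, so very large reduction arrays may need similar treatment.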
I had exactly the same problem recently, and solved it in a similar way to Michael Roberts.
Chris G
Thank you Michael, Chris and Steve. I have increased KMP_STACKSIZE to 999M, but it does not solve the problem. I think the best approach is the one Michael describes. I will try it and let you know. Thanks.
KMP_STACKSIZE (or, using the standard name, OMP_STACKSIZE) defaults to 4MB on Intel 64-bit targets. A typical setting, when the default isn't sufficient, is 9MB. When Steve said don't use 1GB, I doubt he meant 999MB. I haven't heard of any application where more than 40MB is required. You ought to be able to estimate how much space is required for your private arrays by multiplying the data size by the number of threads.
When you set KMP_STACKSIZE=999MB, you risk adding 1GB times the number of threads to the allowance you would require in /link /stack, which would put a low limit on the number of threads. I don't know specifically for your platform, but I wouldn't count on being able to increase the effective stack reserve to as much as 16GB (note that Steve suggested a more modest value).
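As a rough illustration of the estimate mentioned above, the sketch below computes the bytes needed for one private double-precision array per thread and for all threads together. The array extent `n` is a hypothetical value chosen for illustration; the thread count of 6 is the one passed to OMP_SET_NUM_THREADS in the original question.

```fortran
program estimate_private_bytes
  use iso_fortran_env, only: int64
  implicit none
  integer, parameter :: n = 2000        ! hypothetical extent of an n-by-n private array
  integer, parameter :: nthreads = 6    ! thread count from the original post
  real(8) :: probe                      ! used only to query the element size
  integer(int64) :: per_thread, total

  ! storage_size returns the element size in bits; divide by 8 for bytes.
  per_thread = int(n, int64) * int(n, int64) * int(storage_size(probe) / 8, int64)
  total = per_thread * int(nthreads, int64)

  print '(a,i0)', 'bytes per thread: ', per_thread
  print '(a,i0)', 'bytes, all threads: ', total
end program estimate_private_bytes
```

With these assumed values the per-thread figure is 32000000 bytes, which already exceeds the 4MB default, so a private array of that size would overflow each thread's stack unless KMP_STACKSIZE is raised above it or the array is moved to the heap.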