Intel® Fortran Compiler

Using OpenMP results in stack overflow

pilot117
Beginner
Hi,

I have run into a problem while using OpenMP. Here is my setup: a main program calling a subroutine.

program main
call testopenMP
end program

subroutine testopenMP
integer, parameter :: n = 8000
real, dimension(n, n) :: matrix1, matrix2
! codes
end subroutine testopenMP

It works if I don't use multithreading via OpenMP.

However, if I change my main program to

program main
use omp_lib ! declares omp_set_num_threads, which is a subroutine, not an integer function

call omp_set_num_threads(2)
!$OMP PARALLEL DEFAULT(PRIVATE)
call testopenMP
!$OMP END PARALLEL
end program

it gives me a segmentation fault. When I reduce the matrix size in testopenMP to, say, 800, it works (but 1000 does not). So I guess this is due to a stack overflow. In C, I can more or less solve it by declaring matrix1 and matrix2 as globals in the main program, so they won't go on the stack. But in Fortran, I am not sure how to solve it. I don't know how the arrays are stored.

To me, the variables and arrays declared in testopenMP seem to be local, so they should go on the stack of the main program. But if the stack of the main program can hold matrices of 8000 by 8000 without threads, why can't it hold two matrices of size 1000 by 1000 when I use multiple threads? I guess there is something I don't know about array storage. It seems that the arrays are not on the process stack.

Can anyone explain how variables and arrays are stored and how they are allocated at run time in Fortran 90?

Many thanks in advance!

pilot117
TimP
Honored Contributor III
With -openmp set, the local arrays in the subroutine, which are in effect private to each thread, have to be allocated on the thread's stack, while without -openmp or -auto they need not be on the stack. If the array were a shared array in the main program, stack storage could be avoided.
The KMP_STACKSIZE environment variable or, alternatively, a function call adjusts the limit for the thread stack size. No doubt, with an increased thread stack size, you may also have to increase the overall stack size in your shell.
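
As a rough sketch of the function-call route: the Intel OpenMP runtime provides the extension kmp_set_stacksize_s through omp_lib. The 600 MB figure below is an assumption, sized for two 8000 x 8000 default-real arrays (about 512 MB) plus headroom, and the code is specific to Intel's runtime. The call sizes the worker threads' stacks only; the initial thread also runs testopenMP and still depends on the shell's stack limit.

```fortran
program main
  use omp_lib   ! Intel's omp_lib also declares the kmp_* extensions
  implicit none
  ! Assumed budget: roughly 512 MB of local arrays per thread, plus headroom.
  integer(kind=kmp_size_t_kind), parameter :: stack_bytes = &
      600_kmp_size_t_kind * 1024_kmp_size_t_kind * 1024_kmp_size_t_kind

  ! Intel-specific: has to be called before the first parallel region starts.
  call kmp_set_stacksize_s(stack_bytes)
  call omp_set_num_threads(2)

  !$omp parallel default(private)
  call testopenMP
  !$omp end parallel
end program main

subroutine testopenMP
  implicit none
  integer, parameter :: n = 8000
  ! With -openmp these fixed-size locals live on the calling thread's stack.
  real, dimension(n, n) :: matrix1, matrix2
  matrix1 = 0.0
  matrix2 = 1.0
end subroutine testopenMP
```

Setting KMP_STACKSIZE (or the standard OMP_STACKSIZE) in the environment before launching the program should have the same effect on the worker threads without touching the source.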
pilot117
Beginner
OK, I see. Thanks for this information. Basically, when -openmp is passed to ifort, the variables declared in the subroutines go onto the stack of the main program.

Instead of increasing the thread stack size, is there any way to declare the arrays in the callee so that they won't go onto the stack of the caller?
jimdempseyatthecove
Honored Contributor III

Add the RECURSIVE attribute to the subroutine/function:

! VVVVV note recursive
recursive subroutine foo(a)
real :: a
real :: temp(3) ! should be on the stack (of the thread)
real, save :: asdf(3) ! should be shared by all threads
end subroutine foo

Or give the local array the AUTOMATIC attribute:

subroutine foo(a)
real :: a
real, automatic :: temp(3) ! on the stack (of the thread)
real, save :: asdf(3) ! should be shared by all threads
end subroutine foo

*** Note: AUTOMATIC is Intel-specific (non-portable).

Jim Dempsey
tom_p
Beginner
There is only one stack per thread, shared by all subroutines, so you cannot have a variable on the stack of the callee without having it on the stack of the caller.

But you can use allocatable arrays, which use heap memory and therefore will not clutter the thread's stack. (You could also try the compiler switch -heap-arrays, but I am not sure whether it affects the arrays in your case.)
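
A minimal sketch of the allocatable-array approach, reusing the names from the original post. The ierr status variable and the initial values are illustrative assumptions. Each thread that calls the subroutine allocates its own pair of matrices on the heap, so with n = 8000 and default 4-byte reals that is roughly 512 MB per thread.

```fortran
program main
  use omp_lib
  implicit none

  call omp_set_num_threads(2)
  !$omp parallel default(private)
  call testopenMP
  !$omp end parallel
end program main

subroutine testopenMP
  implicit none
  integer, parameter :: n = 8000
  ! Allocatable arrays are placed on the heap, not on the thread's stack.
  real, allocatable :: matrix1(:,:), matrix2(:,:)
  integer :: ierr   ! hypothetical name for the allocation status

  allocate(matrix1(n, n), matrix2(n, n), stat=ierr)
  if (ierr /= 0) then
    print *, 'testopenMP: allocation failed, stat = ', ierr
    return
  end if

  matrix1 = 0.0
  matrix2 = 1.0
  ! ... the actual work on matrix1 and matrix2 goes here ...

  ! Optional: local allocatables without SAVE are freed on return anyway.
  deallocate(matrix1, matrix2)
end subroutine testopenMP
```

Only the small array descriptors stay on the stack, so this should not require raising KMP_STACKSIZE or the shell's stack limit.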
