Intel® Fortran Compiler

heap arrays with OpenMP?

joseph-krahn
New Contributor I
835 Views

OpenMP can use a lot of stack space, and the amount often depends on the size of the problem input to a program. I think it would be very useful to support the -heap-arrays option for OpenMP temporaries. The source code can always be modified to allocate heap memory within the parallel region, but compiler support would minimize code changes to a non-parallel program, and would also allow "-heap-arrays <size>" to switch to heap only for large temporaries.
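For reference, the manual workaround mentioned above (allocating heap memory within the parallel region) might look roughly like the following. This is only a minimal sketch: the program name, the array name work, and the size n are hypothetical, and it assumes a compiler that supports allocatable arrays in a PRIVATE clause (OpenMP 3.0 or later).

```fortran
program heap_private_demo
  use iso_fortran_env, only: real64
  implicit none
  integer, parameter :: n = 1000000        ! hypothetical problem size
  real(real64), allocatable :: work(:)     ! scratch array, one private copy per thread
  real(real64) :: total
  integer :: i

  total = 0.0_real64

  ! work is unallocated on entry, so each thread's private copy starts unallocated
  !$omp parallel private(work, i) reduction(+:total)
  allocate(work(n))                        ! storage comes from the heap, not the thread stack
  do i = 1, n
     work(i) = real(i, real64)
  end do
  total = total + sum(work)                ! each thread adds the sum of its own copy
  deallocate(work)
  !$omp end parallel

  print *, 'sum over all threads:', total
end program heap_private_demo
```

Each thread keeps only the small array descriptor on its own stack, so the required stack size no longer grows with n. The cost is one allocate/deallocate pair per thread.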

OTOH, maybe huge stacks will not be an issue as parallel programming becomes more common. On an x86_64 system with 8 cores and only 6 GB of RAM, I can run a 16-thread OpenMP test program with OMP_STACKSIZE="16G".

Joe Krahn

5 Replies
Ron_Green
Moderator

Joe,

It's an interesting suggestion, and one I'll bring up with our engineers. The knee-jerk reaction will undoubtedly be "but anything on the heap can be accessed (and hence potentially corrupted) by all threads. There is not enough protection on the heap to keep one thread from corrupting another's data."

And I suspect there would be a fair amount of work on each access to a thread-private variable to determine: 1) is this on the stack or the heap? 2) If it is on the heap, where did I put it for THIS thread (some sort of index, perhaps)?

The implementation might get pretty complex.

But I will ask if this is possible.

ron

Steven_L_Intel1
Employee
I don't think corruption is the issue, since the threads share an address space anyway. I can't think of any reason why this couldn't work, and to be honest, I had thought it did. The pointer to the array would be thread-local even if the storage were on the heap.
Grant_H_Intel
Employee
I agree with Steve about corruption not being an issue. Threads can also access other threads' stacks, although the OpenMP specification does not allow it if those variables are declared "private" or "threadprivate". The trade-off, in my opinion, is which approach is more efficient. Typically, allocating arrays on the stack is much more efficient than using heap-allocated storage, because no synchronization is needed to allocate and deallocate the memory. Since you can set stack sizes nearly arbitrarily large on Intel 64 architectures (which have a huge virtual memory address space), I don't think stack size is that much of an issue.

Hope this helps,
Steven_L_Intel1
Employee
You may be able to set the stack arbitrarily large on Linux, but certainly not on Windows and possibly not on Mac OS.
Martyn_C_Intel
Employee
I recently came upon an example (on Windows) where memory was allocated explicitly (using ALLOCATE) in a function that was called from within an OpenMP parallel region, in order to be thread safe without increasing the stack requirement. There was a performance penalty that increased as the number of threads increased, due to the serialization of the allocations. The solution was to go back to declaring local variables that were allocated on the stack; since each thread has its own stack, this can happen in parallel.
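The two approaches compared above can be sketched side by side. This is an illustrative sketch only, not the code from the original example: the module, function, and variable names are hypothetical, and it assumes the compiler's default placement of automatic arrays on the stack (that is, no -heap-arrays).

```fortran
module scratch_demo
  use iso_fortran_env, only: real64
  implicit none
contains

  ! Variant A: automatic array. By default it is created on the calling
  ! thread's own stack, so no coordination between threads is needed.
  function sum_squares_stack(n) result(s)
    integer, intent(in) :: n
    real(real64) :: s
    real(real64) :: tmp(n)                 ! automatic array sized by the argument
    integer :: i
    do i = 1, n
       tmp(i) = real(i, real64)**2
    end do
    s = sum(tmp)
  end function sum_squares_stack

  ! Variant B: explicit ALLOCATE. Storage comes from the shared heap, so
  ! concurrent calls may contend inside the memory allocator.
  function sum_squares_heap(n) result(s)
    integer, intent(in) :: n
    real(real64) :: s
    real(real64), allocatable :: tmp(:)
    integer :: i
    allocate(tmp(n))
    do i = 1, n
       tmp(i) = real(i, real64)**2
    end do
    s = sum(tmp)
    deallocate(tmp)
  end function sum_squares_heap

end module scratch_demo

program compare_scratch
  use scratch_demo
  implicit none
  integer :: k
  real(real64) :: a(8), b(8)

  ! Both functions are called from inside the parallel region;
  ! their local variables are private to the calling thread.
  !$omp parallel do
  do k = 1, 8
     a(k) = sum_squares_stack(100 * k)
     b(k) = sum_squares_heap(100 * k)
  end do
  !$omp end parallel do

  print *, 'largest difference between variants:', maxval(abs(a - b))
end program compare_scratch
```

Both variants compute the same values. They differ only in where tmp lives, which is what determines whether the threads compete for the allocator (variant B) or need a larger per-thread stack (variant A).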