Intel® Fortran Compiler
Build applications that can scale for the future with optimized code designed for Intel® Xeon® and compatible processors.

-heap_arrays option, thread safety and performance

Amalia_B_
Beginner

I'm updating old .f90 code to F2003 in order to build thread-safe libraries, and I've come across this old topic:

https://software.intel.com/en-us/forums/topic/270572

It lists a few guidelines for writing thread-safe code. One of the libraries I've updated uses very large arrays, which I've only been able to get working with the -heap_arrays compiler option; otherwise I get a stack overflow error in the MAXVAL intrinsic function. But the above topic says not to use the -heap_arrays option for thread-safe code.

Is this true?  I'm using Intel® Parallel Studio XE 2015 Update 4 Composer Edition for Fortran Windows* Integration for Microsoft Visual Studio* 2013, Version 15.0.0122.12

I know previous versions of IVF had memory leak problems with this option, but those have since been fixed.

Also, this particular library is being executed across multiple threads (2-8) thousands of times, and I do see a noticeable slowdown once I get to 1000+ runs per thread. Is this due to building the library with the -heap_arrays option, or would it be attributable to a memory leak?

TimP
Honored Contributor III

Some of those questions will be difficult to answer with any generality.

MAXVAL doesn't inherently require a temporary array. I suppose the temporary is due to the complexity of the arguments. Resolving this could be important for performance.
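
To illustrate the point about temporaries, here is a minimal sketch. It assumes the MAXVAL argument is an array expression such as abs(a - b); the original code was not posted, so the array names and sizes here are hypothetical. Whether a given compiler actually creates a temporary for the first form depends on the compiler version and options, so treat this as something to test rather than a guaranteed fix.

```fortran
program maxval_no_temp
  implicit none
  integer, parameter :: n = 1000000   ! hypothetical size, for illustration only
  real, allocatable :: a(:), b(:)
  real :: biggest
  integer :: i

  allocate(a(n), b(n))
  call random_number(a)
  call random_number(b)

  ! Array-expression form: the compiler may (or may not) materialize
  ! abs(a - b) as a temporary array before reducing it.
  biggest = maxval(abs(a - b))
  print *, 'maxval form:', biggest

  ! Explicit loop form: reduces element by element, so no array
  ! temporary is needed. Starting from 0.0 is valid because abs() >= 0.
  biggest = 0.0
  do i = 1, n
     biggest = max(biggest, abs(a(i) - b(i)))
  end do
  print *, 'loop form:  ', biggest

  deallocate(a, b)
end program maxval_no_temp
```

Both forms should print the same value; the loop form simply avoids relying on how the compiler handles the intermediate expression.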

Up to a point, stack overflow might be resolved simply by adjusting stack limits, both the one in the Visual Studio linker properties and OMP_STACKSIZE (the latter for individual threads). Note that the default and maximum practical limits are larger in 64-bit mode. These questions are unavoidable in thread-parallel applications of any significant size.

At the same time, threaded performance scaling may be limited by unnecessary private data arrays, including these implicit ones.

I would like to see more expert discussion of the point Ron raised 4 years ago. In principle, any implicit array generated in a parallel region should automatically be private if the proper compile options are set. A RECURSIVE procedure declaration ought to ensure this, so that -heap_arrays doesn't cause a race condition.

Allocation and reallocation of private heap arrays is likely to be serialized, thus limiting parallel performance scaling. In the past there may have been bugs in implicit deallocation.
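
One way to limit that allocation traffic, sketched here under assumptions, is to allocate each thread's work array explicitly once per parallel region instead of once per call or iteration. The names work, n and nruns are hypothetical, and whether this helps in a particular code would need to be measured.

```fortran
program private_alloc_once
  implicit none
  integer, parameter :: n = 100000, nruns = 1000   ! hypothetical sizes
  real, allocatable :: work(:)
  real :: total
  integer :: irun

  total = 0.0
  ! work is unallocated on entry, so each thread's private copy
  ! also starts out unallocated.
  !$omp parallel private(work) reduction(+:total)
  allocate(work(n))          ! one heap allocation per thread
  !$omp do
  do irun = 1, nruns
     work = real(irun)       ! reuse the same buffer on every iteration
     total = total + maxval(work)
  end do
  !$omp end do
  deallocate(work)           ! one deallocation per thread
  !$omp end parallel

  print *, 'total =', total
end program private_alloc_once
```

The design choice is that the number of heap allocations scales with the number of threads, not with the number of runs.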

Calvin_D_R_
New Contributor I

Thanks for your posts, Amalia and Tim. As a result of them, I've found the bug in one of my routines that I've been chasing for a week.

The routine uses a number of moderate-sized arrays and has an OMP PARALLEL DO (DO i=1,4). Several of the arrays were in the PRIVATE clause. Compiler options were set for the use of HEAP.

The routine always ran to completion but, roughly 2-3% of the time, it gave nonsense results. It never failed to give the correct answer when I disabled the OpenMP statements, and it never failed in debug.

After reading your posts, I found that using the stack instead of the heap cured the problem. Unfortunately, for large arrays, I got stack overflows. Finally, I solved the problem by giving an extra dimension of 4 to each of the arrays that had been in the PRIVATE list, so that each i got its own slice of each array. Using HEAP now causes no problem.
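
For anyone who wants to try the same workaround, here is a minimal sketch of the idea. The original routine was not posted, so buf, avg and the array size are hypothetical. Instead of listing the array in PRIVATE, it gets an extra dimension and each iteration works only on its own column.

```fortran
program per_iteration_slices
  implicit none
  integer, parameter :: n = 200000, nslices = 4   ! hypothetical sizes
  real, allocatable :: buf(:,:)
  real :: avg(nslices)
  integer :: i

  ! One shared, explicitly allocated array; column i belongs to iteration i.
  allocate(buf(n, nslices))

  !$omp parallel do
  do i = 1, nslices
     ! Each iteration touches only its own column, so no PRIVATE copy
     ! of the array is needed. The loop index i is private automatically.
     buf(:, i) = real(i)
     avg(i) = sum(buf(:, i)) / real(n)
  end do
  !$omp end parallel do

  print *, avg
  deallocate(buf)
end program per_iteration_slices
```

Putting the extra dimension last keeps each iteration's data contiguous, since Fortran stores arrays in column-major order.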

That 'bug' was a nasty one for me.

Don
