Intel® Fortran Compiler
Build applications that can scale for the future with optimized code designed for Intel® Xeon® and compatible processors.

Another stack-overflow with IF XE 2016

Tobias_Loew
Novice
241 Views

I ran regression tests for the switch from IF 2013 to IF 2016, and a subroutine that used to take 11K of stack now wants 3M. Please note that this IS NOT LIKE https://software.intel.com/en-us/forums/intel-visual-fortran-compiler-for-windows/topic/590249, since I have neither dynamically allocated arrays nor array operations on large arrays. I was able to track down a strange effect where duplicating a single function call adds 128K of stack space.

Changing Fortran > Optimization > Heap Arrays to 0, as suggested in https://software.intel.com/en-us/forums/intel-visual-fortran-compiler-for-windows/topic/590249, didn't change anything.

Please refrain from proposing to increase the total stack space of the process: this is a problem with the compiler, not the OS.

9 Replies
jimdempseyatthecove
Honored Contributor III

If temporary (stack) arrays are not the issue (as stated in your post), then be aware that the newer Fortran standard specifies that locally declared arrays are automatic arrays. In prior versions these were (or may have been) SAVE arrays. You will have to determine whether this is the cause of the stack consumption. More importantly (if this is a multi-threaded application), you have to determine whether this data must be SAVE or must not (necessarily) be SAVE.

Jim Dempsey

Tobias_Loew
Novice

The code is not part of any parallel execution, and /Qauto is set in the project options in both the IF 2013 and IF 2016 versions. I will try to set up a minimal example...

jimdempseyatthecove
Honored Contributor III

/Qauto places locally declared arrays on the stack.

/Qsave places locally declared arrays in a static location (like the SAVE attribute).

Earlier versions of Fortran defaulted to /Qsave; newer standards specify /Qauto as the default.

There is also a runtime diagnostic that reports when an array temporary is created. You might try that.
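
A minimal sketch of the storage distinction discussed above, not taken from the original project. The names storage_demo, next_saved and sum_scratch are hypothetical. It assumes the usual behavior that a SAVE variable lives in static storage and keeps its value between calls, while an array sized by a dummy argument is created on entry and released on return (typically on the stack, depending on compiler options).

```fortran
module storage_demo
    implicit none
contains
    ! Local with the SAVE attribute: static storage, so the value
    ! persists from one call to the next.
    function next_saved() result(n)
        integer :: n
        integer, save :: counter = 0
        counter = counter + 1
        n = counter
    end function next_saved

    ! Automatic array: its size depends on the dummy argument m, so it
    ! exists only for the duration of the call. Where it is placed
    ! (stack or heap) is up to the compiler and its options.
    function sum_scratch(m) result(total)
        integer, intent(in) :: m
        integer :: total
        integer :: scratch(m)
        scratch = 1
        total = sum(scratch)
    end function sum_scratch
end module storage_demo

program demo
    use storage_demo
    implicit none
    integer :: first, second
    first  = next_saved()
    second = next_saved()
    print '(a,i0,1x,i0)', 'saved counter: ', first, second
    print '(a,i0)', 'automatic array sum: ', sum_scratch(1000)
end program demo
```

Calling sum_scratch with a very large m is one way to see whether local arrays are consuming stack in a given build.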

Jim Dempsey

Steven_L_Intel1
Employee

A test case would be appreciated.

Tobias_Loew
Novice

I made a small example that looks very innocent but takes 1.2M of stack space (x86, debug build, in a default VS2015/IF2016 Fortran console project with not a single option changed):

module source

integer*4 i
integer*4, parameter :: largearray(50000) = (/(42,i=1,50000) /)    

contains
    

subroutine heap_test(a,b,c,d,e,f)
integer*4, intent(in), value :: a,b,c,d,e,f


end subroutine 

end module
    
    
subroutine test()
use source
    call heap_test(largearray(452),largearray(4532),largearray(4152),largearray(4552),largearray(4582),largearray(45))
end
    
    
program Console
implicit none
    call test()
end program Console

Copying line 20 (the call to heap_test) results in an additional 1.2M of stack usage for each copy. It seems that each time an element of a parameter array is used, the compiler reserves space for the whole array on the stack.

 

 

Tobias_Loew
Novice

By the way: standard release builds of the above code (even with /O3) show the same behavior (also on x64); they simply crash with a stack overflow. So IMO this is a very serious security and performance bug in the IF2016 compiler that should be fixed immediately.

 

Lorri_M_Intel
Employee

First, thank you so much for the small reproducer.

What is provoking your stack overflow is the "value" attribute on the dummy argument (line 10).

The semantics of "value" state that the callee cannot modify the caller's actual argument, and to guarantee that, we make a copy.

There was clearly a bug in the compiler for a brief period of time where we made a copy of the WHOLE array, not just the one element in question.

I do not see that behavior in our current development compiler (which is virtually identical to the Update release that should be available soon), but I have not found our internal edit that resolved that problem either.

I can readily reproduce the incorrect behavior in the released version installed on my machine.

In the meantime, so that you can continue actually being productive, either remove the "value" from your declaration or use the command line option /assume:nostd_value (you will have to add it manually under Properties->Fortran->Command Line).
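
As a sketch of the first workaround only, here is the reproducer with "value" removed from the dummy declaration. Nothing else is changed apart from continuation lines, an added "use source" in the main program, and a final print statement, which is an addition for illustration. This has not been verified against the affected compiler release, so treat it as an assumption that it avoids the oversized copies.

```fortran
module source
    implicit none
    integer*4 i
    integer*4, parameter :: largearray(50000) = (/ (42, i = 1, 50000) /)
contains
    ! Same interface as in the reproducer, minus VALUE. The arguments
    ! are now passed by reference, and INTENT(IN) alone keeps the
    ! callee from modifying them.
    subroutine heap_test(a, b, c, d, e, f)
        integer*4, intent(in) :: a, b, c, d, e, f
    end subroutine heap_test
end module source

program Console
    use source
    implicit none
    call heap_test(largearray(452), largearray(4532), largearray(4152), &
                   largearray(4552), largearray(4582), largearray(45))
    print *, 'heap_test returned'   ! added only to show the call completed
end program Console
```

Note that dropping "value" changes how the arguments are passed, so every caller has to be rebuilt against the new interface.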

And, when I have access to the Update release, I'll test your program with it.

            Regards -

                                   --Lorri

Tobias_Loew
Novice

Thanks a lot for the quick answer.

regards

Tobias

Tobias_Loew
Novice

While playing around with the example code, I found some additional defects that should be fixed (all of them appear only if /assume:nostd_value is NOT used):

Assume heap_test is defined as in the example above.

1. If an actual argument to heap_test is an element of a parameter array, the whole array is copied locally onto the stack, and each use as an argument generates another copy (even in release builds with /O3!).

2. If heap_test is called multiple times from the same routine, the space for the temporary copies isn't reused (also with /O3).

3. The optimizer (even with /O3!) is unable to detect that the calling subroutine "test" is effectively empty: it doesn't call heap_test, but it still performs a lot of unnecessary memmove calls before returning. See the assembler listing below (x64 build with /O3). (WITH /assume:nostd_value and /O3, the call to "test" is optimized away.)

subroutine test()
use source
    call heap_test(largearray(452),largearray(4532),largearray(4152),largearray(4552),largearray(4582),largearray(45))
end
subroutine test()
00007FF6E1AD1040  mov         eax,124FA8h  
00007FF6E1AD1045  call        __chkstk (07FF6E1AD25A0h)  
00007FF6E1AD104A  sub         rsp,124FA8h  
use source
    call heap_test(largearray(452),largearray(4532),largearray(4152),largearray(4552),largearray(4582),largearray(45))
00007FF6E1AD1051  lea         rdx,[SOURCE_mp_LARGEARRAY+70Ch (07FF6E1AD770Ch)]  
00007FF6E1AD1058  mov         r8d,30D40h  
00007FF6E1AD105E  lea         rcx,[rsp+20h]  
00007FF6E1AD1063  call        memmove (07FF6E1AD3430h)  
00007FF6E1AD1068  lea         rdx,[SOURCE_mp_LARGEARRAY+46CCh (07FF6E1ADB6CCh)]  
00007FF6E1AD106F  lea         rcx,[rsp+30D60h]  
00007FF6E1AD1077  mov         r8d,30D40h  
00007FF6E1AD107D  call        memmove (07FF6E1AD3430h)  
00007FF6E1AD1082  lea         rdx,[SOURCE_mp_LARGEARRAY+40DCh (07FF6E1ADB0DCh)]  
00007FF6E1AD1089  lea         rcx,[rsp+61AA0h]  
00007FF6E1AD1091  mov         r8d,30D40h  
00007FF6E1AD1097  call        memmove (07FF6E1AD3430h)  
00007FF6E1AD109C  lea         rdx,[SOURCE_mp_LARGEARRAY+471Ch (07FF6E1ADB71Ch)]  
00007FF6E1AD10A3  lea         rcx,[rsp+927E0h]  
00007FF6E1AD10AB  mov         r8d,30D40h  
00007FF6E1AD10B1  call        memmove (07FF6E1AD3430h)  
00007FF6E1AD10B6  lea         rdx,[SOURCE_mp_LARGEARRAY+4794h (07FF6E1ADB794h)]  
00007FF6E1AD10BD  lea         rcx,[rsp+0C3520h]  
00007FF6E1AD10C5  mov         r8d,30D40h  
00007FF6E1AD10CB  call        memmove (07FF6E1AD3430h)  
00007FF6E1AD10D0  lea         rdx,[SOURCE_mp_LARGEARRAY+0B0h (07FF6E1AD70B0h)]  
00007FF6E1AD10D7  lea         rcx,[rsp+0F4260h]  
00007FF6E1AD10DF  mov         r8d,30D40h  
00007FF6E1AD10E5  call        memmove (07FF6E1AD3430h)  
end
00007FF6E1AD10EA  add         rsp,124FA8h  
00007FF6E1AD10F1  ret  
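
The constants in the listing above can be cross-checked with a little arithmetic. The following sketch is not part of the original post and its variable names are hypothetical. It only decodes the hexadecimal values that appear in the disassembly and compares them with the size of largearray.

```fortran
program frame_arithmetic
    implicit none
    integer, parameter :: n_elements = 50000            ! size of largearray
    integer, parameter :: elem_bytes = 4                ! integer*4
    integer, parameter :: n_args     = 6                ! arguments to heap_test
    integer, parameter :: frame_size = int(z'124FA8')   ! "sub rsp,124FA8h"
    integer, parameter :: copy_size  = int(z'30D40')    ! "mov r8d,30D40h"
    integer, parameter :: first_src  = int(z'70C')      ! "LARGEARRAY+70Ch"

    print '(a,i0)', 'bytes in largearray:          ', n_elements * elem_bytes
    print '(a,i0)', 'bytes per memmove:            ', copy_size
    print '(a,i0)', 'bytes for six copies:         ', n_args * copy_size
    print '(a,i0)', 'stack frame reserved:         ', frame_size
    print '(a,i0)', 'frame minus the six copies:   ', frame_size - n_args * copy_size
    ! Byte offset of element 452 in a 1-based integer*4 array
    print '(a,i0)', 'offset of largearray(452):    ', (452 - 1) * elem_bytes
    print '(a,i0)', 'source offset of first copy:  ', first_src
end program frame_arithmetic
```

Each memmove moves 200000 bytes, which is exactly the size of the whole array, and six of them account for 1200000 of the 1200040 bytes reserved. The remaining 40 bytes are consistent with the 20h offset at which the first copy starts plus padding. The first source offset, 70Ch = 1804, equals (452 - 1) * 4, so each copy appears to start at the selected element rather than at the beginning of the array. If that reading is right, the memmove also reads past the end of largearray.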

regards

Tobias
