I have a routine that runs fine in release mode, but crashes after completing in debug mode with a stack corruption. I can see the stack get corrupted (the call stack becomes messed up in Visual Studio) when it executes the qword ... rdi below. I've set /STACK:2000000000 for linking and that doesn't help. MREC is an integer being passed in. Compile flags below too.
Any help appreciated.
00007FF765F5BE68 push rbp
00007FF765F5BE69 mov eax,0D2940h
00007FF765F5BE6E call __chkstk (07FF76661B4E0h)
00007FF765F5BE73 sub rsp,0D2940h
00007FF765F5BE7A lea rbp,[rsp+70h]
00007FF765F5BE7F mov qword ptr [rsp],rax
00007FF765F5BE83 mov rax,0D293Ch
00007FF765F5BE8A mov dword ptr [rsp+rax],0CCCCCCCCh
00007FF765F5BE91 sub rax,4
00007FF765F5BE95 cmp rax,4
00007FF765F5BE99 jg RESTAR_OLD+22h (07FF765F5BE8Ah)
00007FF765F5BE9B mov rax,qword ptr [rsp]
00007FF765F5BE9F mov dword ptr [rsp],0CCCCCCCCh
00007FF765F5BEA6 mov dword ptr [rsp+4],0CCCCCCCCh
00007FF765F5BEAE mov qword ptr [rbp+0D28C0h],rdi
00007FF765F5BEB5 mov qword ptr [rbp+0D28B8h],rsi
00007FF765F5BEBC mov qword ptr [rbp+0D28B0h],rbx
00007FF765F5BEC3 mov qword ptr [MREC],rcx
00007FF765F5BECA mov byte ptr [rbp+0D0666h],0
00007FF765F5BED1 mov byte ptr [rbp+0D0667h],0
/nologo /debug:full /Od /heap-arrays0 /I"C:\sf60\proces\..\TempWorkspace16\x64\procesCur16\Debug" /I"C:\sf60\util\..\TempWorkspace16\x64\utilityCur_16\Debug" /I"C:\sf60\SamgDll\SamgDll\x64\Debug" /recursive /extend_source:132 /Qopenmp /warn:truncated_source /warn:interfaces /integer_size:64 /real_size:64 /assume:byterecl /Qinit:zero /fpe:0 /iface:cref /iface:mixed_str_len_arg /module:"..\..\x64\Current" /object:"work/" /Fd"work\astap.pdb" /traceback /check:none /libs:dll /threads /dbglibs /c
There's no way to provide you a useful answer based on this snippet of instruction sequences. My experience is that stack corruption issues have a cause far removed from where the corruption appears. At least you don't have STDCALL vs. C calling mechanisms to deal with on x64.
You have used a large number of non-default compiler options, so it is no simple matter to work out in your mind what the generated instruction codes ought to be. Furthermore, the ability of an IDE such as Visual Studio (and debuggers, in general) to display the call chain can be adversely affected by code generation optimizations. Therefore, you should not conflate a corrupted view of the call chain ("stack") with "stack corruption".
I see no problem with the mov instruction that you flagged. The memory addressed in that instruction is just the second QWORD that was initialized to 0CCCCCCCCH in a loop earlier in the code.
I find the presence of "qword ptr [MREC]" puzzling. If MREC is a simple scalar argument, why should it be saved to memory? In such circumstances, no explanations may be reasonably expected unless you have presented enough of the source code (plus data, how-to instructions, etc.) for someone else to reproduce the claimed problem.
@mecej4, I agree that there should be no problem with mov qword ptr [rbp+0D28C0h],rdi. But one of the first things that tends to happen in a Windows x64 procedure is to save the register-passed arguments in the parameter save area just above the return address in the stack. Thus probably [MREC] = [rsp+0D2950h] = [rbp+0D28E0h], the save area address for the first register argument. This frees up rcx for other uses.
EDIT: Put rcx at the wrong end of the parameter save area. See https://msdn.microsoft.com/en-us/library/ew5tede7.aspx . Now fixed.
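The offsets quoted above can be checked with a little arithmetic. The following is only a sketch: it assumes the frame looks exactly as the posted prologue suggests (push rbp, then sub rsp,0D2940h, then lea rbp,[rsp+70h]) and that the first register argument's home slot sits directly above the return address, per the Windows x64 convention. The names FRAME_SIZE, RBP_BIAS and rbp_relative are made up for illustration.

```python
# Offsets are relative to rsp after the prologue's `sub rsp,0D2940h`.
FRAME_SIZE = 0x0D2940   # sub rsp,0D2940h
RBP_BIAS = 0x70         # lea rbp,[rsp+70h]

saved_rbp = FRAME_SIZE         # `push rbp` stored the old rbp just above the frame
return_addr = saved_rbp + 8    # pushed by the caller's call instruction
rcx_home = return_addr + 8     # first slot of the parameter save (home) area


def rbp_relative(rsp_offset):
    """Convert an rsp-relative offset to the rbp-relative form shown in the listing."""
    return rsp_offset - RBP_BIAS


print(hex(rcx_home))                 # rsp-relative home slot for rcx (MREC)
print(hex(rbp_relative(rcx_home)))   # the same slot expressed relative to rbp

# The flagged store, mov qword ptr [rbp+0D28C0h],rdi, in rsp-relative terms:
rdi_slot = 0x0D28C0 + RBP_BIAS
print(hex(rdi_slot), rdi_slot + 8 <= FRAME_SIZE)  # True means it lies inside the frame
```

Under those assumptions the rdi save slot falls inside the routine's own frame, which is consistent with that mov not being the culprit.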
RO, thanks, now I see. Normally, when I see a symbol as the entire r/m, I think of that symbol as a fixed address of a variable (in the .data section). If the symbol is a macro defined to be an expression such as rsp+0D2950H, it would help if that definition were shown as part of any listing where it is used.
I'm not much good at reading assembler code anymore, but it appears that a lot of CCCCCCCC words are being written here. The corruption appears to occur during the loop; I just don't know why. I don't understand where that size is coming from. There is only one argument, so it can't be storage for anything in the argument list. Maybe it's something in the routine, but that seems unlikely as well. Happy to provide more source if it would help. I was hoping someone who knows assembler better than I do would see what manipulations were being made on the stack here, and maybe why.
For some reason your subroutine is reserving 862528 bytes (0D2940h) on the stack. It's not overwriting existing 0CCCCCCCCh words; rather, it is filling each of those 862528 bytes with the value 0CCh. No corruption should occur while doing this, because the current instance of subroutine restar_old owns this memory below the address rsp pointed to at procedure entry, along with the parameter save area. Now, if a previous procedure pointed a pointer at an unsaved (non-SAVE) local variable or an internal procedure and then returned that pointer, the pointer would actually have undefined association status, yet it might still point to something on the live stack that looks useful until the stack gets overwritten. In that case, as Steve said, the error happened when the [speculated] procedure returned a dead pointer walking, not in subroutine restar_old. It can be difficult to hunt down an error like this.
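To make the size and the fill pattern concrete, here is a small simulation of the posted fill loop. It is only a sketch of what the listing appears to do (dword stores from offset 0D293Ch downward, followed by two explicit stores at rsp and rsp+4); the variable names are illustrative, and nothing here touches real memory.

```python
FRAME_SIZE = 0x0D2940   # bytes reserved by sub rsp,0D2940h

offsets = []            # rsp-relative offsets that receive 0CCCCCCCCh
rax = 0x0D293C          # mov rax,0D293Ch
while True:
    offsets.append(rax)  # mov dword ptr [rsp+rax],0CCCCCCCCh
    rax -= 4             # sub rax,4
    if not rax > 4:      # cmp rax,4 / jg loops again only while rax > 4
        break

offsets += [0, 4]       # the two explicit dword stores at [rsp] and [rsp+4]

bytes_filled = 4 * len(offsets)
print(FRAME_SIZE, bytes_filled)
```

If the listing is read correctly, the loop plus the two trailing stores cover every dword of the reserved area exactly once and never write outside it.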
In Debug mode, the new stack reserve area (for subroutine local data) is initialized with 0CCh as an indicator that the called subroutine has not yet written to those locations. In Release mode this initialization code is not present. During subroutine/function execution, these reserved stack areas may/will get overwritten with valid data (which may or may not contain 0CCh in the bytes). A program that behaves differently between Debug and Release builds (e.g. a crash or "wrong"/different results) is indicative of the use of an uninitialized variable. IOW in Debug mode, 0CCh, 0CCCCh, or 0CCCCCCCCh is being used, whereas in Release mode whatever leftover content was found at those stack locations is used (different content == different result/behavior). Use the runtime diagnostic checks for uninitialized data. These can catch most uses of uninitialized variables. They might not catch a variable that is passed as an argument with the intention of returning a value, is never set by the called routine, and is then used by the caller. That case, though, can be caught at compile time if you properly attribute the dummy argument with INTENT(OUT) in the called routine.
The Debug symbol information includes the address and/or rsp offset of each variable. The debugger is not smart enough to distinguish an arbitrary value from one that happens to resolve to the same address and/or rsp offset as a variable, so it displays the variable's name in either case.