I'm using VTune Amp XE 2015, build 367959 and am running into some problems instrumenting our code with tasks.
When we run the application under VTune, it crashes with a stack overflow in pinvm.dll. Without VTune, it runs fine.
The actual overflow stack has a bunch of tpsstool.dll entries in it. When I attempt to load symbols for C:\Program Files (x86)\Intel\VTune Amplifier XE 2015\bin32\tpsstool.dll (in the Visual Studio 2012 debugger) and point it at the tpsstool.pdb found in C:\Program Files (x86)\Intel\VTune Amplifier XE 2015\bin32\runtime, I am told that "A matching symbol file was not found in this folder."
This leads me to believe that the .dll and .pdb aren't from the same build.
shannon
P.S. I don't blame tpsstool for the crash, I just thought you should be notified that the builds do not match. It would be convenient to see what's going on in there while debugging. It would also be nice to have the symbols for pinvm.dll for the same reason.
I have two questions:
1. Is it possible that the prior version was not completely uninstalled, and that the new version was simply copied over it? That could cause this.
2. Can you use the symchk.exe utility (from the WinDbg package) to check whether the PDB file matches the binary? For example, "symchk.exe test.exe /v /s path_for_pdb"
1. As far as I know, this is on a clean install.
2. Results of symchk indicate that tpsstool.pdb might be wanting some other pdbs?
C:\>"C:\Program Files (x86)\Windows Kits\8.0\Debuggers\x86\symchk" "C:\Program Files (x86)\Intel\VTune Amplifier XE 2015\bin32\tpsstool.dll" /v /s "C:\Program Files (x86)\Intel\VTune Amplifier XE 2015\bin32\runtime\tpsstool.pdb"
[SYMCHK] Searching for symbols to C:\Program Files (x86)\Intel\VTune Amplifier XE 2015\bin32\tpsstool.dll in path C:\Program Files (x86)\Intel\VTune Amplifier XE 2015\bin32\runtime\tpsstool.pdb
DBGHELP: Symbol Search Path: C:\Program Files (x86)\Intel\VTune Amplifier XE 2015\bin32\runtime\tpsstool.pdb
[SYMCHK] Using search path "C:\Program Files (x86)\Intel\VTune Amplifier XE 2015\bin32\runtime\tpsstool.pdb"
DBGHELP: No header for C:\Program Files (x86)\Intel\VTune Amplifier XE 2015\bin32\tpsstool.dll. Searching for image on disk
DBGHELP: C:\Program Files (x86)\Intel\VTune Amplifier XE 2015\bin32\tpsstool.dll - OK
DBGHELP: C:\Program Files (x86)\Intel\VTune Amplifier XE 2015\bin32\runtime\tpsstool.pdb\amplxe-tpss-collector-tpsstool.pdb - file not found
DBGHELP: C:\Program Files (x86)\Intel\VTune Amplifier XE 2015\bin32\runtime\tpsstool.pdb\dll\amplxe-tpss-collector-tpsstool.pdb - file not found
DBGHELP: C:\Program Files (x86)\Intel\VTune Amplifier XE 2015\bin32\runtime\tpsstool.pdb\symbols\dll\amplxe-tpss-collector-tpsstool.pdb - file not found
DBGHELP: tpsstool - no symbols loaded
[SYMCHK] MODULE64 Info ----------------------
[SYMCHK] Struct size: 1680 bytes
[SYMCHK] Base: 0x55000000
[SYMCHK] Image size: 4509696 bytes
[SYMCHK] Date: 0x53e154c1
[SYMCHK] Checksum: 0x004247a4
[SYMCHK] NumSyms: 0
[SYMCHK] SymType: SymNone
[SYMCHK] ModName: tpsstool
[SYMCHK] ImageName: C:\Program Files (x86)\Intel\VTune Amplifier XE 2015\bin32\tpsstool.dll
[SYMCHK] LoadedImage: C:\Program Files (x86)\Intel\VTune Amplifier XE 2015\bin32\tpsstool.dll
[SYMCHK] PDB: ""
[SYMCHK] CV: RSDS
[SYMCHK] CV DWORD: 0x53445352
[SYMCHK] CV Data: C:\bb\INNLphep2w6r\b\b\tmpec6bkc\build\build_release_win32-x86_icl_13.1_mstools_9.0\tpss.collector.tpsstool\amplxe-tpss-collector-tpsstool.pdb
[SYMCHK] PDB Sig: 0
[SYMCHK] PDB7 Sig: {D464B20C-392A-4DB1-B87B-1D8610FAEA4B}
[SYMCHK] Age: 1
[SYMCHK] PDB Matched: TRUE
[SYMCHK] DBG Matched: TRUE
[SYMCHK] Line nubmers: FALSE
[SYMCHK] Global syms: FALSE
[SYMCHK] Type Info: FALSE
[SYMCHK] ------------------------------------
SymbolCheckVersion 0x00000002
Result 0x00010001
DbgFilename tpsstool.dbg
DbgTimeDateStamp 0x00000000
DbgSizeOfImage 0x00000000
DbgChecksum 0x00000000
PdbFilename C:\bb\INNLphep2w6r\b\b\tmpec6bkc\build\build_release_win32-x86_icl_13.1_mstools_9.0\tpss.collector.tpsstool\amplxe-tpss-collector-tpsstool.pdb
PdbSignature {D464B20C-392A-4DB1-B87B-1D8610FAEA4B}
PdbDbiAge 0x00000001
[SYMCHK] [ 0x00000000 - 0x00010001 ] Checked "C:\Program Files (x86)\Intel\VTune Amplifier XE 2015\bin32\tpsstool.dll"
SYMCHK: tpsstool.dll FAILED - amplxe-tpss-collector-tpsstool.pdb mismatched or not found
SYMCHK: FAILED files = 1
SYMCHK: PASSED + IGNORED files = 0
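For readers following the symchk output: the "CV: RSDS" record embedded in the DLL carries the PDB file name, a GUID signature, and an age, and the debugger looks for a PDB with exactly that name and identity. The log shows it searching for amplxe-tpss-collector-tpsstool.pdb, not tpsstool.pdb. The Python sketch below decodes such a record and builds the directory key that symbol stores use. It is a minimal illustration, not symchk's own logic; the function names parse_rsds and symbol_key are hypothetical, and the sample record is assembled from the values printed in the log.

```python
import struct
import uuid


def parse_rsds(record: bytes):
    """Decode a CodeView 'RSDS' debug record: signature GUID, age, PDB path."""
    if record[:4] != b"RSDS":
        raise ValueError("not an RSDS CodeView record")
    # The GUID is stored in the little-endian field layout Windows uses.
    guid = uuid.UUID(bytes_le=record[4:20])
    (age,) = struct.unpack_from("<I", record, 20)
    # The PDB path follows as a NUL-terminated string.
    path = record[24:].split(b"\0", 1)[0].decode("utf-8", "replace")
    return guid, age, path


def symbol_key(guid: uuid.UUID, age: int) -> str:
    """Build the symbol-store key: 32 hex digits of the GUID followed by the age in hex."""
    return f"{guid.hex.upper()}{age:X}"


# Sample record assembled from the values in the symchk log (assumption: path shortened
# to the bare file name for brevity).
sample_guid = uuid.UUID("D464B20C-392A-4DB1-B87B-1D8610FAEA4B")
sample = (b"RSDS" + sample_guid.bytes_le + struct.pack("<I", 1)
          + b"amplxe-tpss-collector-tpsstool.pdb\0")

guid, age, path = parse_rsds(sample)
print(path, symbol_key(guid, age))
```

The resulting key is the same string that appears in the SYMSRV lookup paths later in the thread.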
Hi Peter, Shannon - Does the faulting stack actually have pinvm.dll on it? Is the overflow getting reported at the same address every time?
Is there any way you could get the Process Explorer tool from Microsoft* and upload a full memory dump (right-click, full dump)?
Note, this is not related to the question about the symbols. I'm not trying to hijack that, just would like to dig into the overflow issue in pinvm.dll if you are willing.
Addendum to my previous post: The reason I'm asking is because I think you may be misreading the faulting stack. The faulting stack should have pinvm.dll on it. It's possible that if you are trying to debug this, you are looking at the debug break thread and not the overflow thread.
In any case, if the instruction pointer (IP) address in the error message is the same every time and the problem is repeatable, it may be a recursion problem. Sometimes these can be difficult to diagnose, so I would love to have a dump of that if possible.
Thanks!
Yes, the actual crash is in pinvm.dll. The tpsstool.pdb thing was just something I noticed along the way. I wasn't sure that the stack overflow wasn't caused by some multithreading problems of our own (though everything seems fine when not running in VTune), but if recursion issues are not uncommon then I will gladly provide a dump.
It happens consistently during our DirectX 11 startup. I can get a dump for you, but it will likely be pretty large. Is there a good place for me to provide that?
UPDATE: The zipped dump is ~102MB.
shannon
Hi - Yes, you can upload to me as a private message. Put it in a compressed archive to save space. Thanks!
Although symchk reported that tpsstool.dll did not match tpsstool.pdb (amplxe-tpss-collector-tpsstool.pdb not found), I doubt it was due to the DLL's dependencies.
The crash happened in pinvm.dll, so the symbol mismatch is not the major issue. The developers may need log files to investigate errors in the pin tool. I think Bob will tell you how to set environment variables before running VTune(TM) Amplifier to generate log files.
>>>The actual overflow stack has a bunch of tpsstool.dll entries in it>>>
Can you post the call stack? WinDbg is better suited for this kind of troubleshooting.
>>>When we run the application under VTune, it crashes with a stack overflow in pinvm.dll. Without VTune, it runs fine>>>
You can set WinDbg as the default postmortem debugger with this command: <windbg directory> windbg -I
Do not forget to associate WinDbg with the various dump file types: <windbg dir> windbg -IA
You should also enable breaking on exception.
http://msdn.microsoft.com/en-us/library/windows/hardware/ff558822(v=vs.85).aspx
Hi Shannon - Things are still a little garbled. The exception record is corrupt, and I think the dump is also unreliable because procmon created a 64-bit dump. I was able to dig out the exception code, though, so I at least know that we have a stack allocation failure. You can see that the code actually sits in the record link slot.
0:077> dt ntdll!_EXCEPTION_RECORD 29cb5608
+0x000 ExceptionCode : 0n701191700
+0x004 ExceptionFlags : 0x29cb5664
+0x008 ExceptionRecord : 0xc00000fd`00000000 _EXCEPTION_RECORD
+0x010 ExceptionAddress : (null)
+0x018 NumberParameters : 0x54206fb7
+0x020 ExceptionInformation : [15] 0x29cb2000`00000000
0:077> !error c00000fd
Error code: (NTSTATUS) 0xc00000fd (3221225725) - A new guard page for the stack cannot be created.
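As a small illustration of the step above, the Python sketch below pulls the status code out of the upper half of the garbled 64-bit "ExceptionRecord" field and splits it into the standard NTSTATUS bit fields. The helper name decode_ntstatus is hypothetical; the input value is the one shown in the dt output.

```python
def decode_ntstatus(status: int) -> dict:
    """Split a 32-bit NTSTATUS value into its documented bit fields."""
    return {
        "severity": (status >> 30) & 0x3,    # 3 means error
        "customer": (status >> 29) & 0x1,    # 0 means Microsoft-defined
        "facility": (status >> 16) & 0xFFF,  # 0 means no specific facility
        "code": status & 0xFFFF,
    }


# Value displayed in the ExceptionRecord slot: 0xc00000fd`00000000.
garbled_link_field = 0xC00000FD00000000

# With 32-bit data read through a 64-bit layout, the real code lands in the high dword.
status = garbled_link_field >> 32

print(hex(status), status, decode_ntstatus(status))
```

The decimal form matches the 3221225725 that !error prints for the stack overflow status.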
I have two tools I would like you to use instead: Application Verifier (appverif.exe) with its "dirty-stacks" check, just in case the stack is corrupt, coupled with the ProcDump tool (for a 32-bit dump on a 64-bit OS).
Here is ProcDump (just a handy tool to have around anyway):
http://technet.microsoft.com/en-us/sysinternals/dd996900.aspx
Application Verifier should be installed already, but if not, you can download it here:
http://www.microsoft.com/en-us/download/details.aspx?id=20028
I'm attaching an image to show you where "dirty-stacks" is located. Add your application from the File menu, then deselect "Basics" and open the Misc node to select "dirty stacks." Do this before you attempt another repro. Then use procdump to dump the process. This is a command-line tool, used like this: procdump notepad.exe.
Then, please upload to me in private as before.
And, WRT Peter's logging, I would like to wait on that until I have a chance to verify we have a solid dirty-stacks dump.
VTune says:
Cannot start analysis because Image File Execution Options (IFEO) are enabled for this application. Suggestion: Use the Global Flags Editor (GFlags) to disable IFEO for this application.
Rather than having VTune launch the app, I launched the app and attached VTune. When I did so, it instantly crashed. Regardless of where I pause the startup it crashes as soon as I attach VTune. So it seems that option and VTune are incompatible.
shannon
Based on my analysis of your last dump, I would like to see if we can rule out an infinite or recursive loop to be sure. I think this would help chances for a more timely workaround or fix, if it's found to be a defect in our product.
I thought a way to do that might be to use editbin to change the default stack size. You can see below that the last allocation hit the limit and the last several 1000h-sized blocks of stack space are all zeros.
If you are amenable to this, both editbin and dumpbin can be found in the Visual Studio* toolset.
EditBin
http://msdn.microsoft.com/en-us/library/35yc2tc3.aspx
DumpBin:
http://msdn.microsoft.com/en-us/library/9ha900t8.aspx
You could check the stack sizes in the exe using dumpbin /headers.
In my test case, they were set to 1000 commit and 100000 reserve (both hexadecimal), like so:
100000 size of stack reserve
1000 size of stack commit
100000 size of heap reserve
1000 size of heap commit
Using C-style hexadecimal notation (0x prefix), I changed it to this:
200000 size of stack reserve
2000 size of stack commit
With this command: editbin /STACK:0x200000,0x2000
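For anyone who wants to check these header fields without dumpbin, here is a minimal Python sketch that reads the stack and heap reserve/commit sizes straight from a PE image's optional header. It assumes the standard PE32/PE32+ layout (the four sizes start at offset 72 of the optional header); the function name read_stack_sizes is hypothetical, and this is an illustration, not a replacement for dumpbin.

```python
import struct


def read_stack_sizes(image: bytes) -> dict:
    """Return stack/heap reserve and commit sizes from a PE image's optional header."""
    if image[:2] != b"MZ":
        raise ValueError("missing DOS header")
    # e_lfanew at 0x3C points to the 'PE\0\0' signature.
    (e_lfanew,) = struct.unpack_from("<I", image, 0x3C)
    if image[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    # The optional header follows the 4-byte signature and the 20-byte COFF header.
    opt = e_lfanew + 4 + 20
    (magic,) = struct.unpack_from("<H", image, opt)
    if magic == 0x10B:      # PE32: four 32-bit fields
        fmt = "<4I"
    elif magic == 0x20B:    # PE32+: four 64-bit fields
        fmt = "<4Q"
    else:
        raise ValueError("unknown optional header magic")
    names = ("stack_reserve", "stack_commit", "heap_reserve", "heap_commit")
    return dict(zip(names, struct.unpack_from(fmt, image, opt + 72)))
```

Reading an executable's bytes with open(path, "rb") and passing them in should report the same values that dumpbin /headers prints, for example 0x200000 and 0x2000 after the editbin change described above.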
I think the problem line is the sub 1000h allocation just above and not the current IP.
0:077> !teb
TEB at ffed1000
ExceptionList: 361aed58
StackBase: 361b0000
StackLimit: 361a1000
0:077> ub eip l10
54206fa4 3bc8 cmp ecx,eax <<<<<----- begin loop
54206fa6 720a jb pinvm!(54206fb2) <<<<<-----
54206fa8 8bc1 mov eax,ecx
54206faa 59 pop ecx
54206fab 94 xchg eax,esp
54206fac 8b00 mov eax,dword ptr [eax]
54206fae 890424 mov dword ptr [esp],eax
54206fb1 c3 ret
54206fb2 2d00100000 sub eax,1000h <<<<<---------- (problem is here, I think, but would like to confirm)
0:077> r
eax=361a2000 ebx=00000000 ecx=361a29f8 edx=00000005 esi=06c49160 edi=00000001
eip=54206fb7 esp=361a5ee8 ebp=00000000 iopl=0 nv up ei pl nz na pe nc
cs=0023 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00010206
54206fb7 8500 test dword ptr [eax],eax ds:002b:361a2000=00000000
0:077> u
54206fb7 8500 test dword ptr [eax],eax
54206fb9 ebe9 jmp pinvm!(54206fa4) <<<<<----- end loop
0:077> !address esp
Usage: Stack
Allocation Base: 361a0000
Base Address: 361a1000
End Address: 361b0000
Region Size: 0000f000
Type: 00020000 MEM_PRIVATE
State: 00001000 MEM_COMMIT
Protect: 00000004 PAGE_READWRITE
More info: ~77k
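The instruction sequence above closely resembles the page-by-page stack probe that MSVC-built x86 code runs before entering a function with a large frame (the CRT helper usually named _chkstk). The Python sketch below simulates that loop under that assumption. The function name chkstk_probe is hypothetical, and the request size 0x34f4 is not stated anywhere in the thread: it is derived from the register dump as esp + 4 - ecx, assuming esp already reflects a push ecx at the helper's entry that is not visible in the listing.

```python
PAGE = 0x1000


def chkstk_probe(esp: int, size: int):
    """Simulate the probe loop; return the new stack pointer and the pages touched."""
    # lea ecx,[esp+4]; sub ecx,eax  (clamped to 0 on underflow by sbb/not/and)
    target = max((esp + 4) - size, 0)
    # mov eax,esp; and eax,0FFFFF000h
    page = esp & 0xFFFFF000
    touched = []
    # cmp ecx,eax; jb -> keep probing while the target lies below the current page
    while target < page:
        page -= PAGE           # sub eax,1000h
        touched.append(page)   # test dword ptr [eax],eax (touches the page)
    return target, touched


# Register values from the first dump; the size is the derived assumption noted above.
target, touched = chkstk_probe(0x361A5EE8, 0x34F4)
print(hex(target), [hex(p) for p in touched])
```

Under these assumptions the probe walks three pages, ends at 361a2000 (the value of eax at the fault), and would exit on the next comparison because ecx (361a29f8) is not below that page. That reading points to a single frame request of roughly 13 KB reaching the end of the thread's stack, not an unbounded loop, though only the engineering team can confirm it.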
Used editbin as you indicated:
200000 size of stack reserve
2000 size of stack commit
100000 size of heap reserve
1000 size of heap commit
And it crashed in the same code, it seems. If you want the dump I can provide that as well.
0:076> !teb
TEB at ffed4000
ExceptionList: 382fe988
StackBase: 38300000
StackLimit: 382f1000
SubSystemTib: 00000000
FiberData: 00001e00
ArbitraryUserPointer: 0067c8b8
Self: ffed4000
EnvironmentPointer: 00000000
ClientId: 00005aac . 000058d4
RpcHandle: 00000000
Tls Storage: 346d6130
PEB Address: fffde000
LastErrorValue: 0
LastStatusValue: c000000d
Count Owned Locks: 0
HardErrorMode: 0
0:076> ub eip l10
pinvm!CrtEnableThreadCallbacks+0xb1bf1:
54206f91 8d4c2404 lea ecx,[esp+4]
54206f95 2bc8 sub ecx,eax
54206f97 1bc0 sbb eax,eax
54206f99 f7d0 not eax
54206f9b 23c8 and ecx,eax
54206f9d 8bc4 mov eax,esp
54206f9f 2500f0ffff and eax,0FFFFF000h
54206fa4 3bc8 cmp ecx,eax
54206fa6 720a jb pinvm!CrtEnableThreadCallbacks+0xb1c12 (54206fb2)
54206fa8 8bc1 mov eax,ecx
54206faa 59 pop ecx
54206fab 94 xchg eax,esp
54206fac 8b00 mov eax,dword ptr [eax]
54206fae 890424 mov dword ptr [esp],eax
54206fb1 c3 ret
54206fb2 2d00100000 sub eax,1000h
0:076> u
pinvm!CrtEnableThreadCallbacks+0xb1c17:
54206fb7 8500 test dword ptr [eax],eax
54206fb9 ebe9 jmp pinvm!CrtEnableThreadCallbacks+0xb1c04 (54206fa4)
54206fbb 8bff mov edi,edi
54206fbd 55 push ebp
54206fbe 8bec mov ebp,esp
54206fc0 51 push ecx
54206fc1 51 push ecx
54206fc2 57 push edi
0:076> !address esp
Mapping file section regions...
Mapping module regions...
Mapping PEB regions...
Mapping TEB and stack regions...
SYMSRV: C:\Program Files (x86)\Windows Kits\8.0\Debuggers\x64\sym\amplxe-tpss-collector-tpsstool.pdb\D464B20C392A4DB1B87B1D8610FAEA4B1\amplxe-tpss-collector-tpsstool.pdb not found
SYMSRV: C:\Program Files (x86)\Windows Kits\8.0\Debuggers\x64\sym\amplxe-tpss-collector-tpsstool.pdb\D464B20C392A4DB1B87B1D8610FAEA4B1\amplxe-tpss-collector-tpsstool.pdb not found
SYMSRV: http://msdl.microsoft.com/download/symbols/amplxe-tpss-collector-tpsstool.pdb/D464B20C392A4DB1B87B1D8610FAEA4B1/amplxe-tpss-collector-tpsstool.pdb not found
DBGHELP: C:\bb\INNLphep2w6r\b\b\tmpec6bkc\build\build_release_win32-x86_icl_13.1_mstools_9.0\tpss.collector.tpsstool\amplxe-tpss-collector-tpsstool.pdb - file not found
*** ERROR: Symbol file could not be found. Defaulted to export symbols for tpsstool.dll -
DBGHELP: tpsstool - export symbols
DBGHELP: KERNELBASE - public symbols C:\Program Files (x86)\Windows Kits\8.0\Debuggers\x64\sym\wkernelbase.pdb\5ED1648668CE459B9C9C4093002426EF1\wkernelbase.pdb
Mapping heap regions...
Mapping page heap regions...
Mapping other regions...
Mapping stack trace database regions...
Mapping activation context regions...
Usage: Stack
Base Address: 382f1000
End Address: 38300000
Region Size: 0000f000
State: 00001000 MEM_COMMIT
Protect: 00000004 PAGE_READWRITE
Type: 00020000 MEM_PRIVATE
Allocation Base: 382f0000
Allocation Protect: 00000004 PAGE_READWRITE
More info: ~76k
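Both dumps report the same 0000f000-byte stack region even though the executable's default reserve was doubled in between. The Python sketch below works through that arithmetic using the StackBase, StackLimit, and Allocation Base values from the two dumps; the helper name stack_summary is hypothetical.

```python
def stack_summary(stack_base: int, stack_limit: int, allocation_base: int) -> dict:
    """Compute committed and reserved stack sizes from TEB and !address values."""
    return {
        "committed": stack_base - stack_limit,      # StackBase minus StackLimit
        "reserved": stack_base - allocation_base,   # whole allocation below StackBase
    }


# First dump (before editbin) and second dump (after editbin /STACK:0x200000,0x2000).
before = stack_summary(0x361B0000, 0x361A1000, 0x361A0000)
after = stack_summary(0x38300000, 0x382F1000, 0x382F0000)

print({k: hex(v) for k, v in before.items()})
print({k: hex(v) for k, v in after.items()})
```

Both threads come out at 0xf000 bytes committed inside a 0x10000-byte (64 KB) allocation. One possible reading, which is an inference and not something confirmed in this thread, is that the faulting thread was created with its own explicit stack size, in which case the editbin /STACK default would not apply to it.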
Okay, thanks for doing that. It's what I expected, but it's nice to be sure. That is all I need for now.
@Bob
What does the value ecx=361a29f8 point to? It seems strange that the loop counter is decremented by 1000h. Or is that some kind of allocation of 4096 bytes on every loop iteration?
Yeah, I'm not sure. I'm not really attempting to determine exactly what's happening, only to gather a preponderance of evidence to demonstrate a likelihood. It seems to be a recursive or iterative 4k stack allocation that's causing the error. We will have to wait and see what the engineering team says.
>>>It seems to be a recursive or iterative 4k stack allocation that's causing the error>>>
I also suspect iterative stack allocation to be a reason for the stack overflow error.
I guess I would want to caution that we not read too much into this. It's possible we are just seeing the obvious. The error message says that it can't allocate a guard page. The default page size on Windows is 4k. The fact that pin is an injector/interceptor kind of application means that we could just be seeing the system failing to allocate another page or enough pages for something entirely unknown. It may well be reported in the debugger as pin code, or a fault in the pin DLL, but that could be simply because pin has placed itself in the middle of the call chain in order to do what it does.
It could be interesting to see which exports/imports are hooked or intercepted by the pin tool. I think that would be a good starting point for investigating the problem.
Hi Shannon - It could be a coincidence, but I've discovered another issue with the exact same problem on Windows 7 x64. In both cases, the NVIDIA driver is on the faulting stack and is of the same version. It might be worth upgrading the driver as a possible resolution while we continue to investigate the issue internally. In your case, the base pointer is being used as a general-purpose register, so I can't make much sense of the stack other than to say that there are addresses which fall in the range of the driver.
Here, I believe, would be the latest driver for your system.
http://www.nvidia.com/download/driverResults.aspx/79890/en-us