Software Tuning, Performance Optimization & Platform Monitoring
Discussion regarding monitoring and software tuning methodologies, Performance Monitoring Unit (PMU) of Intel microprocessors, and platform updating.

DirectX11: big stalls in the driver of an Intel HD 3000

DZ1
Beginner
1,677 Views

Hi,

I'm currently working on getting an existing DirectX11 application to run well on a laptop with Intel HD Graphics 3000.
On the first attempt, performance was extremely poor. I have read the Sandy Bridge Graphics Developer’s Guide, and of course I have planned some well-known tweaks and optimizations.

However, when I use the Concurrency Visualizer integrated into Visual Studio 2012, my rendering thread appears to stall at seemingly random points in the frame for more than 10 ms. According to the sampled call stack, the stall happens inside the Intel driver, which appears to be waiting on an internal event.

Is this the “normal” behavior of the driver (a flush of the command buffer, or something else)?
If not, are there any particular DirectX11 features known to cause this? Is there a way, using Intel GPA for example, to get more precise information about what is going wrong?

Any help, tips or advice is welcome.

 Thanks in advance.

0 Kudos
26 Replies
Bernard
Valued Contributor I
1,389 Views
>>>But, when I use the Concurrency Visualizer integrated in Visual Studio 2012, it appears that my rendering thread sometimes stalls quite randomly in the frame for more than 10ms, inside the intel driver according to the sampled callstack, which looks like to wait an internal event>>>

Can you post the stalled thread call stack?
0 Kudos
Bernard
Valued Contributor I
1,389 Views
iliyapolak wrote:

>>>But, when I use the Concurrency Visualizer integrated in Visual Studio 2012, it appears that my rendering thread sometimes stalls quite randomly in the frame for more than 10ms, inside the intel driver according to the sampled callstack, which looks like to wait an internal event>>>

Can you post the stalled thread call stack?

Another option is to run your DirectX app under WinDbg and check for a processor hog. Sadly, !runaway is a user-mode command, so to fully inspect the problem and find the culprit you need to combine kernel-mode debugging with user-mode debugging.
0 Kudos
DZ1
Beginner
1,389 Views
iliyapolak wrote:

>>>But, when I use the Concurrency Visualizer integrated in Visual Studio 2012, it appears that my rendering thread sometimes stalls quite randomly in the frame for more than 10ms, inside the intel driver according to the sampled callstack, which looks like to wait an internal event>>>

Can you post the stalled thread call stack?

Examples of call stacks:

ntoskrnl.exe!SwapContext_PatchXRstor+0x103
ntoskrnl.exe!KiSwapContext+0x7a
ntoskrnl.exe!KiCommitThreadWait+0x1d2
ntoskrnl.exe!KeWaitForMultipleObjects+0x26a
dxgmms1.sys!VidSchWaitForEvents+0x9c
dxgmms1.sys!VidSchWaitForCompletionEvent+0x139
dxgmms1.sys!VIDMM_DMA_POOL::WaitDmaBufferNotBusy+0xcc
dxgmms1.sys!VIDMM_DMA_POOL::AcquireBuffer+0x2a1
dxgkrnl.sys!DXGCONTEXT::Render+0x263
dxgkrnl.sys!DxgkRender+0x3e7
win32k.sys!NtGdiDdDDIRender+0x12
ntoskrnl.exe!KiSystemServiceCopyEnd+0x13
wow64win.dll!ZwGdiDdDDIRender+0xa
wow64win.dll!whNtGdiDdDDIRender+0xf9
wow64.dll!Wow64SystemServiceEx+0xd7
wow64cpu.dll!ServiceNoTurbo+0x2d
wow64.dll!RunCpuSimulation+0xa
wow64.dll!Wow64LdrpInitialize+0x429
ntdll.dll! ?? ::FNODOBFM::`string'+0x6d07
ntdll.dll!LdrInitializeThunk+0xe
gdi32.dll!_NtGdiDdDDIRender@4+0x15
d3d11.dll!NDXGI::CDevice::RenderCB+0x1a9
igd10umd32.dll!0x26bd16
igd10umd32.dll!0x26c3e3
igd10umd32.dll!0x2090fd
igd10umd32.dll!0x2111a3
igd10umd32.dll!0x211539
igd10umd32.dll!0x2085a1
igd10umd32.dll!0x2088c2
igd10umd32.dll!0x1fe092

and

ntoskrnl.exe!SwapContext_PatchXRstor+0x103
ntoskrnl.exe!KiSwapContext+0x7a
ntoskrnl.exe!KiCommitThreadWait+0x1d2
ntoskrnl.exe!KeWaitForMultipleObjects+0x26a
dxgmms1.sys!VidSchWaitForEvents+0x9c
dxgmms1.sys!VidSchWaitForCompletionEvent+0x139
dxgmms1.sys!VIDMM_GLOBAL::xWaitOnDMAReferences+0xa2
dxgmms1.sys!VIDMM_GLOBAL::BeginCPUAccess+0x7e3
dxgmms1.sys!VidMmBeginCPUAccess+0x28
dxgkrnl.sys!DXGDEVICE::Lock+0x287
dxgkrnl.sys!DxgkLock+0x22a
win32k.sys!NtGdiDdDDILock+0x12
ntoskrnl.exe!KiSystemServiceCopyEnd+0x13
wow64win.dll!ZwGdiDdDDILock+0xa
wow64win.dll!whNtGdiDdDDILock+0x76
wow64.dll!Wow64SystemServiceEx+0xd7
wow64cpu.dll!ServiceNoTurbo+0x2d
wow64.dll!RunCpuSimulation+0xa
wow64.dll!Wow64LdrpInitialize+0x429
ntdll.dll! ?? ::FNODOBFM::`string'+0x6d07
ntdll.dll!LdrInitializeThunk+0xe
gdi32.dll!_NtGdiDdDDILock@4+0x15
d3d11.dll!NDXGI::CDevice::LockCB+0x4c
igd10umd32.dll!0x2073dc
igd10umd32.dll!0x22cc3f
igd10umd32.dll!0x20df39
igd10umd32.dll!0x201b24

Thanks for the help.
0 Kudos
Bernard
Valued Contributor I
1,389 Views
Your rendering thread evidently entered a synchronous wait state and was swapped out. The main problem is locating the culprit responsible for the stall. Sadly, I cannot find any information about this driver and its functions ('dxgmms1.sys!VIDMM_GLOBAL'). Note that a .sys module runs in kernel mode, so this is not the user-mode driver that receives commands from the DirectX runtime (that one is igd10umd32.dll at the bottom of the stack); my bet is that it belongs to the DirectX graphics kernel.

These three function calls are crucial for understanding the cause of the extended wait:
dxgmms1.sys!VIDMM_GLOBAL::xWaitOnDMAReferences+0xa2
dxgmms1.sys!VIDMM_GLOBAL::BeginCPUAccess+0x7e3
dxgmms1.sys!VidMmBeginCPUAccess+0x28

While going through the WDK display driver documentation, I was not able to find any relevant information about the dxgmms1.sys driver. Judging by the call stack, I think the cause could be related to the DMA buffer(s). If I'm not wrong, the display miniport driver is responsible for allocating DMA buffers. Moreover, the miniport driver needs to lock the DMA buffer pages, so this could be the issue, i.e. waiting for the lock to be obtained or released. It is very hard to understand exactly what those functions are doing without putting a breakpoint on one of them and single-stepping through the disassembly.

My advice is to run the GpuView.exe utility, which collects statistics about GPU and CPU performance; it may let you pinpoint the source of your problem. Another option is to run your app under WinDbg and issue the !runaway 3 command to trace the processor hog. A further option is to do kernel-mode debugging of the stalled process.
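
To make the frames of interest easier to pick out, the call stacks posted earlier in the thread can be split into (module, symbol) pairs and filtered by module. The following is only a rough sketch in Python. It assumes each frame has the form module.ext!symbol and that frames are separated by spaces or newlines; the helper names split_frames and frames_in are hypothetical and not part of any debugger or Microsoft tool.

```python
import re

# Assumption: a frame looks like "module.ext!symbol+0xoffset", where the
# module ends in .exe, .sys or .dll. A symbol may itself contain spaces
# (e.g. "ntdll.dll! ?? ::FNODOBFM::`string'+0x6d07"), so a frame ends only
# where the next "module.ext!" token begins, or at the end of the input.
_FRAME = re.compile(
    r"(\S+?\.(?:exe|sys|dll))!(.*?)(?=\s+\S+?\.(?:exe|sys|dll)!|\s*$)",
    re.S,
)


def split_frames(stack):
    """Return (module, symbol) pairs in the order they appear in the text."""
    return [(m.group(1), m.group(2).strip()) for m in _FRAME.finditer(stack)]


def frames_in(stack, module):
    """Return only the symbols that belong to the given module."""
    return [sym for mod, sym in split_frames(stack) if mod == module]


# Shortened excerpt of the second call stack posted in this thread.
SAMPLE = (
    "dxgmms1.sys!VIDMM_GLOBAL::xWaitOnDMAReferences+0xa2 "
    "dxgmms1.sys!VIDMM_GLOBAL::BeginCPUAccess+0x7e3 "
    "dxgkrnl.sys!DxgkLock+0x22a "
    "ntdll.dll! ?? ::FNODOBFM::`string'+0x6d07 "
    "d3d11.dll!NDXGI::CDevice::LockCB+0x4c "
    "igd10umd32.dll!0x2073dc"
)

if __name__ == "__main__":
    for module, symbol in split_frames(SAMPLE):
        print(module, symbol)
```

Calling frames_in(SAMPLE, "dxgmms1.sys") keeps only the dxgmms1.sys frames, which is a quick way to compare the wait path of the Render stack with that of the Lock stack.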
0 Kudos
Bernard
Valued Contributor I
1,389 Views
>>>wow64win.dll!ZwGdiDdDDILock+0xa>>> Are you running a 32-bit app on 64-bit Windows?
0 Kudos
Bernard
Valued Contributor I
1,389 Views
Hi djiz! Do you have a checked build of dxgkrnl.sys? If you do, you can enable the extended logging feature, whose output is displayed on debugger break-in. If you do not, you can still log those errors. As I stated in my previous post, GpuView.exe is an essential tool in the case of DMA buffer errors.
0 Kudos
DZ1
Beginner
1,389 Views
iliyapolak wrote:

>>>wow64win.dll!ZwGdiDdDDILock+0xa>>>

Are you running a 32-bit app on 64-bit Windows?

Hi, thanks for your answers. You're right, I'm running a 32-bit app on 64-bit Windows 7. Could that be related to the stalls?
iliyapolak wrote:

Do you have a checked build of dxgkrnl.sys? If you do, you can enable the extended logging feature, whose output is displayed on debugger break-in.
If you do not, you can still log those errors.

How can I do that? As for your other suggestion, I tried GPUView briefly a year ago, but at the time I thought it was too complex for my needs. With the current issue, it is probably a better fit. I'll give it a deeper try.
0 Kudos
Bernard
Valued Contributor I
1,389 Views
>>>Hi, Thanks for your answers. You're right, I'm running a 32 bit app on a 64-bit Windows 7. Could it be related to the stalls ?>>> It is hard to say whether this is the reason for the thread stalls. WoW64 simply hooks, intercepts and translates your 32-bit system calls. If you could run your app on 32-bit Windows, that would be great: we could then rule out WoW64.dll as the cause.
0 Kudos
Bernard
Valued Contributor I
1,389 Views

>>>How can I do that ?>>> For this, a so-called checked build of Windows is needed. It is not free, but if you can somehow obtain at least a checked dxgkrnl.sys, it could be very helpful in your case.

P.S.

The checked build of Windows is available to MSDN subscribers only.

0 Kudos
Bernard
Valued Contributor I
1,389 Views

>>>As for your other suggestion, I tried GPUView briefly a year ago, but at the time I thought it was too complex for my needs. With the current issue, it is probably a better fit. I'll give it a deeper try.>>>

I strongly advise you to use the GPUView tool; it can gather information about GPU performance. A more obscure alternative is to perform full-blown kernel debugging.

0 Kudos
Bernard
Valued Contributor I
1,389 Views

@djiz

Do you have any updates regarding your stalled application?

0 Kudos
DZ1
Beginner
1,389 Views

Hi,

I've spent some time exploring GPUView. It confirms the big stalls in the rendering thread, and they seem to occur while the thread is in kernel mode (inside the driver).
I didn't get much more information, but I observed that the DMA packet containing the present stays queued in the hardware for a long time, and the previous DMA packet looks like a huge one. Intuitively, I was expecting many more, smaller DMA packets.

PS: I've searched for "checked build" in the MSDN subscriber downloads, but without success.

0 Kudos
Bernard
Valued Contributor I
1,389 Views

Hi dijz!

The checked build of Windows is not free: to download it you must be an MSDN subscriber, and this will cost you $700. For efficient dxgkrnl.sys debugging you need the checked build of the DirectX kernel driver, which is a special version filled with debugger assertions. This makes driver debugging easier, because the debugger breaks in whenever an assertion is triggered.

Can GPUView provide you with the DMA-related call stack? Can you follow the function that calls the HAL DMA functions? IIRC, DMA buffers are managed by HAL.DLL routines. As I stated earlier, the miniport driver, the one that works with your video hardware, is responsible for DMA allocations. I do not know exactly how the Intel miniport driver accesses DMA functionality. As I told you earlier, one way to dig deep into those DMA-related features is to perform kernel debugging on your stalled thread using the !dma command. You can also run Driver Verifier, which will help you test DMA functionality. I strongly advise you to run Driver Verifier and test DMA. Please post the results.

0 Kudos
Bernard
Valued Contributor I
1,389 Views

You can find a lot of information about Driver Verifier at this link: http://support.microsoft.com/kb/244617

0 Kudos
Bernard
Valued Contributor I
1,389 Views

Did you test your display driver with Driver Verifier?

0 Kudos
Bernard
Valued Contributor I
1,389 Views

@dijz

Do you have any updates?

0 Kudos
DZ1
Beginner
1,389 Views

Hi,

Sorry. I was very busy with other things these days, but when I have more time to spend on this, I'll post updates here.
Thanks again. 

0 Kudos
Bernard
Valued Contributor I
1,389 Views

DZ wrote:

Hi,

Sorry. I was very busy with other things these days, but when I have more time to spend on this, I'll post updates here.
Thanks again. 

OK, I will be waiting for any updates.

By the way, are you a DirectX developer?

0 Kudos
Bernard
Valued Contributor I
1,389 Views

Hi

Any updates on the status of your problem?

0 Kudos
DZ1
Beginner
1,205 Views

Hi,

Unfortunately, still no updates on this because of other high-priority tasks. I'm sure you know what I mean.
To answer your question: yes, I'm a DirectX developer, but not only that. More generally, I'm a graphics developer. I have also worked a lot on consoles, which is why I get quite frustrated when working on PC with so many different "opaque" architectures and unknown driver-side or OS-side behavior ;)

0 Kudos