I've been doing a little profiling work on my game engine and have noticed that an inordinate amount of time can be spent inside the NVD3DUM.DLL module; VTune reports that up to 50% of my CPU workload happens inside it. For now I have resigned myself to the idea that this is probably the driver preparing and submitting my graphics data to the GPU, but that's just a guess, and there is almost no information available about this module. There also do not appear to be any Visual Studio symbol (PDB) files available for it, so I can't tell what it's doing when I drill down into it, which is another shroud of mystery to tackle.
Does anyone have any links or information on this module: what it is doing, how it can be optimized, and perhaps how to get more symbol-style information on it, at least enough to give me a clue about which general area I should be optimizing? If anyone posts something useful, I promise to add it to my daily blog and lavish you with praise for digging out information on what seems to be a black-box library that is stealing more than half of my processing cycles!
All I can guess is that NVD3DUM stands for NVIDIA DirectX 3D Unmanaged Driver, but beyond that, it's a mystery module!
@Lee
My last answer #20 was related more to your previous posts where you were talking about the slow performance.
Do you have any updates?
Hi Lee!
Sorry for the late response. Did you know that GPU profiling is available right in VTune? Assuming you use a recent version, there is an additional option in the "Hotspot" and "Advanced Hotspot" analyses: "Analyze GPU Usage". It collects data similar to what GPUView collects and presents both CPU and GPU profiling data in the same view, so you will be able to tell whether your application is CPU bound or GPU bound.
If you use GPA Frame Analyzer, try GPA Platform Analyzer (versions released in 2014). It shows you the DMA packet queue and also the DirectX calls, so you will be able to find the reasons for the GPU activity.
>>>Assuming you use recent versions there is an additional option in "Hotspot" and "Advanced Hotspot" analysis - "Analyze GPU Usage">>>
I was under the assumption that the "Analyze GPU Usage" option is only available for Intel GPUs.
It works for any GPU. For Intel Graphics we are able to provide meaningful names for the GPU engines, while on other GPUs there will be just numbers, as in GPUView.
Thank you for your answer.
My current problem is that the game I need to test consumes about 1 GB of system memory. It runs fine on its own, but when I run it through the GPA Monitor it crashes. I suspect the monitor uses extra memory to capture its data, so the game cannot finish initializing and simply runs out of memory (it is a 32-bit application). Do you know any GPA tricks to ensure it does NOT consume any extra memory while monitoring the target application? I will continue my experiments, maybe with a smaller game, but ideally I want to performance tune the big stuff!!
Hi Lee!
When you say "run through the GPA Monitor", which tool do you use: Frame Analyzer, Platform Analyzer or System Analyzer? Also, which version of GPA do you use?
Before you can use GPA Platform Analyzer, you need a .gpa_trace file as the source of the analysis. To create this file I ran the Intel(R) GPA Monitor and selected Analyze Application... When I do this with my 1 GB application, the application crashes due to lack of memory. If I run the application normally, it works fine. I suspect the Intel(R) GPA Monitor is consuming some memory here.
>>>I am suspecting it's the Intel(r) GPA Monitor is consuming some memory here>>>
Can you check the virtual memory usage of GPA Monitor, or rather VTune, with the VMmap tool?
I tried VMmap and launched my game application through it. It is a pretty good tool and shows me every memory allocation made by the software! When I launched the game through the tool, the tool crashed, but when I ran the game first and attached the tool once it had loaded, it worked fine. The problem is that if VMmap crashes as the game loads, and the GPA Monitor also crashes during the load, there is no way to finish the game loading step and THEN attach VMmap. I have had some success with smaller game levels, so I will use those to get to a place where there is no crash. In the meantime the information about memory allocation will come in pretty handy for spotting large, unnecessary commits of memory. In fact I just spotted a 125 MB reservation made by a DLL I no longer use, so time to investigate! If anyone from VMmap is reading, here is the crash log:
Problem signature:
Problem Event Name: APPCRASH
Application Name: vmmap.exe
Application Version: 3.12.0.0
Application Timestamp: 532a6358
Fault Module Name: vmmap.exe
Fault Module Version: 3.12.0.0
Fault Module Timestamp: 532a6358
Exception Code: 40000015
Exception Offset: 0003f36e
OS Version: 6.1.7601.2.1.0.256.1
Locale ID: 2057
Additional Information 1: 6bc8
Additional Information 2: 6bc8b499bf770c4064979b0a658871fa
Additional Information 3: 7409
Additional Information 4: 74094e6bac091eff0133b989e0f30bd2
>>> Exception Code: 40000015>>>
Exception code 0x40000015 is STATUS_FATAL_APP_EXIT ("Status Fatal App Exit") and is usually raised while the application is shutting down.
Exception Offset: 0003f36e is the offset, inside the faulting module (vmmap.exe), of the instruction that caused the crash.
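To make the exception code a little less opaque, here is a minimal sketch that splits an NTSTATUS value into its bit fields. The function name `decode_ntstatus` is made up for illustration and is not part of any Windows API; the field layout (2 severity bits, a customer bit, a 12-bit facility and a 16-bit code) is the documented NTSTATUS layout.

```python
def decode_ntstatus(value: int) -> dict:
    """Split a 32-bit NTSTATUS value into its bit fields (illustrative helper)."""
    severities = ("success", "informational", "warning", "error")
    return {
        "severity": severities[(value >> 30) & 0x3],  # bits 31-30
        "customer": bool((value >> 29) & 0x1),        # bit 29: set for non-Microsoft codes
        "facility": (value >> 16) & 0xFFF,            # bits 27-16
        "code": value & 0xFFFF,                       # bits 15-0
    }


# The code from the crash log above (WER prints it without the 0x prefix).
print(decode_ntstatus(0x40000015))
```

The value from the log decodes to informational severity with code 0x15, which fits an application that terminated itself, as opposed to an error-severity code such as 0xC0000005 (access violation).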
>>>If anyone from VMmap is reading, here is the crash log:>>>
Can you locate the VMmap minidump crash file and send it to me in a private message?
>>>If anyone from VMmap is reading, here is the crash log:>>>
AFAIK Mark Russinovich is one of the VMmap developers, and support is probably provided through the Sysinternals forum.
Lee, which version of GPA do you use?
Can you please update to the most recent version available here: https://software.intel.com/en-us/gpa ?
2014 R2
OK, thanks! Can you please try decreasing the Trace Duration value?
Right-click the icon -> Profiles -> Tracing -> Trace Duration (sec). Set it to 5, for example.
Some good news. Reducing the trace from 5 seconds down to 1 second might have helped, as I was able to run a smaller level and then use the 'Capture Trace' option, and now I have some data in the Platform Analyzer, which is a step forward. The 'Frame Capture' still showed an 'out of memory' error when I pressed CTRL+SHIFT+C, but I dare say I can coax that into working if the level were even smaller, etc. I have already found some LOCK commands I could do without to increase performance, so onwards and upwards!
I must admit that the GPA output looks friendlier than GPUView's. So the reason for the poor performance was the VertexBuffer lock/unlock commands?
Hi Lee!
Glad that you were able to get useful data with Platform Analyzer. We would be grateful if you could share the performance improvement numbers you achieve using Platform Analyzer. Any feedback and suggestions for the tool are highly appreciated. Thanks in advance!
As for the memory issues: is your application linked with /LARGEADDRESSAWARE? This option allows 32-bit applications to use up to 4 GB of virtual memory on 64-bit systems. Both Platform Analyzer and Frame Analyzer need additional virtual memory in the application's process.
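For anyone who wants to verify whether an existing 32-bit executable was actually linked this way, here is a minimal sketch that reads the flag straight from the PE header. The helper name `is_large_address_aware` and the file name in the usage note are made up for illustration; the header offsets and the 0x0020 bit (IMAGE_FILE_LARGE_ADDRESS_AWARE) come from the PE/COFF format. The Visual Studio tools can report the same thing (`dumpbin /headers`) and set the flag on an existing binary (`editbin /LARGEADDRESSAWARE`).

```python
import struct

# Bit in the COFF header's Characteristics field that marks the image
# as able to handle addresses above 2 GB.
IMAGE_FILE_LARGE_ADDRESS_AWARE = 0x0020


def is_large_address_aware(data: bytes) -> bool:
    """Return True if the PE image in `data` has the large-address-aware bit set."""
    if data[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    # e_lfanew (offset 0x3C in the DOS header) is the file offset of the PE signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("PE signature not found")
    # The 20-byte COFF header follows the 4-byte signature;
    # Characteristics is its last field, at offset 18.
    (characteristics,) = struct.unpack_from("<H", data, pe_offset + 4 + 18)
    return bool(characteristics & IMAGE_FILE_LARGE_ADDRESS_AWARE)
```

Usage would be along the lines of `is_large_address_aware(open("game.exe", "rb").read())`, where `game.exe` stands in for your own binary.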
Awesome advice! I used the flag and I am now able to exceed the 1.8GB system memory cap that caused crashes in my engine. I have not tried the Analysers yet but I have a feel good factor about it now and I wanted to report right away that the LARGEADDRESSAWARE worked a treat ;)