GPA 4.2: Issues using GPA with Hybrid configuration (HD Graphics + NVidia 540)
Hi. I'm trying to use GPA on my laptop (an Asus N73S). Using the NVidia Optimus control panel, I force the use of the Intel GPU for both my DX10 code and GPA (just to be sure). It runs, but I only get basic metrics in GPA (only GPU duration; VS, GS, and PS durations are all 0), not the detailed ones. I have the latest drivers and the latest GPA version.
By the way (I'm using GPA at work all day long; it's a very useful tool), would it be possible to:
- export RenderTargets as DDS files (I often use RGBA16 and RGBA32 RenderTargets, which cannot be saved in non-DDS formats)
- display the pixel values of the RT in a format relevant to the RT format (floats for an RGBA32F target, for example)
- display the contents of the pre-transform and post-transform vertex buffers (like in PIX)
First of all, let's check on the issue with not getting the metrics you expect. Can you right-click on the GPA Monitor icon in the notification area and copy the "About..." info here? This will show me the graphics device(s) on your system and the rest of your configuration, which will help figure out what to do next.
Hi. Here is the output of the GPA "About..." dialog (thanks for your help):
Windows 7, 64-bit
DEP enabled
Num Processors: 8
Memory: 8102MB
System BIOS: _ASUS_ - 6222004 (04/14/11)
Video BIOS: Hardware Version 0.0 (12/10/20)
Driver 0:
  Device: Intel HD Graphics Family
  Provider: Intel Corporation
  Date: 8-31-2011
  Version: 184.108.40.2069
  VendorId: 8086
  ProductId: 116 (Intel HD 3000 Graphics)
  Stepping: 9
  Supports GPA Instrumentation
GPA install directory: C:\Program Files\Intel\GPA v4.2\
GPA version: 4.2.156824
Current user is in Administrators group: YES
In some cases I have seen issues with hybrid systems, but usually they can be fixed by disabling the discrete graphics card in the BIOS. In other words, I'm not sure that using the NVidia Optimus control panel is sufficient.
Therefore, can you check whether the BIOS has a setting to disable the NVidia graphics, and let me know what happens? Another option is to disable the NVidia card from the Device Manager control panel.
Also, please include a screen snapshot of the metrics available on your system -- open the GPA Monitor icon in the notification tray, then select "Profiles..." and the "HUD Metrics" tab (scroll to see if there's a heading labelled "Execution Units").
I've attached a screen shot from another Sandy Bridge system showing the available metrics (I've scrolled to the bottom half of the list):
Hi (sorry for the delay, I spent the week traveling to Montreal). Thank you, it works after disabling the NVidia GPU in the Device Manager :)
Do you know if there is a "wish-box" somewhere where I can request some (basic) features to improve your tools?
- export RenderTargets as DDS files (I often use RGBA16 and RGBA32 RenderTargets, which cannot be saved in non-DDS formats)
- display the pixel values of the RT in a format relevant to the RT format (floats for an RGBA32F target, for example)
- display the contents of the pre-transform and post-transform vertex buffers (like in PIX)
- get an idea of how the shader code is compiled, to understand why it sometimes gets very slow (the number of GPRs used, for example -- I don't know if that's relevant on this architecture), or a way to submit a shader to a specialist to find out why it's so slow. During all my flights I merged some shaders to increase the ALU/TEX ratio, and the merged version of 2 shaders, each of which ran in about 1ms, now takes more than 20ms: same fetches, but twice the code.
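To make the second request concrete, here is a minimal sketch (in Python, purely for illustration; GPA itself is not scriptable this way) of what "display pixel values in a format relevant to the RT format" means for an RGBA32F target: each pixel is 16 raw bytes that should be reinterpreted as four little-endian 32-bit floats rather than shown as 8-bit channel values.

```python
import struct

def decode_rgba32f_pixel(raw: bytes) -> tuple:
    """Reinterpret the 16 raw bytes of one RGBA32F pixel as four
    little-endian 32-bit floats (R, G, B, A)."""
    if len(raw) != 16:
        raise ValueError("an RGBA32F pixel is exactly 16 bytes")
    return struct.unpack("<4f", raw)

# Example: a pixel holding (1.5, -2.0, 0.25, 1.0) packed as raw bytes.
raw = struct.pack("<4f", 1.5, -2.0, 0.25, 1.0)
print(decode_rgba32f_pixel(raw))  # (1.5, -2.0, 0.25, 1.0)
```

The same idea applies to RGBA16F targets, just with half-precision floats and 8 bytes per pixel.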
I'm glad this appears to resolve your issue -- please let me know if you have any further questions or issues using the tools.
As to your suggestions, I'm going to add them to our list of customer-requested features. By the way, you aren't the first person to request some of these features, so this feedback is very valuable in helping us prioritize what to add next.
Hi. Yes it works fine now, big thanks for your help :)
Another thing that would be useful (at least for me): I'm trying to port this whole game (www.youtube.com/watch?v=gJeUz5N3RpQ) to the GPU -- not only the rendering, but also all the simulation code. I made several versions of the simulation code, computing on a 512x512 grid using PixelShader code. It seems that on your hardware (Sandy Bridge HD Graphics) there is a huge performance penalty when using more than 32 GPRs. I have 4 versions of the code: one grouping computations by blocks of 2x2 pixels, one by blocks of 4x4 pixels, and so on. They all do the same thing with more or less grouping, the grouping resulting in fewer texture fetches, which is very useful on some hardware. But the versions using more than 32 GPRs are very much slower than the others (this is not the case on other hardware, such as a Radeon 5x00 GPU, for example). So... would it be possible to display a counter (if it's available from the hardware) of the number of running/runnable "threads", to make "not-hardware-friendly" shader designs visible?
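The register-count cliff described above is consistent with an occupancy effect: the more registers each shader thread uses, the fewer threads the hardware can keep resident, and fewer resident threads means less ability to hide texture-fetch latency. The sketch below illustrates the arithmetic only; the register file size used here is a made-up number for illustration, not an actual HD Graphics figure.

```python
def threads_in_flight(register_file_size: int, regs_per_thread: int) -> int:
    """How many shader threads fit in the register file at once,
    assuming occupancy is limited only by registers (a simplification;
    real hardware has other limits too)."""
    return register_file_size // regs_per_thread

# Hypothetical register file of 128 registers per core (an assumption
# chosen only to show the cliff at a power-of-two register budget):
REGISTER_FILE = 128

for regs in (16, 32, 33, 64):
    print(f"{regs} GPRs -> {threads_in_flight(REGISTER_FILE, regs)} threads in flight")
```

With these made-up numbers, going from 32 to 33 GPRs drops resident threads from 4 to 3; on real hardware the slowdown can be much worse than the occupancy ratio suggests, because the remaining threads can no longer cover texture latency.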