1 rendering (dummy VS, complex GS [using PrimitiveID as input] reading the previous quad, simple PS rendering the generated triangles)
I was surprised that each time I benchmarked the code, GPA reported noticeably different timings.
So I recorded a frame and simply loaded it many times in GPA (always the same capture file).
It seems that GPA gives 3 sets of values:
GPU : 388 ( PS = 377 )
GPU : 345 ( PS = 335 )
GPU : 295 ( PS = 287 )
GPU : 12122 ( GS = 7960 / PS = 4080 )
GPU : 10777 ( GS = 7070 / PS = 3626 )
GPU : 9195 ( GS = 6034 / PS = 3094 )
Across 10 reloads of the same capture file, the results always land in one of these three cases (the decimals differ, but overall the values always fall into one of these 3 ranges).
I understand the small variations; they are logical (and not a problem anyway),
but the big variations may be up to 30%, which is quite annoying (when you're trying to benchmark, it's a bit worrying to have to test the same thing several times to work out an average value).
Moreover, I use an FPS counter in my application (very basic, giving the time spent on each frame). It shows some small variations, but far less than 5% (I never see a static scene taking 30 ms to render, then 40 ms the next frame).
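For comparison with the GPA numbers, the kind of per-frame check described here can be sketched as follows. This is only an illustration in Python, not the counter used in the application: `measure_frame_times`, `relative_spread`, and the `render_frame` callback are hypothetical names, and the timing is CPU-side wall-clock time around each frame.

```python
import statistics
import time


def measure_frame_times(render_frame, frame_count=100):
    """Call render_frame() frame_count times; return each duration in ms."""
    durations_ms = []
    for _ in range(frame_count):
        start = time.perf_counter()
        render_frame()  # hypothetical callback that draws one frame
        durations_ms.append((time.perf_counter() - start) * 1000.0)
    return durations_ms


def relative_spread(durations_ms):
    """Return (max - min) as a percentage of the mean frame time."""
    mean = statistics.mean(durations_ms)
    return (max(durations_ms) - min(durations_ms)) / mean * 100.0
```

With this measure, a static scene that jumped from 30 ms to 40 ms between frames would show a spread well above the 5% mentioned above, while frames between 30 ms and 31 ms would stay under it.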
My test computer (at home) is a Sandy Bridge (i3-2330M) laptop.
Maybe it's some kind of power/frequency control issue?
[In that case, would it be possible to ensure that the GPU runs at full speed while the GPA analysis is running?]
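As a quick check on the numbers quoted above, the spread and the step-to-step ratios of the GPU totals can be computed with a few lines of Python. This is only a sketch: the names `first_group` and `second_group` are made up here, and treating the three readings in each group as "high / middle / low" cases that correspond to each other is an assumption.

```python
# GPU totals copied from the six readings quoted above.
# Assumption: the three values in each group are the same three cases,
# listed from slowest to fastest.
first_group = [388, 345, 295]
second_group = [12122, 10777, 9195]


def spread_percent(values):
    """Return the max-min gap as a percentage of the min and of the max."""
    lo, hi = min(values), max(values)
    return (hi - lo) / lo * 100.0, (hi - lo) / hi * 100.0


def step_ratios(values):
    """Return the ratio between each value and the next one in the list."""
    return [a / b for a, b in zip(values, values[1:])]


for name, values in (("first", first_group), ("second", second_group)):
    over_min, under_max = spread_percent(values)
    ratios = ", ".join(f"{r:.2f}" for r in step_ratios(values))
    print(f"{name}: {over_min:.1f}% of min, {under_max:.1f}% of max, steps {ratios}")
```

Measured against the fastest reading, the gap is a little over 30% in both groups, and both groups step down by roughly the same factors (about 1.12, then about 1.17). That both groups scale together would be consistent with something global such as a clock change, but it does not prove it.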
Thanks for pointing this out... let me see what I can do to help with this.
First of all, what version of Intel GPA are you using? Please be sure that you are using the R4 release (it came out about a week ago), as we have made some changes in the measurement code (I'm not sure that your specific test case would see any changes, but let's be sure).
Secondly, Intel GPA already makes multiple passes when it calculates the metrics values (this is why you'll see the "variance", so be sure to set "show metrics values range"). So hopefully this already helps and, as you mentioned, a large variance seems unusual.
Thirdly, let me check on the GPU frequency variance due to CPU/GPU power tradeoffs -- we've had some discussion on this over time, and I want to be sure I have the latest information for you.
I've done some experiments myself with some ergs on a sample capture file, and I can't duplicate the variation that you're seeing. Are the other metrics showing similar variation? For example, if any of the memory metrics are showing significant variation, this might explain some of the difference (that is, it depends upon how you're hitting the cache, etc.).
Also, can you provide some information on how you got your numbers? That is, did you just select/deselect the ergs within the same run of Intel GPA, or did you completely quit out of the Intel GPA application and rerun it?
If you can copy the "About..." configuration info here and the frame capture file I will have the development team take a look at it.
ps-> If you want, I can send you a private email to use for getting me a copy of the file, or we can try the Intel ftp server (where only Intel personnel can read the file).