Hi, I've been trying to use VTune as a tool to measure performance of the software we develop in a computer-independent way.
The idea was to see how many instructions/clockticks our software was using during specific operations, and to be able to compare the results on different software versions without having to set up an identical computer. My idea was that the number of instructions used should be the same on all computers, regardless of CPU speed, and things like RAM and disk speed shouldn't significantly affect the number of instructions the software needed to perform an operation. I may be wrong on this assumption, though...
However, the results differ significantly between the two computers I have tried it on. Computer 1: Celeron 1.7 GHz, 512 MB RAM - average 784 million Instructions Retired events. Computer 2: P4 3 GHz, 1 GB RAM - average 465 million Instructions Retired events.
My suspicion is that the sampling rate is the cause of the different results. I guess more instructions are executed between samples on the faster computer. Does anyone have any idea how to work around this so that I can make a computer-independent performance measurement? Or is VTune perhaps not the right tool for the job?
Did you check that calibration is disabled on both computers, and that the sample-after counts are the same? Can you compile your code with the same options and get reasonable efficiency on both machines?
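To illustrate why the sample-after value matters: with event-based sampling, the reported event total is an estimate, roughly the number of samples collected multiplied by the sample-after value. If calibration picks a different sample-after value on each machine, the estimates are built from different granularities and are harder to compare. Here is a minimal sketch of that arithmetic in Python. The function names and all the numbers are made up for illustration; they are not taken from VTune or from the measurements above.

```python
def estimated_events(samples: int, sample_after: int) -> int:
    """Estimate the event total: each sample stands for `sample_after` events."""
    return samples * sample_after


def max_quantization_error(sample_after: int) -> int:
    """Events that can go uncounted at the end of a run (less than one interval)."""
    return sample_after - 1


def relative_difference(a: int, b: int) -> float:
    """Relative gap between two estimates, as a fraction of the larger one."""
    return abs(a - b) / max(a, b)


if __name__ == "__main__":
    # Hypothetical values: the same sample-after setting on both machines.
    sample_after = 1_000_000
    machine_a = estimated_events(samples=500, sample_after=sample_after)
    machine_b = estimated_events(samples=498, sample_after=sample_after)
    print(machine_a, machine_b, relative_difference(machine_a, machine_b))
```

With an identical sample-after value, the two estimates can differ by the sampling granularity plus whatever really differs between the runs, so a small relative difference is the expected outcome.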
Thanks! That seems to have solved the problem. After disabling calibration and setting the same "sample after" value, I get practically the same result on widely different computers (1.7 GHz Celeron vs. 2.3 GHz Core Duo).
I'm just evaluating VTune so far, but now it looks like it can do what we want. Thanks again!