I am trying to understand the behaviour of programs by running hardware counter analysis. I ran the program with datasets of several sizes and a fixed set of counters to be measured. For most runs, VTune measures most of the counters and dumps them in the log. To my surprise, for a few runs it measures only a few counters from the same set. The runtime of the application also drops considerably in those runs. Is this behaviour normal, or is it some kind of bug?
I have attached the VTune dumps for two dataset sizes, i.e. 250000000 and 249000000 integer (4-byte) entries. Between the two there is a sudden drop in both the runtime and the number of counters measured.
Please help!
There are several possible reasons why you see only a few counts for specific events:
1. The events never occurred while the program was running. There can be other causes for this as well; for example, if the execution paths differ between sessions, the results will differ too.
2. Duration (data set): the event happened, but not frequently, so there are only a limited number of samples. There are two solutions: run the program for longer, or lower the SAV (Sample After Value) so that more samples are collected.
In your case, increase the running time from 0.5 s to 1.7 s to get more counts for BR_INST_RETIRED.ALL_BRANCHES_PS.
3. It might be a VTune bug. However, we would need a test case to confirm that the event occurred but VTune did not detect it.
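To illustrate point 2: with event-based sampling, a sample is recorded each time the counter reaches the SAV, so the number of samples is roughly the event count divided by the SAV. Here is a minimal Python sketch of that arithmetic. The event rate and both SAV values are made-up numbers for illustration, not values taken from your logs, and the helper names are hypothetical.

```python
def expected_samples(event_count: int, sav: int) -> int:
    """Approximate number of samples: one sample per SAV events."""
    if sav <= 0:
        raise ValueError("SAV must be positive")
    return event_count // sav


def estimated_event_count(samples: int, sav: int) -> int:
    """Event count reconstructed from samples (samples * SAV)."""
    return samples * sav


if __name__ == "__main__":
    # Hypothetical event rate; a real rate depends on the workload.
    events_per_second = 1_500_000

    # Hypothetical SAV values: a large one and a smaller one.
    for sav in (2_000_003, 100_003):
        # Runtimes in milliseconds (0.5 s and 1.7 s) to keep integer math.
        for runtime_ms in (500, 1700):
            events = events_per_second * runtime_ms // 1000
            samples = expected_samples(events, sav)
            print(
                f"SAV={sav} runtime={runtime_ms} ms "
                f"events={events} samples={samples} "
                f"reported~{estimated_event_count(samples, sav)}"
            )
```

With an infrequent event, a short run, and a large SAV, the counter may never reach the SAV, so no sample is taken and the event looks as if it never occurred. Running for longer or lowering the SAV both raise the sample count.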