
Runtime models based on L1-TLB metrics

HodB
Employee

We are building a model to predict CPU runtime on a Haswell Intel Xeon E5-x600 v3 processor. We collect counters with the `perf` tool from the `linux-tools` package while running various benchmarks, and the model uses TLB events, including dTLB and sTLB events, as predictors. However, the results were unexpected: a single-feature model over the same benchmark (and several others) showed a trend opposite to what we anticipated:

HodB_0-1672233444836.png

 

We expected that a higher number of TLB hits would lead to a decrease in CPU cycles.
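
To make "single-feature model" concrete, here is a minimal sketch of the kind of fit we mean: an ordinary least-squares line with CPU cycles as the response and one hit count as the predictor. The function name `fit_line` is ours, and the numbers are synthetic placeholders chosen only to show the trend we expected (a negative slope); they are not our measurements.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of ys ~ slope * xs + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of x-y co-deviations.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Synthetic placeholder values (NOT measured data): one point per benchmark run.
hits = [1.0e6, 2.0e6, 3.0e6, 4.0e6]
cycles = [9.0e8, 8.0e8, 7.0e8, 6.0e8]

slope, intercept = fit_line(hits, cycles)
# A negative slope is the trend we expected; our real data showed a positive one.
print("slope:", slope, "intercept:", intercept)
```

The sign of `slope` is the quantity of interest here: in our actual runs it came out opposite to the expected sign.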

Upon further investigation, we realized that we had not accounted for the out-of-order execution (OOOE) of the Haswell processor, under which some counters report only retired uops while others also include speculative events. Because our goal is to compare "apples to apples," we want to compare either retired events with retired events or speculative events with speculative events. We are therefore looking to include speculative dTLB accesses and retired sTLB hits in our model as proper predictors.

Speculative dTLB accesses: We tried the `L1-dcache-loads` and `L1-dcache-stores` counters as potential predictors for speculative dTLB accesses, but we later learned that these counters refer to retired events rather than speculative ones.

Retired sTLB hits: We have been unable to identify a suitable counter for retired sTLB hits among the events listed by `perf list`.

Perhaps there is another way to calculate these features indirectly from other counters that we have not considered.
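
As an illustration of what we mean by an indirect calculation, the sketch below derives a hit count as accesses minus misses from `perf stat -x,` CSV output. It assumes the field layout "value, unit, event name, ..." (which may differ between `perf` versions), and the event names in the sample are used only as placeholders. The helper names `parse_perf_csv` and `derived_hits` are ours, and the sample text is made up. The subtraction is only meaningful if both counters are of the same kind (both retired or both speculative), which is exactly what we are unsure about.

```python
import csv
import io


def parse_perf_csv(text):
    """Map event name -> count from `perf stat -x,` style CSV text.

    Assumes column 0 is the count and column 2 is the event name.
    """
    counts = {}
    for row in csv.reader(io.StringIO(text)):
        if len(row) < 3 or row[0].startswith("#"):
            continue  # skip blank, short, and comment lines
        try:
            counts[row[2]] = int(row[0])
        except ValueError:
            continue  # skip entries such as "<not counted>"
    return counts


def derived_hits(counts, access_event, miss_event):
    """Hits = accesses - misses; valid only if both events count the same kind of uops."""
    return counts[access_event] - counts[miss_event]


# Made-up sample output (NOT measured data); event names are placeholders.
sample = """\
1000000,,dTLB-loads,500000000,100.00,,
25000,,dTLB-load-misses,500000000,100.00,,
<not counted>,,some-other-event,0,0.00,,
"""

counts = parse_perf_csv(sample)
hits = derived_hits(counts, "dTLB-loads", "dTLB-load-misses")
print("derived hits:", hits)
```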

 
