Hi, In the function getCyclesLostDueL3CacheMisses(), defined in cpucounters.h of the PCM package, the return value is computed as 180. * L3_cycles/total_cycles. Can someone please explain why the factor is 180 and not 100? Thanks, Pradeep.
The 180 was an assumed average memory access latency for a 2-socket system. Because this is only a rough estimation method, we have deprecated this function and the L3CLK/L2CLK metrics in the upcoming PCM version.
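To make the role of the constant concrete, here is a minimal sketch of this kind of estimate. It assumes the numerator is a count of L3 misses over the measured interval and that the 180 is a per-miss latency in core cycles. The function name and parameters are hypothetical and are not part of the PCM API.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, NOT the PCM implementation.
// Estimates the fraction of core cycles spent waiting on L3 misses as
//   (L3 misses * assumed cycles per miss) / total core cycles.
// The default of 180 cycles is the assumed average memory latency
// mentioned in this thread; it is not a measured value.
// Returns -1.0 when no cycles elapsed, since the ratio is undefined.
double estimateFractionLostToL3Misses(std::uint64_t l3_misses,
                                      std::uint64_t total_cycles,
                                      double assumed_latency_cycles = 180.0)
{
    assert(assumed_latency_cycles >= 0.0);
    if (total_cycles == 0) {
        return -1.0;
    }
    return assumed_latency_cycles * static_cast<double>(l3_misses)
           / static_cast<double>(total_cycles);
}
```

For example, 1,000 misses over 1,000,000 cycles gives roughly 0.18, i.e. about 18% of cycles. Since a single fixed latency ignores overlapping misses and differences between local and remote memory, the result can be far off and can even exceed 1.0, which fits the description of the method as only a rough estimate.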
Thanks for the clarification. If you're removing the L3CLK/L2CLK metrics, is there an alternative way to estimate how much time the cores spend waiting on L3 misses (which would mostly be waiting on DDR, assuming good threading)?
I would recommend the top-down method implemented in the "General Exploration" analysis of Intel® VTune™ Amplifier XE. It is a much more robust method for analyzing CPU stalls (including L3 cache miss stalls).