
Which counter should I use to measure L1I-cache misses on Skylake/SkylakeX platform?

Denis_B_Intel
Employee

I found that perf uses the ICACHE_64B.IFTAG_MISS counter for the pre-defined L1-icache-load-misses event on Skylake. See this thread: https://www.spinics.net/lists/linux-perf-users/msg06381.html
The problem is that when I run the same workload on different platforms, say Skylake and Haswell, I get results that differ by an order of magnitude:
Skylake:
$ perf stat -e L1-icache-load-misses ./a.out
           3291090      L1-icache-load-misses         # measured based on ICACHE_64B.IFTAG_MISS
Haswell:
$ perf stat -e L1-icache-load-misses ./a.out
            521119      L1-icache-load-misses         # measured based on ICACHE.MISSES

This doesn't look like an improvement between the architectures. On Skylake we also have FRONTEND_RETIRED.L1I_MISS, which supports PEBS and gives a number closer to the Haswell one (in fact it's lower: 341626).
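
For scale, the ratios between the counts quoted above can be worked out with a small shell helper. This is only a sketch: `ratio` is a hypothetical helper name, and the inputs are simply the three numbers from this post.

```shell
# Hypothetical helper: print a / b rounded to one decimal place.
ratio() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", a / b }'
}

ratio 3291090 521119   # Skylake ICACHE_64B.IFTAG_MISS vs. Haswell ICACHE.MISSES
ratio 3291090 341626   # Skylake ICACHE_64B.IFTAG_MISS vs. FRONTEND_RETIRED.L1I_MISS
```

The two ratios come out at about 6.3 and 9.6.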

Any comments?

 

Dmitry_R_Intel1
Employee

The documentation for the FRONTEND_RETIRED.L1I_MISS event explicitly says that it counts retired instructions. The ICACHE_64B.IFTAG_MISS documentation doesn't specify this, so it probably counts all fetches, including those for instructions that never retire. This could be one of the reasons for the difference you saw.

In VTune we use the FRONTEND_RETIRED.L1I_MISS event for SKL.
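
To compare the events side by side on a Skylake machine, something like the following could be used. It is a sketch under assumptions: the lowercase names are the symbolic event names as `perf list` is expected to show them on Skylake, so check `perf list` on your system first. `count_l1i`, `L1I_EVENTS`, and the `PERF` override are hypothetical names introduced here for illustration.

```shell
# Events to count together; names assumed to match `perf list` on Skylake.
L1I_EVENTS="L1-icache-load-misses,icache_64b.iftag_miss,frontend_retired.l1i_miss"

# Hypothetical wrapper: run a command under perf stat with all three events.
# PERF can be overridden to point at a different perf binary.
count_l1i() {
    ${PERF:-perf} stat -e "$L1I_EVENTS" -- "$@"
}

# Usage: count_l1i ./a.out
```

Counting all three in one run keeps the workload identical across events, so any gap between them reflects what each event counts and not run-to-run variation.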

YeeHaaw
Beginner
I think the difference is in the underlying event. For Skylake, it says L1-icache-load-misses # measured based on ICACHE_64B.IFTAG_MISS
But for Haswell, it says L1-icache-load-misses # measured based on ICACHE.MISSES
So maybe the Haswell count means that none of the caches (L1 i-cache, L2 cache, L3 cache) can serve the fetch? That is what ICACHE.MISSES should mean.