Profiling memory accesses - which counters overlap with which?
I have some Perl scripts that implement some of the top-level VTune profiling metrics for Sandy Bridge (and Sandy Bridge Xeon). They report:
* % cycles spent on LLC misses: ~ 53%
* % cycles spent doing DTLB walks: ~ 10%
* % cycles spent accessing data modified by another core: ~ 3.5%
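
For context, here is a minimal sketch of how percentages like these can be derived from raw counter values. It is not the Perl scripts themselves. It assumes that event-count metrics are scaled by a fixed per-event penalty (the 180- and 60-cycle constants are illustrative assumptions), that WALK_DURATION is already a cycle count, and that CPU_CLK_UNHALTED.THREAD is the denominator. The counter values are made up.

```python
# Assumed per-event penalties in cycles (illustrative, not taken from the scripts).
ASSUMED_LLC_MISS_PENALTY = 180
ASSUMED_XSNP_HITM_PENALTY = 60


def pct_of_cycles(cost_cycles, total_cycles):
    """Return cost_cycles as a percentage of total_cycles."""
    if total_cycles <= 0:
        raise ValueError("total_cycles must be positive")
    return 100.0 * cost_cycles / total_cycles


def summarize(counters):
    """Compute the three top-level percentages from a dict of raw counts."""
    # Assumption: unhalted thread cycles are used as the denominator.
    total = counters["CPU_CLK_UNHALTED.THREAD"]
    return {
        # Event count times an assumed penalty gives estimated cycles.
        "llc_miss_pct": pct_of_cycles(
            counters["MEM_LOAD_UOPS_MISC_RETIRED.LLC_MISS"]
            * ASSUMED_LLC_MISS_PENALTY,
            total,
        ),
        # WALK_DURATION is treated as cycles directly, so no scaling.
        "dtlb_walk_pct": pct_of_cycles(
            counters["DTLB_LOAD_MISSES.WALK_DURATION"], total
        ),
        "xsnp_hitm_pct": pct_of_cycles(
            counters["MEM_LOAD_UOPS_LLC_HIT_RETIRED.XSNP_HITM"]
            * ASSUMED_XSNP_HITM_PENALTY,
            total,
        ),
    }


# Hypothetical counter values, for illustration only.
sample = {
    "CPU_CLK_UNHALTED.THREAD": 1_000_000,
    "MEM_LOAD_UOPS_MISC_RETIRED.LLC_MISS": 2_000,
    "DTLB_LOAD_MISSES.WALK_DURATION": 100_000,
    "MEM_LOAD_UOPS_LLC_HIT_RETIRED.XSNP_HITM": 500,
}

result = summarize(sample)
for name, value in result.items():
    print(f"{name}: {value:.1f}%")
```

Each percentage is computed independently against the same cycle total, so nothing in this arithmetic prevents the same stalled cycle from being attributed to more than one metric.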
My question: do the cycles spent on DTLB walks (DTLB_LOAD_MISSES.WALK_DURATION) and the cycles spent hitting data modified by another core (MEM_LOAD_UOPS_LLC_HIT_RETIRED.XSNP_HITM) overlap in any way with the LLC misses (i.e. memory accesses, MEM_LOAD_UOPS_MISC_RETIRED.LLC_MISS)? Do they overlap partially, or not at all?
I'm especially interested in whether the DTLB walks count towards MEM_LOAD_UOPS_MISC_RETIRED.LLC_MISS.