L2 cache Misses

Rishi_Kapoor
Beginner
Hi,
I am a new user of Intel VTune. What is the difference between these two VTune metrics?
"2nd Level Cache Read Misses samples" and "2nd Level Cache Load Misses Retired"
And which one indicates L2 cache misses?

Thanks
Vladimir_T_Intel
Moderator
Could you specify which microarchitecture you are using, and the exact names of the events you are considering?
Rishi_Kapoor
Beginner
I am running VTune on a dual-core Intel Xeon Nocona (x86_64) with SIMD support and Hyper-Threading enabled.
It has a unified L2 cache. I am using the Linux version of VTune (9.1) to observe a single-threaded process.

The events I am considering are:
2nd Level Cache Read Misses
2nd Level Cache Load Misses Retired
Rishi_Kapoor
Beginner
My main goal with these events is to measure data misses, not instruction misses.
I think 2nd Level Cache Read Misses will include instruction misses along with data misses.

Vladimir_T_Intel
Moderator

OK, this is the P4 microarchitecture. Both events indicate L2 cache misses, but each counts a different portion of them.

2nd-Level Cache Load Misses Retired is measured inside the cache unit and counts the retired instructions whose attempt to load data from the 2nd-level cache missed. This is not a complete count, as there may be other causes of L2 cache misses.

2nd-Level Cache Read Misses is measured on the bus and counts memory load misses and read-for-ownership misses. It looks like it counts instruction misses as well.

If the L2 is the last-level cache in the system (which is the case for Xeon Nocona), the L2 cache miss penalty can be roughly estimated as L2 Cache Read Misses * 150 clocks.
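
As a quick illustration of that rule of thumb, here is a minimal Python sketch. The function name and the sample numbers are hypothetical, not VTune output, and the 150-clock figure is only the rough estimate quoted above. The inputs are assumed to be event counts over the run (not raw sample counts), taken from the "2nd Level Cache Read Misses" and clocktick events.

```python
# Rough L2-miss penalty estimate: misses * assumed penalty per miss.
# The 150-clock penalty is the rough figure from this thread, not a measured value.
ASSUMED_MISS_PENALTY_CLOCKS = 150


def estimate_l2_miss_penalty(l2_read_misses, total_clockticks,
                             penalty_clocks=ASSUMED_MISS_PENALTY_CLOCKS):
    """Return (estimated stall clocks, fraction of all clockticks)."""
    if l2_read_misses < 0:
        raise ValueError("l2_read_misses must be non-negative")
    if total_clockticks <= 0:
        raise ValueError("total_clockticks must be positive")
    stall_clocks = l2_read_misses * penalty_clocks
    return stall_clocks, stall_clocks / total_clockticks


# Hypothetical event totals for one profiling run.
stall, fraction = estimate_l2_miss_penalty(
    l2_read_misses=2_000_000,
    total_clockticks=3_000_000_000,
)
print(f"Estimated L2 miss penalty: {stall} clocks ({fraction:.1%} of run time)")
```

Since 2nd Level Cache Read Misses appears to include instruction misses, the result is better read as an upper bound on the data-miss penalty than as an exact figure.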

