Hi,
The documentation on this is a little difficult to understand. The Intel 64 and IA-32 Architectures Software Developer's Manual (Vol. 3B) lists a number of performance-monitoring events that I can use to monitor the L3 cache. Two of them look interesting, but I wanted to make sure I understand what they count. (These are from section 19.2.)
Event B0H, Umask 10H: OFFCORE_REQUESTS.L3_MISS_DEMAND_DATA_RD. Demand data read requests that missed L3.
Event 2EH, Umask 41H: LONGEST_LAT_CACHE.MISS. This event counts each cache miss condition for references to the L3 cache.
Is the difference between these two that the first counts only offcore requests (which I guess means requests from cores other than the one being polled) and the second gives a cumulative count? Or is there something that I'm missing?
Thanks!
- Tags:
- Parallel Computing
I have not tested these events on Skylake, but if the definitions are similar to earlier processors, the LONGEST_LAT_CACHE.MISS event will count demand loads that miss the LLC and demand stores that miss the LLC. The OFFCORE_REQUESTS.L3_MISS_DEMAND_DATA_RD will only count demand loads that miss the LLC. Neither event will count L2 hardware prefetches that miss the LLC, so neither event is useful for determining the actual data traffic. They are intended to help identify accesses that are *not* prefetched, since these are more likely to cause stalls.
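If you want to collect both events side by side and compare them, here is a minimal sketch of how the event number and umask quoted in the question could be packed into raw event strings. It assumes Linux `perf` and its `rUUEE` raw-event syntax (umask in bits 8-15, event select in bits 0-7), and it assumes the encodings from the question apply to your processor. The helper names `raw_perf_event` and `perf_event_list` are made up for illustration.

```python
# Sketch only: builds raw event strings for Linux perf from the
# (event select, umask) pairs quoted in the question. Check the encodings
# against the manual for your own CPU model before relying on them.

# (event select, umask) as listed in the question.
EVENTS = {
    "OFFCORE_REQUESTS.L3_MISS_DEMAND_DATA_RD": (0xB0, 0x10),
    "LONGEST_LAT_CACHE.MISS": (0x2E, 0x41),
}


def raw_perf_event(event_select: int, umask: int) -> str:
    """Pack umask (bits 8-15) and event select (bits 0-7) as 'rUUEE'."""
    if not (0 <= event_select <= 0xFF and 0 <= umask <= 0xFF):
        raise ValueError("event select and umask must each fit in 8 bits")
    config = (umask << 8) | event_select
    return f"r{config:04x}"


def perf_event_list(events=EVENTS) -> str:
    """Join all events into a comma-separated list for `perf stat -e`."""
    return ",".join(raw_perf_event(ev, um) for ev, um in events.values())


if __name__ == "__main__":
    for name, (ev, um) in EVENTS.items():
        print(f"{name}: {raw_perf_event(ev, um)}")
    # Hypothetical invocation; ./my_app stands in for your own program.
    print(f"perf stat -e {perf_event_list()} ./my_app")
```

Counting both events in the same run lets you compare them directly. Keep in mind the caveat above: neither count includes L2 hardware prefetches that miss the LLC.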
Oh alright. Thank you so much for the clarification!
Adil