Hello guys!
I'm using OProfile and perf to profile some benchmarks, looking specifically for caching issues. I'm using the Intel SDM Volume 3 (March 2013) as my guide for choosing which events to monitor, but it has been a pain.
The machine I'm running the experiments on has an i7 3630QM (that is, Ivy Bridge), so in the manual I'm looking at tables 19-1 and 19-5. The problem is: which events should I use to measure L1D and L1I cache events? What about L3 (LLC)? Honestly, the event descriptions in table 19-5 are vaguer than usual.
Can anyone help on this?
César.
To answer this kind of question, I usually download and install VTune and see which metrics and events it uses.
In fact, that is what I would have to do to answer this one. Keeping track of events from chip to chip is one of the 'pains in the rear' that the VTune folks have to deal with. VTune may or may not have the metric you want, but it is the first place to check.
Pat
Hi Patrick, thanks for your answer.
I did what you suggested. Some of VTune's analyses don't work on my machine (e.g. Sandy/Ivy Bridge -> Memory Access ---> "Error: ... not aplicable to current machine microarchitecture"), but I collected some event names, and I guess one of them (or a combination) does what I want: measuring L1, L2 and L3 cache hits/misses on Ivy Bridge. I still have some questions:
1) Can I use *only* the two events below to account for all stalls caused by L1D/L2?
CYCLE_ACTIVITY.STALLS_L1D_PENDING
CYCLE_ACTIVITY.STALLS_L2_PENDING
2) I could not understand the description of the following event. What is an unknown data source?
"MEM_LOAD_UOPS_RETIRED.LLC_MISS_PS --> Miss in last-level (L3) cache. Excludes Unknown data-source."
Thank you again for your help.
I found a pretty good explanation of how to measure L1, L2 and L3 "misses" on Ivy Bridge, and why to do it this way: subsection B.3.2.3, Memory Bound Characterization, of the Optimization Reference Manual (July 2013 version).
However, I have some questions about the equations shown in this subsection. They give the percentage of *CYCLES* lost to "misses" at the several levels of the cache hierarchy, right? Shouldn't these equations use CYCLE_ACTIVITY.CYCLES_LDM_PENDING instead of CYCLE_ACTIVITY.STALLS_LDM_PENDING?
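For reference, here is a minimal Python sketch of how I understand those equations to combine the counters. The formulas are written from my reading of that subsection and should be checked against the manual. Treat the weight of 7 on LLC misses as an assumption, along with the exact event names CPU_CLK_UNHALTED.THREAD, MEM_LOAD_UOPS_RETIRED.LLC_HIT and MEM_LOAD_UOPS_RETIRED.LLC_MISS. The function name `memory_bound_breakdown` is hypothetical, and the counts in the usage example are made up.

```python
def memory_bound_breakdown(counts, llc_miss_weight=7.0):
    """Split load-pending stall cycles across cache levels.

    `counts` maps event names to raw counter values from one run.
    The equation forms and the default weight are assumptions to be
    verified against the Optimization Reference Manual.
    """
    clocks = counts["CPU_CLK_UNHALTED.THREAD"]
    stalls_ldm = counts["CYCLE_ACTIVITY.STALLS_LDM_PENDING"]
    stalls_l1d = counts["CYCLE_ACTIVITY.STALLS_L1D_PENDING"]
    stalls_l2 = counts["CYCLE_ACTIVITY.STALLS_L2_PENDING"]
    llc_hit = counts["MEM_LOAD_UOPS_RETIRED.LLC_HIT"]
    llc_miss = counts["MEM_LOAD_UOPS_RETIRED.LLC_MISS"]

    # Fraction of L2-pending stalls attributed to L3 hits; misses are
    # weighted more heavily because each one costs more cycles.
    weighted = llc_hit + llc_miss_weight * llc_miss
    l3_hit_fraction = llc_hit / weighted if weighted else 0.0

    return {
        # Stalled with a load pending, but no L1D miss outstanding.
        "l1_bound": (stalls_ldm - stalls_l1d) / clocks,
        # Stalled on an L1D miss that is not also an L2 miss.
        "l2_bound": (stalls_l1d - stalls_l2) / clocks,
        # L2-miss stalls split between L3 hits and memory accesses.
        "l3_bound": stalls_l2 * l3_hit_fraction / clocks,
        "mem_bound": stalls_l2 * (1.0 - l3_hit_fraction) / clocks,
    }


# Usage with made-up counts (not measurements):
example = {
    "CPU_CLK_UNHALTED.THREAD": 1000,
    "CYCLE_ACTIVITY.STALLS_LDM_PENDING": 500,
    "CYCLE_ACTIVITY.STALLS_L1D_PENDING": 300,
    "CYCLE_ACTIVITY.STALLS_L2_PENDING": 200,
    "MEM_LOAD_UOPS_RETIRED.LLC_HIT": 30,
    "MEM_LOAD_UOPS_RETIRED.LLC_MISS": 10,
}
for level, fraction in memory_bound_breakdown(example).items():
    print(f"{level}: {fraction:.1%}")
```

In this form the four fractions add up to STALLS_LDM_PENDING / CLOCKS, so the choice between the STALLS and CYCLES variants of the LDM event changes the total being split.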
I'm looking forward to your comments.
Thanks,