PMU Events for Ivy Bridge

Divino_C_

Hello guys!

I'm using OProfile and perf to profile some benchmarks, looking specifically for caching issues. I'm using the Intel SDM Volume 3 (March 2013) as my guide for choosing which events to monitor, but it has been a pain.

The computer I'm running the experiments on is an i7-3630QM (that is, Ivy Bridge), so in the manual I'm looking at Tables 19-1 and 19-5. The problem is: which events should I use to measure L1{D,I} cache events? What about L3 (LLC)? Honestly, the event descriptions in Table 19-5 are vaguer than usual.

Can anyone help with this?
César.
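
As a starting point before digging into the model-specific tables, perf's generic cache aliases can be tried. The sketch below is only an illustration: the helper names (`build_perf_command`, `parse_perf_csv`, `run`) are made up for this example, the event list is an assumption about what the kernel exposes on this CPU, and the CSV column layout varies between perf versions, which is why the parser looks the event name up by value rather than by position.

```python
import subprocess

# Generic perf cache event aliases. Whether each one is supported depends on
# the kernel and CPU, so treat this list as an assumption to verify.
CACHE_EVENTS = [
    "L1-dcache-loads",
    "L1-dcache-load-misses",
    "L1-icache-load-misses",
    "LLC-loads",
    "LLC-load-misses",
]


def build_perf_command(benchmark_argv, events=CACHE_EVENTS):
    """Build a `perf stat` command line that emits comma-separated counts."""
    return ["perf", "stat", "-x", ",", "-e", ",".join(events), "--"] + list(benchmark_argv)


def parse_perf_csv(text, events=CACHE_EVENTS):
    """Map event name -> count (None when perf could not count the event)."""
    wanted = set(events)
    counts = {}
    for line in text.splitlines():
        fields = line.strip().split(",")
        if len(fields) < 2:
            continue
        # Column layout differs between perf versions, so find the event
        # name by value instead of relying on a fixed column index.
        name = next((f for f in fields[1:] if f in wanted), None)
        if name is None:
            continue
        counts[name] = int(fields[0]) if fields[0].isdigit() else None
    return counts


def run(benchmark_argv):
    """Run the benchmark under perf stat; perf writes its counts to stderr."""
    proc = subprocess.run(build_perf_command(benchmark_argv),
                          capture_output=True, text=True)
    return parse_perf_csv(proc.stderr)
```

With the counts in hand, an L1D load miss ratio is simply `counts["L1-dcache-load-misses"] / counts["L1-dcache-loads"]`, provided both events were actually counted.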

Patrick_F_Intel1
Employee

Usually, to answer this question, I download and install VTune and see which metrics and events it uses.

In fact, that is what I would have to do to answer the question. Keeping track of events from chip to chip is one of the 'pains in the rear' that the VTune folks have to deal with. VTune may or may not have the metric you want, but it is the first place to check.

Pat

 

Divino_C_

Hi Patrick, thanks for your answers.

I did what you suggested. Although some of VTune's analyses don't work on my machine (e.g., Sandy/Ivy Bridge -> Memory Access ---> "Error: ... not applicable to current machine microarchitecture"), I've collected some event names, and I think one of them (or a combination) does what I want (measure L1, L2 and L3 cache hits/misses on Ivy Bridge). However, I have some questions:

1) Can I use *only* the two events below to account for all stalls caused by L1D/L2?

CYCLE_ACTIVITY.STALLS_L1D_PENDING  
CYCLE_ACTIVITY.STALLS_L2_PENDING 

2) I could not understand the description of the following event. What is an unknown data source?

"MEM_LOAD_UOPS_RETIRED.LLC_MISS_PS  --> Miss in last-level (L3) cache. Excludes Unknown data-source."
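
To try the two CYCLE_ACTIVITY events from question 1 under perf when no symbolic name is available, they can be passed as raw events. The sketch below packs an event select, unit mask and counter mask into perf's `rNNNN` form. The helper names are hypothetical, and the encodings in the table are quoted from memory as an assumption; check them against the Ivy Bridge event table in the SDM before relying on them.

```python
def raw_event(event, umask, cmask=0):
    """Pack an Intel core PMU event into perf's raw `rNNNN` form.

    Assumed layout: bits 0-7 event select, bits 8-15 unit mask,
    bits 24-31 counter mask (cmask).
    """
    for name, value in (("event", event), ("umask", umask), ("cmask", cmask)):
        if not 0 <= value <= 0xFF:
            raise ValueError(f"{name} must fit in 8 bits, got {value:#x}")
    return "r{:x}".format((cmask << 24) | (umask << 8) | event)


# (event select, umask, cmask). ASSUMED values, to be verified against the
# Ivy Bridge table in the SDM.
IVB_CYCLE_ACTIVITY = {
    "CYCLE_ACTIVITY.STALLS_L1D_PENDING": (0xA3, 0x0C, 12),
    "CYCLE_ACTIVITY.STALLS_L2_PENDING": (0xA3, 0x05, 5),
}


def perf_event_list(table=IVB_CYCLE_ACTIVITY):
    """Join the raw encodings into a string suitable for `perf stat -e`."""
    return ",".join(raw_event(*encoding) for encoding in table.values())
```

The resulting string can be given to `perf stat -e`, for example `perf stat -e "$(python3 -c ...)" ./benchmark`, or simply pasted by hand.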


Thank you again for your help.

Divino_C_

I found a pretty good explanation of how (and why) to measure L1, L2 and L3 "misses" on Ivy Bridge: subsection B.3.2.3, "Memory Bound Characterization", of the Optimization Reference Manual (July 2013 version).

However, I have some questions about the equations shown in this subsection. They account for the percentage of *CYCLES* due to "misses" at several levels of the cache hierarchy, right? Shouldn't these equations use CYCLE_ACTIVITY.CYCLES_LDM_PENDING instead of CYCLE_ACTIVITY.STALLS_LDM_PENDING?
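
For reference, here is a minimal sketch of how such a breakdown could be computed from raw counts. It assumes the equations have a nested-subtraction form (stalls with a load pending at one level minus stalls with a load pending at the next level, divided by total clocks). That form is an assumption to be checked against subsection B.3.2.3, as is the use of CPU_CLK_UNHALTED.THREAD as the clock count. The function name and the "beyond_l2" label are made up for this example.

```python
def memory_bound_breakdown(counts):
    """Return the fraction of cycles attributed to each cache level.

    `counts` maps event names to raw counter values collected over the
    same run. The formulas are an ASSUMED reading of the manual's
    equations, not a verified copy of them.
    """
    clocks = counts["CPU_CLK_UNHALTED.THREAD"]
    ldm = counts["CYCLE_ACTIVITY.STALLS_LDM_PENDING"]
    l1d = counts["CYCLE_ACTIVITY.STALLS_L1D_PENDING"]
    l2 = counts["CYCLE_ACTIVITY.STALLS_L2_PENDING"]
    if clocks <= 0:
        raise ValueError("clock count must be positive")
    return {
        # Stalled with any load outstanding.
        "memory_bound": ldm / clocks,
        # Stalled on loads that did not miss L1D.
        "l1_bound": (ldm - l1d) / clocks,
        # Stalled on loads that missed L1D but not L2.
        "l2_bound": (l1d - l2) / clocks,
        # Stalled on loads that missed L2 (L3 or memory).
        "beyond_l2": l2 / clocks,
    }
```

Splitting the last bucket into L3 and memory would additionally need an L3 hit/miss fraction, which this sketch leaves out.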

I'm looking forward to your comments.
Thanks, 
