Is it possible to correctly access the PEBS counters via the "perf" utility in Linux? For example, suppose that on Nehalem I want to count MEM_LOAD_RETIRED.L1D_HIT events (Code=0xcb, UMask=0x01), and a few other similar events, while running my program "prog". When I run perf as follows, I get some numbers (sometimes "scaled" if I request several events at once):
sudo perf stat -e r01cb ./prog
Are the numbers I get reasonably correct? I ask this because I am not sure I clearly understand the PEBS events.
PEBS (Precise Event Based Sampling) is a feature available for a subset of events. It allows the hardware to collect additional information very close to the exact time the configured event overflowed. This gives analysis tools substantially more accurate information, since the alternative is to wait for a software interrupt to collect this information, typically hundreds of cycles later. The additional information is stored in a special PEBS buffer and retrieved later by the tool (in this case perf_events).
To use it with perf, append ":pp" to the event code, like this:
sudo perf stat -e r01cb:pp ./prog
The results will typically be more precise if you are able to trace to a specific line of code, so perf stat wouldn't benefit from this precision, but perf record would.
Please let me know if you need additional details. Be sure to mention the kernel version and the specific CPU model number you are using (cat /proc/cpuinfo).
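As a minimal sketch of how the event strings used in this thread are put together: it assumes the usual x86 raw encoding, "r" followed by the umask and then the event code as two hex digits each, with ":p" or ":pp" appended as the precise modifier. The helper names raw_event and perf_record_argv are hypothetical and only build strings; nothing here runs perf.

```python
def raw_event(event_code, umask, precise=0):
    """Build a perf raw event string such as 'r01cb' or 'r01cb:pp'.

    Assumes the x86 raw format r<umask><event code>, two hex digits each.
    precise is the number of 'p' characters to append (0, 1 or 2, the
    levels used in this thread).
    """
    if not (0 <= event_code <= 0xFF and 0 <= umask <= 0xFF):
        raise ValueError("event code and umask must each fit in one byte")
    if precise not in (0, 1, 2):
        raise ValueError("precise must be 0, 1 or 2")
    name = f"r{umask:02x}{event_code:02x}"
    if precise:
        name += ":" + "p" * precise
    return name


def perf_record_argv(event, program_args):
    """Return the argument list for 'perf record -e <event> -- <program>'."""
    return ["perf", "record", "-e", event, "--", *program_args]


if __name__ == "__main__":
    # MEM_LOAD_RETIRED.L1D_HIT from the question: Code=0xcb, UMask=0x01
    event = raw_event(0xCB, 0x01, precise=2)
    print(" ".join(perf_record_argv(event, ["./prog"])))
```

The resulting argument list could be handed to subprocess.run, typically under sudo as in the commands above.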
[System: Intel Core i7, sandy bridge, cpu family 6, model 42; linux kernel 3.0.0-24-generic-pae / Ubuntu 11.10; perf version 3.0.38]
For PEBS events, perf stat gives some numbers even without the ":pp" or ":p" suffix. In the cases I tried, I did not see much difference between using ":pp" and not using it. Do you still recommend always using ":pp" to be safe?
Here is an example:
%> sudo perf stat -e r00c0,r01c0,r01c0:p,r01c0:pp ./prog 2
N= 100000000 : NumThreads= 2 : Time= 439.044 msec
Performance counter stats for './prog 2':
10,707,296,495 r00c0 [49.96%]
10,671,452,933 r01c0 [50.13%]
10,676,277,604 r01c0:p [25.11%]
10,704,588,878 r01c0:pp [25.01%]
2.192624454 seconds time elapsed
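A note on the bracketed percentages in the output above: they show the fraction of the run during which each counter was actually scheduled on the hardware. When more events are requested than can be counted at once, perf multiplexes them and scales the raw count by time_enabled / time_running. A minimal sketch of that arithmetic follows; the function name scale_count and the sample numbers are made up for illustration.

```python
def scale_count(raw_count, time_enabled, time_running):
    """Estimate the full-run count for a multiplexed counter.

    raw_count    -- events counted while the counter was on the hardware
    time_enabled -- time the event was enabled
    time_running -- time the event was actually being counted
    Returns None if the counter never ran, since no estimate is possible.
    """
    if time_running == 0:
        return None
    return round(raw_count * time_enabled / time_running)


if __name__ == "__main__":
    # Hypothetical counter that was scheduled for half of the run
    print(scale_count(1000, time_enabled=200, time_running=100))
```

A scaled value is an extrapolation, so two events that were counted during different slices of the run can differ slightly even if they measure the same thing.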
The PEBS and non-PEBS versions of an event will both produce very similar counts of the number of cycles (or any other event).
The added accuracy from PEBS applies to identifying the active "Instruction Pointer" which can be traced to a specific line of code. This is typically collected with 'perf record'.
In Linux perf, the PEBS events haven't been patched in. For example, I want to log MEM_UOPS_RETIRED.L2_HIT_LOADS_PS, but I only see MEM_UOPS_RETIRED.L2_HIT_LOADS patched in for KNL:
"PEBS": "1",
"EventCode": "0x04",
"Counter": "0,1",
"UMask": "0x2",
"EventName": "MEM_UOPS_RETIRED.L2_HIT_LOADS",
"SampleAfterValue": "200003",
"BriefDescription": "Counts the number of load micro-ops retired that hit in the L2",
"Data_LA": "1"
In that case, is it possible to share the event codes for these on KNL?
Also, why does the entry above have "PEBS" set to 1?
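One way to get a raw code out of an entry like the one quoted above is to read its "EventCode" and "UMask" fields directly. The sketch below does that under two assumptions that this thread does not confirm: that the raw format is r<umask><event code> as in the earlier commands, and that the _PS name refers to the same event code requested with a precise modifier such as ":pp". The function name raw_code_from_entry is hypothetical, and the entry is the fragment quoted above wrapped in braces.

```python
import json

# The KNL fragment quoted above, wrapped in braces so it parses as JSON.
KNL_ENTRY = """
{
  "PEBS": "1",
  "EventCode": "0x04",
  "Counter": "0,1",
  "UMask": "0x2",
  "EventName": "MEM_UOPS_RETIRED.L2_HIT_LOADS",
  "SampleAfterValue": "200003",
  "BriefDescription": "Counts the number of load micro-ops retired that hit in the L2",
  "Data_LA": "1"
}
"""


def raw_code_from_entry(entry):
    """Turn the EventCode and UMask strings of one entry into r<umask><code>."""
    event_code = int(entry["EventCode"], 16)
    umask = int(entry["UMask"], 16)
    return f"r{umask:02x}{event_code:02x}"


if __name__ == "__main__":
    entry = json.loads(KNL_ENTRY)
    # Assumption: the precise variant is the same code with ':pp' appended.
    print(entry["EventName"], raw_code_from_entry(entry) + ":pp")
```

Whether the kernel accepts the precise modifier for this event on KNL would still need to be checked on the machine itself.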