I am new to hardware counter measurements on intel. I want to measure L3 misses to local and remote DRAM on Intel Ivy bridge Model 62(Intel(R) Xeon(R) CPU E7-4860):
I need to understand the difference between OFFCORE_RESPONSE_0 and OFFCORE_RESPONSE_1 as both have the same sub-fields. Can I use either of these two to measure L3_LOCAL_MISSES and L3_REMOTE_MISSES? I am using PAPI low-level interface to add these two events and measure their counters for a section of my code. I have read/write access to MSR registers. Do I need to set up any lower level bits or anything related to offcore request?
OFFCORE_RESPONSE_0 and OFFCORE_RESPONSE_1 provide identical functionality. The reason that there are two of them is that these events are associated with a separate MSR that is used to program the types of requests/responses that you want to count (instead of being able to include this information in the Umask field of the PERFEVT_SELx MSR). The performance counter event OFFCORE_RESPONSE_0 (Event 0xB7) is associated with MSR 0x1A6, while the performance counter event OFFCORE_RESPONSE_1 (Event 0xBB) is associated with MSR 0x1A7.
So having two events (with different associated MSRs) allows you to count two different offcore response events at the same time.
I don't know much about PAPI. I can tell you something about the events. Yes, you should be able to count L3_LOCAL_MISSES in one of the offcore_response counters and L3_REMOTE_MISSES in the other offcore_response counter. Exactly what papi programs for these 2 events... I don't know.
You can look at https://download.01.org/perfmon/IVB/IvyBridge_matrix_V14.tsv and see the LLC_MISS.LOCAL_DRAM and LLC_MISS.ANY_DRAM events. After you program the appropriate PERFEVTSEL0/1 register, then the MSR_OFFCORE_RSP_0/1 register needs to be programmed. If you are going to use events from this tsv file, then the value in the MATRIX_VALUE column must be programmed into the 'extra' offcore_response MSR (MSR_OFFCORE_RSP_0/1... these are MSR num 0x1a6 and 0x1a7 respectively).