Software Tuning, Performance Optimization & Platform Monitoring
Discussion regarding monitoring and software tuning methodologies, Performance Monitoring Unit (PMU) of Intel microprocessors, and platform updating.

PMU Results Do Not Match



Recently, I have been trying to measure the memory bandwidth of each individual process, but perf gives me quite different results for two events:

       16837103171      OFFCORE_RESPONSE_0.L3_MISS_LOCAL                                   

        6183135358      LLC_MISSES 

CPU: Intel(R) Core(TM) i7-8700K CPU (Coffee Lake)

I think these two events should describe the same thing; correct me if I am wrong. Also, LLC_MISSES seems close to the value of uncore_imc/data_reads/.

BTW, if anyone has ideas about how to accurately measure read/write bandwidth, please tell me. I have been struggling with this for a while.


1 Reply
New Contributor III

The event uncore_imc/data_reads/ counts the number of reads from main memory for requests of any type. The count is summed over all memory channels. Regardless of the request type, each counted read operation reads 64 bytes of data from memory. You can multiply the difference between two readings of this counter by 64 and divide by the elapsed time to obtain the memory read bandwidth in bytes per unit of time.
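As a rough illustration of that arithmetic, here is a minimal Python sketch. It assumes you already have two raw readings of the counter and the time between them; the function name and the sample numbers are made up for illustration and are not part of perf. It also assumes the values are raw event counts, so if your tool already scales the output to a size unit, skip the multiplication by 64.

```python
CACHE_LINE_BYTES = 64  # each counted read transfers 64 bytes


def read_bandwidth_bytes_per_sec(count_start, count_end, elapsed_seconds):
    """Convert two raw counter readings into read bandwidth in bytes/second.

    count_start, count_end: raw event counts sampled at the start and end
    of the interval (hypothetical inputs, obtained however you read the
    counter). elapsed_seconds: length of the interval in seconds.
    """
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    delta = count_end - count_start
    if delta < 0:
        raise ValueError("counter went backwards; readings are not comparable")
    return delta * CACHE_LINE_BYTES / elapsed_seconds


if __name__ == "__main__":
    # Made-up readings taken 2 seconds apart.
    bw = read_bandwidth_bytes_per_sec(1_000_000, 51_000_000, 2.0)
    print(f"{bw / 1e9:.2f} GB/s")
```

The same calculation applies to a write counter if you have one. Note that this is a system-wide figure, not a per-process one.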

It's not clear to me what the events LLC_MISSES and OFFCORE_RESPONSE_0.L3_MISS_LOCAL are. The Linux perf tool doesn't have events with these exact names. Show the exact perf command you're using. I suspect you're using raw events, in which case the exact event codes are crucial.

OFFCORE_RESPONSE may overcount on Coffee Lake, but this may not fully explain the discrepancy. OFFCORE_REQUESTS (event B0H) is reliable on Coffee Lake. I don't know what you mean by LLC_MISSES, but assuming that it's an accurate event, it should be possible to add up one or more subevents of OFFCORE_REQUESTS to match LLC_MISSES.

Also show the uncore_imc/data_reads/ event count.
