Software Tuning, Performance Optimization & Platform Monitoring
Discussion regarding monitoring and software tuning methodologies, Performance Monitoring Unit (PMU) of Intel microprocessors, and platform updating.

Unusual value (extremely large) during monitoring

Zhuoran
Beginner
803 Views

Dear all,

 

I am trying to record the MEM_LOAD_UOPS_MISC_RETIRED.LLC_MISS using pcm-core (windows). But the data I record seems unusual as attached.
I tried several processor tests from phoronix. But I always get extremely large value during the test.

The command I used to record the counter value is 

pcm-core 0.001 -csv=test.csv -e cpu/umask=0x02,event=0xD4,name=MEM_LOAD_UOPS_MISC_RETIRED.LLC_MISS/

I set the time interval as 1 ms to record the event.

As shown in the test.png, there are two or three time points which have unusual large value. Most data of these time series is around 2000~4000. But the abnormal value is nearly 120000. I am not sure if it is correct or not. There is no other significant program running as it could be noise or a reason to the unusual value. So anybody knows what's the reason of that unusual value?

 

Best regards,

0 Kudos
1 Solution
McCalpinJohn
Honored Contributor III
803 Views

The value is plausible -- 120000 cache lines in 1 millisecond corresponds to 7.68 GB/s, which is achievable by a single core.

But it is still a suspiciously high number.   I would add another performance counter event to measure how much kernel activity is taking place in each interval.  Either cycles or instructions are reasonable, but cycles are perhaps easier to understand. 

Event 0xD4 is only used for MEM_LOAD_UOPS_RETIRED.LLC_MISS on second-generation Intel Core processors and the Intel Xeon E3-1200 processors.  

The Cycles Not Halted event is "architectural", so it should apply to most Intel processors.  I don't know how to set some of these bits by name in pmc-core, but the raw encodings would be 0x0043003c for CPU_CLK_UNHALTED.THREAD_P and 0x0042003c for the same measure, but limited to kernel-mode activities.   The kernel-mode cycles would have to be a large fraction (1/2 or more?) of the total cpu cycles in an interval with high LLC misses for a single kernel thread to generate miss counts this large in 1 millisecond.

View solution in original post

0 Kudos
2 Replies
McCalpinJohn
Honored Contributor III
804 Views

The value is plausible -- 120000 cache lines in 1 millisecond corresponds to 7.68 GB/s, which is achievable by a single core.

But it is still a suspiciously high number.   I would add another performance counter event to measure how much kernel activity is taking place in each interval.  Either cycles or instructions are reasonable, but cycles are perhaps easier to understand. 

Event 0xD4 is only used for MEM_LOAD_UOPS_RETIRED.LLC_MISS on second-generation Intel Core processors and the Intel Xeon E3-1200 processors.  

The Cycles Not Halted event is "architectural", so it should apply to most Intel processors.  I don't know how to set some of these bits by name in pmc-core, but the raw encodings would be 0x0043003c for CPU_CLK_UNHALTED.THREAD_P and 0x0042003c for the same measure, but limited to kernel-mode activities.   The kernel-mode cycles would have to be a large fraction (1/2 or more?) of the total cpu cycles in an interval with high LLC misses for a single kernel thread to generate miss counts this large in 1 millisecond.

0 Kudos
Zhuoran
Beginner
803 Views

Thank you very much Dr. Bandwidth! I just figured out why the unusual value happened. As I set 1 ms as time interval of recording, but the actually recording time interval varies around 1 ms - 3 ms. For the unusual time point, the time interval is 30 ms and that's why it produce extremely large data. It records the data in 30 ms.

Here is a previous post I asked "if we could record data in a 1 ms resolution", it appears that 1 ms resolution is too small for our cpu, but I don't know why sometimes it records such a large time interval (30 ms). 

https://software.intel.com/en-us/comment/1942571#comment-1942571

McCalpin, John (Blackbelt) wrote:

The value is plausible -- 120000 cache lines in 1 millisecond corresponds to 7.68 GB/s, which is achievable by a single core.

But it is still a suspiciously high number.   I would add another performance counter event to measure how much kernel activity is taking place in each interval.  Either cycles or instructions are reasonable, but cycles are perhaps easier to understand. 

Event 0xD4 is only used for MEM_LOAD_UOPS_RETIRED.LLC_MISS on second-generation Intel Core processors and the Intel Xeon E3-1200 processors.  

The Cycles Not Halted event is "architectural", so it should apply to most Intel processors.  I don't know how to set some of these bits by name in pmc-core, but the raw encodings would be 0x0043003c for CPU_CLK_UNHALTED.THREAD_P and 0x0042003c for the same measure, but limited to kernel-mode activities.   The kernel-mode cycles would have to be a large fraction (1/2 or more?) of the total cpu cycles in an interval with high LLC misses for a single kernel thread to generate miss counts this large in 1 millisecond.

0 Kudos
Reply