Software Tuning, Performance Optimization & Platform Monitoring
Discussion regarding monitoring and software tuning methodologies, Performance Monitoring Unit (PMU) of Intel microprocessors, and platform updating.

Counting Offcore_Response Events Xeon Phi KNL

Adam_S_5
Beginner

I am having difficulty retrieving offcore response event values on a development Knights Landing unit (CentOS 7).

As an example, in an attempt to measure all events (bits [15] and [16]) I have tried:

perf stat -e cpu/event=0xb7,umask=0x1,offcore_rsp=0x18000/ -a <command>

The counter returns 0. I have also tried other offcore_rsp codes (from the Xeon Phi PM Reference Manual) and the codes from the current ocperf.py (which produces slightly different hex codes, so one of us is encoding incorrectly), without any luck. Other event/umask combinations I am able to read successfully.
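
As a sanity check on the encoding, the offcore_rsp value above can be rebuilt from its bit positions with shell arithmetic. This is only a sketch: bits_to_mask is a hypothetical helper name, and it verifies the arithmetic, not whether bits 15 and 16 are the right ones to set on KNL.

```shell
#!/bin/bash
# bits_to_mask (hypothetical helper): OR together the given bit positions
# and print the result as a hex mask.
bits_to_mask() {
    local mask=0 bit
    for bit in "$@"; do
        mask=$(( mask | (1 << bit) ))
    done
    printf '0x%x\n' "$mask"
}

# Bits 15 and 16, as in the perf stat command above.
bits_to_mask 15 16
```

This prints 0x18000, so the hex value in the command matches the intended bits. Any disagreement with ocperf.py would therefore come from which bits are selected rather than from the conversion.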

Are the offcore counters enabled on the developer chips?  Any suggestions I could try?  The offcore events seem really useful!

 

Thanks,

Adam

6 Replies
McCalpinJohn
Honored Contributor III

The documentation for the "perf stat" syntax is incorrect.

Here is the script that I use:

#!/bin/bash

# This is just to remind me of the correct syntax to use perf stat
# to access the offcore response counters on KNL
# This example uses both counters to access offcore response events.
# The first counts all transactions to MCDRAM
# The second counts all transactions to DDR4 DRAM

perf stat -e cpu/config=0x004301b7,config1=0x0000003F80608000/ -e cpu/config=0x004302b7,config1=0x0000003F81808000/ $*

 

Adam_S_5
Beginner

Thank you for that!  Now I think I understand the full bit configurations.  Unfortunately, still getting zeros!  I am also unable to read EDC_Hit/Miss counters (running in cache mode).

Running 3.10.0-327.28.2.el7.x86_64 (the latest CentOS 7 kernel, I believe) with the matching perf version. I have perf_event_paranoid=-1. Quadrant mode.

Any other ideas?  

McCalpinJohn
Honored Contributor III

For the offcore response counters it looks like this command only works on the version of the kernel that Intel provided (3.10.0-327.el7.centos.mpsp_1.3.1.45.x86_64). With the mainline CentOS kernel (3.10.0-327.22.2.el7.x86_64) I also get zeros as results. With a bit more checking, it appears that the mainline kernel is not actually programming the "config1" bits into MSR 0x1a6 (or MSR 0x1a7), while the Intel-provided kernel is programming the auxiliary MSR correctly. I have been staring at the 3.10.0-327.22.2 kernel source tree for a while, but I can't find anything useful in it.
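
One way to repeat this kind of check is to read the auxiliary MSR while a measurement is active and compare it with the requested config1 value. The sketch below assumes that the msr kernel module is loaded and that rdmsr from msr-tools is installed (neither is mentioned in this thread). msr_matches is a hypothetical helper name, and CPU 0 is an arbitrary choice.

```shell
#!/bin/bash
# msr_matches (hypothetical helper): compare a value read from an MSR with
# the config1 value that was requested. Both arguments are hex; a leading
# "0x" is optional, since rdmsr prints plain hex digits without a prefix.
msr_matches() {
    local observed=$(( 16#${1#0x} ))
    local expected=$(( 16#${2#0x} ))
    [ "$observed" -eq "$expected" ]
}

# Only attempt the hardware read when rdmsr and the msr device are present.
# Run as root while perf stat is counting on this CPU (for example with -a),
# since the kernel is only expected to program the MSR for an active event.
if command -v rdmsr >/dev/null 2>&1 && [ -r /dev/cpu/0/msr ]; then
    if value=$(rdmsr -p 0 0x1a6 2>/dev/null); then
        if msr_matches "$value" 0x0000003F80608000; then
            echo "MSR 0x1a6 holds the requested config1 value"
        else
            echo "MSR 0x1a6 holds 0x$value, not the requested config1 value"
        fi
    fi
fi
```

A mismatch while the event is running would point to the same symptom described above, where the kernel accepts config1 but never writes it to the hardware.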

For the EDC UCLK counters, you may have the same problem that we have on our systems. For some reason the OS does not believe that these PCI Configuration Space devices have the full 4096 Bytes of content, so it restricts access to the first 256 Bytes. (We see this on both the Intel-provided mpsp kernel and the mainline CentOS kernel.) The accesses are not being blocked by the BIOS -- you can still read and write to the extended PCI configuration space area by directly accessing the corresponding addresses (either in the kernel or in a root-privileged user mode program that mmaps PCI configuration space via /dev/mem).
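
For the direct-access route, the physical address of a register in extended configuration space follows the standard PCIe ECAM (MMCONFIG) layout, in which each function gets a 4096-byte window at base + (bus << 20) + (device << 15) + (function << 12). The sketch below only does that address arithmetic. The base address, the bus/device/function numbers, and the register offset are placeholders, not the real EDC values, and ecam_address is a hypothetical helper name. On Linux the MMCONFIG base usually appears in /proc/iomem or in the ACPI MCFG table.

```shell
#!/bin/bash
# ecam_address (hypothetical helper): print the physical address of a PCI
# configuration register under the PCIe ECAM (MMCONFIG) layout.
#   $1 = MMCONFIG base, $2 = bus, $3 = device, $4 = function, $5 = offset
ecam_address() {
    local base=$(( $1 )) bus=$(( $2 )) dev=$(( $3 )) fn=$(( $4 )) off=$(( $5 ))
    # Offsets 0x100-0xfff are the extended area that is not reachable
    # through the legacy 256-byte window.
    if (( bus > 255 || dev > 31 || fn > 7 || off > 0xfff )); then
        echo "ecam_address: argument out of range" >&2
        return 1
    fi
    printf '0x%x\n' $(( base + (bus << 20) + (dev << 15) + (fn << 12) + off ))
}

# Placeholder values for illustration only.
ecam_address 0x80000000 1 2 3 0x400
```

The resulting address is what a root-privileged program would map from /dev/mem (after rounding down to a page boundary) to reach registers beyond the first 256 bytes.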
 

Adam_S_5
Beginner

I never would have figured that out, thanks for testing!  I have no trace of that kernel or anything else related to Intel's MPSP anywhere on my computer, nor can I find anything online about it...currently working through my vendor to try to get a hold of it.

And good to know that the EDC counters are actually a separate issue.  I'll give this a try some time soon and may be back with questions :)

Daejin_J_
Beginner

Hello,

I would like to collect all samples of the RxR_Occupancy.IRQ event counts on KNL. First, I want to know what the config value means in the perf tool. In your example, I think "01" is the umask and "B7" is the event code. Is that right? But what is the number "43"? I would like to know where these numbers are documented.

Also, could you let me know how to read RxR_Occupancy.IRQ events for each CHA? When I checked the manuals, I didn't find an event code or umask for the 38 CHA components, even though I found a related MSR register like "PERF_EVT_SEL_0_CHA_x".

Finally, I want to collect raw RxR_Occupancy.IRQ data for all CHAs at a fixed sampling interval (100 ns to 1 us).

Do you have any ideas?

Sincerely,

Thank you.

 

McCalpinJohn
Honored Contributor III

In my example

perf stat -e cpu/config=0x004301b7,config1=0x0000003F80608000/ -e cpu/config=0x004302b7,config1=0x0000003F81808000/ $*

The "config" fields are the full raw bit fields to be programmed into MSRs 0x186 IA32_PERFEVTSEL0 and 0x187 IA32_PERFEVTSEL1.   The "43" in the middle corresponds to bit 22 (enable the counter), bit 17 (count while the processor is in kernel mode), and bit 16 (count while the processor is in user mode).
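
The decoding described above can be checked with a few shifts and masks. This is a minimal sketch that assumes the standard IA32_PERFEVTSELx layout (event select in bits 7:0, unit mask in bits 15:8, plus the three flag bits named above). decode_config is a hypothetical helper name.

```shell
#!/bin/bash
# decode_config (hypothetical helper): split a raw perf "config" value into
# the IA32_PERFEVTSELx fields discussed above.
decode_config() {
    local cfg=$(( $1 ))
    printf 'event=0x%02x umask=0x%02x usr=%d os=%d en=%d\n' \
        $(( cfg & 0xff )) \
        $(( (cfg >> 8) & 0xff )) \
        $(( (cfg >> 16) & 1 )) \
        $(( (cfg >> 17) & 1 )) \
        $(( (cfg >> 22) & 1 ))
}

# The two config values from the perf stat example.
decode_config 0x004301b7
decode_config 0x004302b7
```

Both values decode to event 0xb7 with the user, kernel, and enable bits set. They differ only in the umask (0x01 versus 0x02), which selects the first or second offcore response counter.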

These "config" fields are only relevant for "cpu" counters -- a completely different infrastructure is used for the "untile" counters in the CHA box that include the RxR_OCCUPANCY.IRQ event.  With the "perf stat" command, the specification of the event (using the "-e" flag) would have to refer to the target CHA box.   I have not been able to access the CHA box counters using "perf stat" -- I don't know if I am making mistakes in the "perf stat" command or if this box is not really supported by my revision of the OS (xpssl 1.4.1).
