Software Tuning, Performance Optimization & Platform Monitoring
Discussion regarding monitoring and software tuning methodologies, Performance Monitoring Unit (PMU) of Intel microprocessors, and platform updating.

hardware performance counters

NPund
New Contributor I

Hi all,

Thank you for all the previous help as I am still new to performance counters.

 

I understand that there are 8 general-purpose counters and 3 fixed counters per core.

I can use rdmsr to read the counter values at addresses 0xc1 and 0xc2 (general-purpose) and 0x309, 0x30a, and 0x30b (fixed).

I also tried Intel PCM from GitHub, but it only supports a specific set of hardware events.

So my questions are:

1. How can I map a specific hardware event to one of the general-purpose counters and read it?

2. wrmsr doesn't work on Ubuntu 16; it fails with "pwrite: operation not permitted". Is there any workaround?

3. Can perf pin an event to a specific counter register?

 

I have a Kaby Lake processor.

 

Thanks

Thomas_G_4
New Contributor II

Hi,

There are 8 general-purpose counters per core when Hyper-Threading is disabled. When it is enabled, each SMT thread has 4 general-purpose counters. There are always 3 fixed counters, regardless of whether Hyper-Threading is enabled.

1.)
Each general-purpose counter consists of a pair of registers: one to configure the event and one to count. The configuration registers start at 0x186 and the counter registers start at 0x0C1 (the second pair is 0x187 and 0x0C2, and so forth). So to map an event to the first general-purpose counter, configure the event at 0x186 and read the result from 0x0C1. There is also a global control register that covers all core-local counters (0x38F), an overflow status register (0x38E), and an overflow control register (0x390).
For the fixed-purpose counters you also need their control register (0x38D). There you start, stop, and configure the fixed counters. The configuration options are quite limited.
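
To illustrate the register pairing described above, here is a minimal Python sketch (not part of the original reply). It assumes a Linux system with the msr kernel module loaded and root privileges. The helper names `gp_counter_msrs`, `fixed_counter_msr`, and `read_msr` are made up for this example; only the MSR addresses come from the text.

```python
import os
import struct

# MSR addresses mentioned in this thread.
PERFEVTSEL0 = 0x186         # first event-select (configuration) register
PMC0 = 0x0C1                # first general-purpose counter register
FIXED_CTR0 = 0x309          # first fixed counter register
FIXED_CTR_CTRL = 0x38D      # control register for the fixed counters
GLOBAL_STATUS = 0x38E       # overflow status register
GLOBAL_CTRL = 0x38F         # global enable register
GLOBAL_OVF_CTRL = 0x390     # overflow control register


def gp_counter_msrs(index, num_counters=8):
    """Return (config_msr, counter_msr) for general-purpose counter `index`."""
    if not 0 <= index < num_counters:
        raise ValueError(f"counter index {index} out of range")
    return PERFEVTSEL0 + index, PMC0 + index


def fixed_counter_msr(index):
    """Return the MSR address of fixed counter `index` (0, 1, or 2)."""
    if not 0 <= index < 3:
        raise ValueError(f"fixed counter index {index} out of range")
    return FIXED_CTR0 + index


def read_msr(cpu, address):
    """Read one 64-bit MSR through /dev/cpu/<cpu>/msr (hypothetical helper).

    Assumes `modprobe msr` has been run and the caller is root.
    """
    fd = os.open(f"/dev/cpu/{cpu}/msr", os.O_RDONLY)
    try:
        # The msr driver uses the file offset as the MSR address.
        return struct.unpack("<Q", os.pread(fd, 8, address))[0]
    finally:
        os.close(fd)
```

For example, `gp_counter_msrs(1)` gives the pair `(0x187, 0x0C2)`, and `read_msr(0, PMC0)` would read the first general-purpose counter of CPU 0. With Hyper-Threading enabled, pass `num_counters=4`.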

 

2.)
Normally root is allowed to read and write the registers, but a change in distribution kernels affects writes through the msr kernel device driver: when the system is booted in "Secure Boot" mode, writes to MSRs are not allowed. I don't know whether this patch is already present in Ubuntu 16 (Link; it is in Ubuntu 17). The "vanilla" kernel does not include this patch (Link).
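
To narrow down why a write fails, it can help to look at the errno that pwrite returns. The sketch below is only a rough diagnostic aid, not a workaround. The function names and the hint texts are assumptions made for this example, and the real cause on a given system may differ.

```python
import errno
import os
import struct


def explain_msr_write_error(err):
    """Map an errno from the msr device to a likely cause (best guess only)."""
    hints = {
        errno.ENOENT: "msr device missing; try 'modprobe msr'",
        errno.EACCES: "no permission on the device file; run as root",
        errno.EPERM: "write rejected by the kernel; check Secure Boot "
                     "and whether the process has CAP_SYS_RAWIO",
        errno.EIO: "the CPU rejected the access; check the MSR address "
                   "and the value being written",
    }
    return hints.get(err, f"unexpected error: {os.strerror(err)}")


def try_write_msr(cpu, address, value):
    """Attempt one MSR write; return None on success or a hint on failure."""
    try:
        fd = os.open(f"/dev/cpu/{cpu}/msr", os.O_WRONLY)
    except OSError as exc:
        return explain_msr_write_error(exc.errno)
    try:
        # The file offset selects the MSR; the payload is 8 bytes.
        os.pwrite(fd, struct.pack("<Q", value), address)
        return None
    except OSError as exc:
        return explain_msr_write_error(exc.errno)
    finally:
        os.close(fd)
```

The "pwrite: operation not permitted" message from the question corresponds to `errno.EPERM`, which is the case the Secure Boot restriction would produce.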

3.)
I haven't found a way to specify which registers perf should use on x86 systems. For PowerPC there is a range in the config field of the perf_event_attr struct to specify the register pair.
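
Even though perf picks the counter register itself, it can still count an arbitrary event given as a raw code. The following Python sketch builds such a raw event string; the helper names are made up for this example, and the umask 0x01 used in the usage note is an assumption that should be checked against the event list for your processor.

```python
def perf_raw_event(event, umask=0):
    """Return perf's raw-event spelling (rUUEE) for an x86 event and umask."""
    if not 0 <= event <= 0xFF:
        raise ValueError("event code must fit in 8 bits")
    if not 0 <= umask <= 0xFF:
        raise ValueError("umask must fit in 8 bits")
    # Same layout as the low 16 bits of the event-select register:
    # event code in bits 0-7, umask in bits 8-15.
    config = (umask << 8) | event
    return f"r{config:04x}"


def perf_stat_command(event, umask, workload):
    """Build an argv list for 'perf stat' counting one raw event."""
    return ["perf", "stat", "-e", perf_raw_event(event, umask), "--", *workload]
```

For example, `perf_stat_command(0x85, 0x01, ["ls"])` yields the argument list for `perf stat -e r0185 -- ls`, which can be passed to `subprocess.run`.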

Best,
Thomas

NPund
New Contributor I

Hello Thomas,

Thanks for the reply, it was very helpful. I have a few more basic questions.

1. What are the registers at 0x186, 0x187, ... called, so that I can read more about them in the developer's manual?

2. I have configured 0x186 to monitor a specific event, let's say EVENT_ITLB_MISSES (0x85). But how do I start the counter? Is there another register that starts the counting?

Thanks a lot

Nitin

Thomas_G_4
New Contributor II

Hi Nitin,

1.) They are called IA32_PERFEVTSELx, or just PERFEVTSELx (x in 0-7). You can find them in the Architectural MSRs section (SDM, Vol. 4, Chapter 2.1).

2.) That chapter of the SDM describes the bits and bit ranges of these registers. For EVENT_ITLB_MISSES you write the event code 0x85 into bits 0-7. If the event can be further specified by a umask, put the umask in bits 8-15. To start counting, set bit 22 (the enable bit); to stop counting, clear it again. Other bits in the register matter too: bit 16 (USR) and bit 17 (OS) select whether user-mode and kernel-mode activity is counted, and at least one of them must be set for the counter to count anything.
In my previous post I mentioned the global control register that covers all counters. It has one enable bit per counter, so enable the counter you want there too! If you have exclusive access to the registers, you can set bit 22 when setting up the event configuration and then use the global control register to start and stop all counters with one operation. This global control register is called IA32_PERF_GLOBAL_CTRL, the overflow status register is IA32_PERF_GLOBAL_STATUS, and the overflow control register is IA32_PERF_GLOBAL_OVF_CTRL or IA32_PERF_GLOBAL_STATUS_RESET.
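
The bit layout can be sketched in Python as follows. This is an illustration only: the function names are made up, the umask 0x01 in the usage note is an assumption to verify against the event list for your processor, and the USR and OS bits (16 and 17) are taken from the SDM's description of the event-select register.

```python
USR_BIT = 1 << 16   # count while the CPU runs in user mode
OS_BIT = 1 << 17    # count while the CPU runs in kernel mode
EN_BIT = 1 << 22    # per-counter enable bit


def perfevtsel_value(event, umask=0, user=True, kernel=True, enable=True):
    """Encode a value for an IA32_PERFEVTSELx register."""
    if not 0 <= event <= 0xFF or not 0 <= umask <= 0xFF:
        raise ValueError("event and umask must each fit in 8 bits")
    value = event | (umask << 8)      # event in bits 0-7, umask in bits 8-15
    if user:
        value |= USR_BIT
    if kernel:
        value |= OS_BIT
    if enable:
        value |= EN_BIT
    return value


def global_ctrl_mask(gp_counters=(), fixed_counters=()):
    """Build an IA32_PERF_GLOBAL_CTRL value enabling the given counters."""
    mask = 0
    for index in gp_counters:
        mask |= 1 << index            # general-purpose counters start at bit 0
    for index in fixed_counters:
        mask |= 1 << (32 + index)     # fixed counters start at bit 32
    return mask
```

With these helpers, `hex(perfevtsel_value(0x85, 0x01))` is `'0x430185'`, the value to write to 0x186, and `global_ctrl_mask([0])` is `1`, the value to write to 0x38F to enable only the first general-purpose counter. Writing 0 to 0x38F stops all counters at once.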

Best,
Thomas
