Daniele_C_
Beginner
1,773 Views

Monitoring C1 and C1E core c-state

Hi,

I would like to ask whether it is possible to monitor the C1 and/or C1E core C-states. Currently, I can monitor only the C3/C6 core C-states and the C2/C3/C6 package C-states. I'm using the following MSRs to read the residency:

Core c-state residency
MSR_CORE_C3_RESIDENCY    0x3FC
MSR_CORE_C6_RESIDENCY    0x3FD

Package c-state residency
MSR_PKG_C2_RESIDENCY    0x60D
MSR_PKG_C3_RESIDENCY    0x3F8
MSR_PKG_C6_RESIDENCY    0x3F9

I'm working on a server equipped with two Intel Xeon CPU E5-2630 v3 @ 2.40GHz (Haswell).

I tried to use MSR_CORE_C1_RESIDENCY (0x660), but I cannot read it on my CPU model. My goal is to trace all the idle C-states.
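For readers who want to reproduce the measurement, here is a minimal sketch of how these residency MSRs could be sampled on Linux. It assumes the `msr` kernel module is loaded and that the script runs as root; the helper names `read_msr` and `residency_delta` and the `dev_pattern` parameter are illustrative, not part of any Intel tool. The MSR addresses are the ones listed above.

```python
import os
import struct

# MSR addresses quoted in the post above.
RESIDENCY_MSRS = {
    "MSR_CORE_C3_RESIDENCY": 0x3FC,
    "MSR_CORE_C6_RESIDENCY": 0x3FD,
    "MSR_PKG_C2_RESIDENCY": 0x60D,
    "MSR_PKG_C3_RESIDENCY": 0x3F8,
    "MSR_PKG_C6_RESIDENCY": 0x3F9,
}


def read_msr(cpu, address, dev_pattern="/dev/cpu/{}/msr"):
    """Read one 64-bit MSR through the Linux msr character device.

    The device is addressed by file offset: reading 8 bytes at offset
    `address` returns that MSR for logical CPU `cpu`. An unsupported MSR
    typically makes the read fail with an OSError.
    """
    fd = os.open(dev_pattern.format(cpu), os.O_RDONLY)
    try:
        raw = os.pread(fd, 8, address)
    finally:
        os.close(fd)
    return struct.unpack("<Q", raw)[0]


def residency_delta(before, after):
    """Difference of two counter samples, tolerating 64-bit wraparound."""
    return (after - before) & 0xFFFFFFFFFFFFFFFF
```

Sampling a counter twice with `read_msr(0, RESIDENCY_MSRS["MSR_CORE_C6_RESIDENCY"])` and passing both values to `residency_delta` gives the ticks spent in that state over the interval. As far as I know these counters tick at the TSC frequency, so dividing by the TSC delta over the same interval should give a residency fraction.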

Thanks in advance,
Daniele Cesarini

4 Replies
McCalpinJohn
Black Belt

I don't know of any hardware counters for Core C1 residency, but this is a software-controlled state, so (in theory) it is possible to monitor it in the OS. Section 8.10.6.4 of Volume 3 of the Intel Architectures SW Developer's Manual (document 325384, revision 058) discusses the use of the MONITOR and MWAIT instructions in the kernel's C1 idle loop. (I don't know whether all recent operating systems use MONITOR/MWAIT instead of HALT, but I suspect that they do. I am not sure whether this is also used with HyperThreading, e.g., Section 8.10.6.3.)

C1E is only defined as a package state, and there is a performance counter event in the PCU of the Uncore to count package C1E residency.  Look for the "Intel Xeon Processor E5 and E7 v3 Family Uncore Performance Monitoring Reference Manual", document 331051.  My version is revision 002 from June 2015.   Section 2.8 describes the PCU events.

wangxiaohui
Beginner

Hi John, can you send me the URL for the Intel C_states Residency Monitor?

Thank you !

McCalpinJohn
Black Belt

In a minor correction to the esteemed Dr. Bandwidth of 2016, I would like to note that "C1E" can refer to either a core C1E state (CC1E) or a package state (PC1E -- not to be confused with PCIe!).  These C-states are all model-dependent, so they don't get a lot of documentation.

In the Linux OS, the Intel "cpuidle" driver provides interfaces to core C1E via the MONITOR/MWAIT mechanism.  If the Intel cpuidle driver is active, the system will typically create sysfs files to interact with the driver -- e.g. /sys/devices/system/cpu/cpu0/cpuidle/state?/*

On a Xeon Platinum 8280 system with the cpuidle driver running, there are four "state" directories, for states 0, 1, 2, 3. Each of those directories has seven files that provide information about the characteristics of the state and the usage of the state. For this system:

  • state0 "POLL" (cpu spin-waits because it is expected to get more work very soon -- < 10 microseconds)
  • state1 "C1-SKX" -- default behavior of MWAIT with argument EAX=0x00 -- has 2 microsecond wakeup latency
  • state2 "C1E-SKX" -- MWAIT with argument EAX=0x01 -- has 10 microsecond wakeup latency and drops core to maximum efficiency frequency
  • state3 "C6-SKX" -- MWAIT with argument EAX=0x20 -- has 133 microsecond wakeup latency and turns off core (allowing more power to other cores)
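As a rough illustration of how those sysfs files could be read, here is a small sketch. It assumes the standard Linux cpuidle sysfs layout, in which each state directory contains `name`, `latency` (microseconds), `usage` (number of entries into the state) and `time` (total microseconds spent in the state). The function name `read_cpuidle_states` and the `sysfs_root` parameter are made up for this example.

```python
from pathlib import Path


def read_cpuidle_states(cpu=0, sysfs_root="/sys/devices/system/cpu"):
    """Return one dict per cpuidle state of a logical CPU, ordered by index."""
    base = Path(sysfs_root) / f"cpu{cpu}" / "cpuidle"
    state_dirs = sorted(base.glob("state[0-9]*"), key=lambda p: int(p.name[5:]))
    states = []
    for state_dir in state_dirs:
        def field(name, d=state_dir):
            # Each sysfs attribute is a one-line text file.
            return (d / name).read_text().strip()

        states.append({
            "index": int(state_dir.name[5:]),
            "name": field("name"),
            "latency_us": int(field("latency")),
            "usage": int(field("usage")),
            "time_us": int(field("time")),
        })
    return states
```

Calling `read_cpuidle_states(0)` twice and subtracting the `time_us` values per state gives a software-side estimate of C1 and C1E residency over the interval. This is what the OS requested, which may differ from what the hardware actually entered.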

I am not absolutely certain where the Linux folks got their information, but it is out there in the published drivers.

"Package C1E" is a state set by the hardware -- if all cores are in the C1 state (or higher-numbered states), the hardware will automatically drop their frequency to the maximum efficiency frequency (and will typically drop the uncore frequency to the minimum as well).   This feature is controlled by bit 1 of MSR 0x1FC MSR_POWER_CTL, as described for several processors in the relevant sections of Volume 4 of the Intel Architectures SW Developer's Manual (document 335592).

HadiBrais
New Contributor III

There is no counter for CC1 residency on Haswell. It's also possible for the hardware to automatically enter CC1 as discussed in Section 4.2.4 of the processor's datasheet. This is called C1 auto-demotion and can be disabled by setting MSR_PKG_CST_CONFIG_CONTROL[26] to zero. The manual says that this feature is disabled by default, but it's enabled on my Haswell system, probably by the BIOS.

There is also no counter for CC1E residency on Haswell (or any other microarchitecture). Moreover, if MSR_POWER_CTL[1] is set to one, CC1 can be automatically promoted to CC1E by the hardware. It's not clear to me whether this is enabled by default, but on my system, it's disabled.
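To check how a particular system is configured, the two control bits mentioned in this thread can be decoded from raw MSR values. The sketch below only does the bit tests. It assumes the values were already read by some other means (for example the Linux msr device or the `rdmsr` utility). The address 0xE2 for MSR_PKG_CST_CONFIG_CONTROL is not given in this thread; it is my assumption based on the SDM and should be verified for your processor. The function names are illustrative.

```python
MSR_POWER_CTL = 0x1FC               # address given earlier in this thread
MSR_PKG_CST_CONFIG_CONTROL = 0xE2   # assumed address, verify in SDM Volume 4

C1E_ENABLE_BIT = 1                  # MSR_POWER_CTL[1]
C1_AUTO_DEMOTION_BIT = 26           # MSR_PKG_CST_CONFIG_CONTROL[26]


def bit_is_set(value, bit):
    """True if bit number `bit` (0 = least significant) is set in `value`."""
    return bool((value >> bit) & 1)


def describe_c1_controls(power_ctl, pkg_cst_config):
    """Decode the C1E and C1 auto-demotion enables from raw MSR values."""
    return {
        "c1e_enabled": bit_is_set(power_ctl, C1E_ENABLE_BIT),
        "c1_auto_demotion_enabled": bit_is_set(pkg_cst_config, C1_AUTO_DEMOTION_BIT),
    }
```

For example, `describe_c1_controls(0x2, 1 << 26)` reports both features as enabled, while `describe_c1_controls(0, 0)` reports both as disabled.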