Software Tuning, Performance Optimization & Platform Monitoring
Discussion regarding monitoring and software tuning methodologies, Performance Monitoring Unit (PMU) of Intel microprocessors, and platform updating.

RAPL: does domain energy include subdomain or not?

BenKo
Beginner
1,073 Views

Hi, I'd like a definitive answer on whether the package energy (domain energy) reported by the RAPL interface includes the NUMA energy (subdomain) or not.

My reading of the Software Developer's Manual is that package energy is the energy used on the CPU die and memory energy relates to the DIMMs. So to get total energy or power, the package and NUMA values have to be added up.

However, there is also wording about a hierarchy, which may imply inclusion. And tools like powerstat report power as the sum of CPU power only.

Thank you!

3 Replies
McCalpinJohn
Honored Contributor III

Package energy is intended to include all of the energy used by the package, including core, uncore, graphics, and the on-chip interfaces to memory and IO.   For the Xeon Phi x200 processors, it also includes the energy consumption of the in-package MCDRAM memory.

"NUMA" is not a term that appears in Intel's RAPL documentation (Section 14.9 of Volume 3 of the Intel Architectures SW Developer's Manual, document 325384-070, May 2019).   The domains listed are Package, PP0, PP1, and DRAM.   Both PP0 and PP1 (if they exist on a platform) are subdomains of Package, while DRAM is independent of Package.
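The hierarchy described above can be sketched as a small calculation. This is only an illustration under stated assumptions: the function name, the dictionary layout, and the sample values are hypothetical, and only the domain relationships (PP0 and PP1 inside Package, DRAM independent) come from the text.

```python
# Sketch: combine RAPL domain readings for one socket without double counting.
# The function name, dict keys, and numbers are hypothetical illustrations.

def total_energy_joules(readings):
    """Return Package + DRAM energy for one socket.

    PP0 and PP1 are subdomains of Package, so their energy is already
    part of the Package figure and is deliberately not added again.
    DRAM is independent of Package and is added when present.
    """
    total = readings["package"]
    total += readings.get("dram", 0.0)
    return total


socket0 = {"package": 100.0, "pp0": 70.0, "dram": 20.0}  # made-up values
print(total_energy_joules(socket0))  # 120.0
```

For a multi-socket system the same sum would be repeated per socket and the results added.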

It is important to remember that RAPL is intended to be accurate enough to use for processor performance and thermal management -- not accurate enough to use in place of laboratory measurements.   An implementation might deviate from the target definitions if the difference is small enough to be unimportant for processor performance management.   For example, the Xeon Scalable processors have five different input power voltage planes with different voltages and currents.  The currents on each of these voltage planes may be measured independently, perhaps with different degrees of accuracy.   Different types of measurement errors also have different degrees of importance, with maximum current typically being the most critical to get right.

BenKo
Beginner

Thank you, John. I obviously meant DRAM. On a dual socket Skylake I'm seeing package-0 and package-1 and inside each dram0 and dram1 respectively (although rapl:0 contains package-1). I don't have PP0 and PP1.
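For reference, a minimal sketch of how such a layout can be listed on Linux, assuming the powercap sysfs interface (`/sys/class/powercap/intel-rapl:*` directories, each with `name` and `energy_uj` files). The function name is hypothetical. It reads each zone's `name` file instead of trusting the zone index, which matters when the numbering does not match the package number, as with rapl:0 containing package-1 here.

```python
import glob
import os


def read_rapl_zones(root="/sys/class/powercap"):
    """Return {zone_directory: (name, energy_uj)} for each intel-rapl zone.

    Top-level zones look like intel-rapl:0; subzones like intel-rapl:0:0.
    The label comes from the zone's `name` file, not from its index.
    """
    zones = {}
    for path in sorted(glob.glob(os.path.join(root, "intel-rapl:*"))):
        try:
            with open(os.path.join(path, "name")) as f:
                name = f.read().strip()
            with open(os.path.join(path, "energy_uj")) as f:
                energy_uj = int(f.read())
        except OSError:
            continue  # counters may be unreadable without root privileges
        zones[os.path.basename(path)] = (name, energy_uj)
    return zones


if __name__ == "__main__":
    for zone, (name, energy_uj) in read_rapl_zones().items():
        print(zone, name, energy_uj)
```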

So you agree that DRAM power is separate from package power? Which means that some subdomains are included and some are not.

I've seen research papers suggesting that RAPL power could be calibrated against lab measurements so that RAPL readings could give a decent estimate of the total power.

I was a bit puzzled to see that idle power on both packages and DRAM always fluctuates in sync. And I could never get DRAM power higher than that of the related CPU, even though I got to within 5 W of it while running an asymmetric STREAM load (numactl -N 0 -m 1 stream).
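Power figures like these are derived from two readings of a cumulative energy counter. A minimal sketch of that conversion, assuming microjoule counters as exposed by the Linux powercap files `energy_uj` and `max_energy_range_uj`; the function name is hypothetical, and it assumes the counter wrapped at most once between samples.

```python
def average_power_watts(e0_uj, e1_uj, dt_s, max_range_uj):
    """Average power between two energy-counter samples.

    e0_uj, e1_uj : counter values in microjoules (earlier, later)
    dt_s         : elapsed time between the samples in seconds
    max_range_uj : counter range, used to undo a single wraparound
    """
    delta_uj = e1_uj - e0_uj
    if delta_uj < 0:
        delta_uj += max_range_uj  # assumes exactly one wrap occurred
    return delta_uj / 1e6 / dt_s


# Made-up samples: 5 J consumed in 1 s.
print(average_power_watts(1_000_000, 6_000_000, 1.0, 10_000_000))  # 5.0
# Counter wrapped between samples: 2 J consumed in 2 s.
print(average_power_watts(9_000_000, 1_000_000, 2.0, 10_000_000))  # 1.0
```

Sampling the package and DRAM zones at the same instants makes it easier to tell whether the two really move in sync or only appear to because of the sampling interval.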

My interest in RAPL power was prompted by trying to understand and compare two modes offered by modern servers: autonomous and cooperative. Since frequencies cannot be observed if no CPU driver is loaded, checking the power is the only thing I could think of. Under RHEL7, I see these modes behaving practically identically. Either that, or the cooperative mode is not really working and the OS hints can't get through. I've seen somewhere that OS support requires kernel v4.12 or higher, but I'm not sure whether that is true.

McCalpinJohn
Honored Contributor III

It is not clear to me whether the package power includes all of the power consumed on all five of the voltage planes connected to the processor.  This information might be available somewhere, but I have not seen it.  Recent processors (starting with Haswell EP) have on-chip voltage regulators, and in most cases it is clear that these provide the current measurements used by RAPL.  For example, the majority of the power is associated with VCCIN, which is down-regulated in-package to provide the voltages required by the individual core p-states.

According to "Second Generation Intel Xeon Scalable Processors Datasheet, Volume 1: Electrical" (document 338845-001US, April 2019), one of the five voltage planes provided to the processor (called VCCD) is a 1.20V supply used for the DDR4 signaling interface.   This supply should not need in-package voltage regulation, since it should match the signaling level provided by the power supply to the DIMMs.   So it is not obvious to me that the processor has an independent ability to measure the current entering the processor on these supply pins -- it might, but it also might not have the voltage regulation infrastructure that makes this convenient.

Table 2-14 shows that the peak current allowable on this interface is 8 Amps, with a "Thermal Design Current" (maximum long-term average) of 6 Amps.   This corresponds to a long-term maximum of 7.2 Watts, which is a fairly small fraction of the overall package power consumption.  It is conceivable that the power consumption within the package on this voltage plane might be estimated based on activity & state (rather than measured), without risking a significant overall error in the package power consumption.
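The arithmetic behind the 7.2 Watt figure, written out as a small sketch. The voltage and current values are the ones quoted above from the datasheet; the variable and function names are only illustrative.

```python
VCCD_VOLTS = 1.20  # DDR4 signaling supply voltage
TDC_AMPS = 6.0     # Thermal Design Current (maximum long-term average)
PEAK_AMPS = 8.0    # peak allowable current


def watts(volts, amps):
    """Electrical power P = V * I."""
    return volts * amps


print(round(watts(VCCD_VOLTS, TDC_AMPS), 2))   # 7.2 (long-term maximum)
print(round(watts(VCCD_VOLTS, PEAK_AMPS), 2))  # 9.6 (at peak current)
```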

The power consumption of the DRAMs in the DIMMs on the motherboard must come from current sensors in the voltage regulators on the motherboard.  

Some of the ugly details are discussed in a nice paper at https://dl.acm.org/citation.cfm?id=2989088
