RAPL DRAM energy units for Xeon Phi and Broadwell Xeon D

CImes · 05-12-2018

Hi - I was referred here from the Intel Communities forum.

According to the Software Developer's Manual, Volume 4, some processors use a non-standard 15.3 uJ energy unit for DRAM (when reading the MSR_DRAM_ENERGY_STATUS register) instead of using what's specified in the MSR_RAPL_POWER_UNIT register.
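
For reference, here is a minimal sketch of the two conversions in question. It assumes the SDM layout in which the Energy Status Units (ESU) field sits in bits 12:8 of MSR_RAPL_POWER_UNIT and the unit is 1/2^ESU joules, and that the energy counter is the low 32 bits of MSR_DRAM_ENERGY_STATUS. The function names and the sample register value are made up for illustration, not taken from any particular tool.

```python
# Fixed DRAM energy unit quoted in the SDM (close to 2**-16 J, about 15.26 uJ).
DRAM_FIXED_UNIT_J = 15.3e-6


def rapl_energy_unit_joules(power_unit_msr: int) -> float:
    """Energy unit advertised by MSR_RAPL_POWER_UNIT: 1 / 2**ESU joules."""
    esu = (power_unit_msr >> 8) & 0x1F  # Energy Status Units, bits 12:8
    return 1.0 / (1 << esu)


def dram_energy_joules(raw_count: int, power_unit_msr: int, fixed_unit: bool) -> float:
    """Convert a raw MSR_DRAM_ENERGY_STATUS value to joules with either unit."""
    unit = DRAM_FIXED_UNIT_J if fixed_unit else rapl_energy_unit_joules(power_unit_msr)
    return (raw_count & 0xFFFFFFFF) * unit  # counter is assumed to be the low 32 bits


if __name__ == "__main__":
    sample_power_unit = 0x000A0E03  # hypothetical register value with ESU = 14
    raw = 1_000_000                 # hypothetical counter reading
    print(dram_energy_joules(raw, sample_power_unit, fixed_unit=False))
    print(dram_energy_joules(raw, sample_power_unit, fixed_unit=True))
```

With an ESU of 14 the two interpretations differ by a factor of about four, which is why picking the wrong one per model matters.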

As of May 2018, the SDM does NOT appear to specify 15.3 uJ for Xeon Phi CPUs (Family_Models 06_57H and 06_85H in Section 2.17), but the community consensus seems to be that these two processors do in fact use 15.3 uJ as their energy unit (see the Linux kernel, PCM, PAPI).

Conversely, the Xeon D (Family_Model 06_56H in Sections 2.14 - 2.15.1), through inheritance from Table 2-35, claims to use the 15.3 uJ DRAM energy unit, but the community seems split on what it really is (e.g., the Linux kernel does NOT use the 15.3 uJ unit configuration).

Does anybody actually know what ground truth is for these processors? Also, if the SDM is incorrect, can we expect it to be corrected? It is difficult to keep software configurations correct when the documentation is wrong.

Much appreciated,
-Connor

McCalpinJohn · 05-13-2018

My measurements suggest very strongly that the Xeon Phi x200 documentation is wrong in this regard, and that the processor does use the alternate 15.3 uJ DRAM energy unit that other processors (starting with Xeon E5 v3) use. It is not hard to put bounds on these values, either from documentation (e.g., Micron's DRAM power calculators) or by direct comparison (e.g., 6 channels of DDR4 on a KNL vs 6 channels of DDR4 on a Skylake Xeon), and these show that using the standard RAPL energy unit for DRAM energy on KNL gives impossible results, while assuming 15.3 uJ gives very reasonable results.
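
The kind of bounds check described above can be sketched roughly as follows. Every number here is a placeholder rather than a measurement: the counter readings, the 2^-14 J "standard" unit, and the 5 to 40 W plausibility window are all hypothetical and would need to come from the actual MSR values and from a DRAM power estimate for the system in question. The helper names are likewise invented for this example.

```python
def counter_delta(before: int, after: int) -> int:
    """Difference of two 32-bit energy counter readings, allowing one wraparound."""
    return (after - before) & 0xFFFFFFFF


def implied_power_watts(delta_counts: int, seconds: float, unit_joules: float) -> float:
    """Average power implied by a counter delta under a given energy unit."""
    return delta_counts * unit_joules / seconds


def plausible_units(delta_counts, seconds, candidates, min_watts, max_watts):
    """Names of candidate units whose implied power falls inside the bounds."""
    return [
        name
        for name, unit in candidates.items()
        if min_watts <= implied_power_watts(delta_counts, seconds, unit) <= max_watts
    ]


if __name__ == "__main__":
    candidates = {
        "rapl_power_unit": 2 ** -14,  # hypothetical value decoded from MSR_RAPL_POWER_UNIT
        "fixed_15.3uJ": 15.3e-6,      # alternate DRAM unit from the SDM
    }
    delta = counter_delta(4_000_000, 5_500_000)  # hypothetical readings taken 1 s apart
    # Hypothetical plausibility window for the installed DIMMs under load.
    print(plausible_units(delta, 1.0, candidates, min_watts=5.0, max_watts=40.0))
```

If only one candidate survives across a range of workloads (idle through bandwidth-bound), that is reasonable evidence for which unit the hardware actually uses.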

I am a bit surprised that Intel has not corrected this documentation error in the SWDM or the Xeon Phi Datasheet or the Xeon Phi Specification Update or the Xeon Phi External Design Spec, or anywhere else.

Intel seems to bypass standard documentation in getting information to the Linux kernel people (or else Intel employees just release the code containing the right answers). For KNL, for example, the Linux kernel is aware that the APERF and MPERF registers increment once every 1024 clock cycles. This is a bizarre configuration that does not appear to be used in any other Intel processor, but the kernel knows about it. I spent quite a bit of time looking for any evidence of this "feature" in all the usual documentation sources (and lots of unusual ones) but found no hints. Frustrating and time-wasting....
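
To illustrate why the 1024-cycle increment is easy to miss, here is a small sketch. It assumes the usual interpretation in which the APERF/MPERF ratio times a reference frequency gives the average unhalted frequency, and that both counters share the same 1024x granularity on KNL; the function names and sample numbers are hypothetical.

```python
KNL_COUNT_SCALE = 1024  # per the observation above: one increment per 1024 clock cycles


def effective_frequency_hz(aperf_delta: int, mperf_delta: int, reference_hz: float) -> float:
    """Average unhalted frequency from the APERF/MPERF ratio.

    A common scale factor on both counters cancels out of this ratio.
    """
    return reference_hz * aperf_delta / mperf_delta


def actual_cycles(aperf_delta: int, scale: int = 1) -> int:
    """Absolute cycle count; here the scale factor does matter."""
    return aperf_delta * scale


if __name__ == "__main__":
    aperf, mperf = 1200, 1000  # hypothetical raw counter deltas
    print(effective_frequency_hz(aperf, mperf, reference_hz=1.0e9))
    print(actual_cycles(aperf, KNL_COUNT_SCALE))
```

Code that only uses the ratio keeps working unchanged, while anything that treats the raw deltas as cycle counts would be off by a factor of 1024.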

CImes · 05-13-2018

Thanks for your reply, John. I've seen your other related posts here and see you've done a lot of work on some of these systems. I don't mind the kernel getting fixes before the documentation, but eventually the docs need to be updated for the rest of us. The kernel is not always perfect, even when Intel employees write the relevant parts. Nor can we simply reference the kernel for all of our needs, particularly when writing non-GPL software. I abhor changing software based on our own estimations that conflict with the documentation - the evidence and the "why" tend to be lost over time, even with good code comments, making maintenance problematic. Additionally, we don't always have access to some of the hardware we want to support.

Obviously docs are not always correct and, as a seasoned developer, I sympathize with the effort required to maintain them. However, this topic has come up on a handful of occasions. When a problem such as this is identified, we should be able to get feedback from the source (Intel) and updates to future versions of the docs. I'd really appreciate a good faith effort from Intel to comment on this for us.

Cheers.