Software Tuning, Performance Optimization & Platform Monitoring
Discussion regarding monitoring and software tuning methodologies, Performance Monitoring Unit (PMU) of Intel microprocessors, and platform updating.

Power capping in Intel processor

techsun
Beginner
1,483 Views

Hi,

I am using the RAPL MSRs to apply a power cap on a Cascade Lake processor. As far as I know, power capping internally uses Dynamic Voltage and Frequency Scaling (DVFS) and Dynamic Duty Cycle Modulation (DDCM). To see how the core frequency changes with different power cap values, I have been reading "/sys/devices/system/cpu/cpu*/cpufreq/scaling_cur_freq", but it shows no change: it always reports the maximum frequency. Is there another way to check the actual CPU frequency for a given power cap?

 

4 Replies
McCalpinJohn
Honored Contributor III
1,450 Views

The recommended way to check the CPU frequency is to measure it over an interval.

techsun
Beginner
1,421 Views

Hi John,

Thank you for your reply.

Could you please explain in more detail how to check CPU frequency over an interval? 

McCalpinJohn
Honored Contributor III
1,411 Views

Short answer:

Install and run a tool like "turbostat".   I use "turbostat --Summary" in another window to get a summary of the average frequency while the workload is running.

 

Long answer:

The core frequencies on recent Intel processors are dynamically adjusted on a relatively short time scale (1 millisecond) in response to many conditions -- including instruction set, workload, temperature, power consumption, and the estimated trade-off between power consumption and performance.  These adjustments are done by the "Power Control Unit" (PCU) on the processor.   Software can adjust some of the parameters, but cannot override these controls entirely, so "frequency" should always be thought of as a time-varying quantity associated with the combination of:

  • the set of hardware & software configuration values,
  • the workload of interest, and
  • the "environment" (specifically the effectiveness of the cooling system).

So once you have picked a workload of interest, you need to pick a time interval over which to perform the measurements. The PCU will try very hard to reduce power consumption while the processor is idle, so you will generally want to measure over either (1) intervals in the middle of long-running executions, or (2) complete executions that are long enough that the "start-up" values don't significantly reduce the average. In either case I recommend intervals of tens of seconds if the workload runs long enough to allow it. (This is not necessary in all cases, but some system configurations require several seconds to ramp up from the idle state to full performance, and on those systems average frequencies measured over short intervals will be misleadingly low.)
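To see why short runs can give misleadingly low averages, here is a small Python sketch of the arithmetic. It models the ramp-up crudely as a single low-frequency phase followed by a steady phase. The function name `average_ghz` and all of the numbers (3 seconds at 1.0 GHz, then 3.0 GHz) are made-up assumptions for illustration, not measurements from any system.

```python
def average_ghz(ramp_s, ramp_ghz, steady_ghz, total_s):
    """Time-weighted mean frequency of a run that spends its first
    ramp_s seconds at ramp_ghz and the remainder at steady_ghz."""
    if total_s <= 0 or not 0 <= ramp_s <= total_s:
        raise ValueError("need 0 <= ramp_s <= total_s and total_s > 0")
    steady_s = total_s - ramp_s
    return (ramp_s * ramp_ghz + steady_s * steady_ghz) / total_s


# Hypothetical system: 3 s at 1.0 GHz after idle, then 3.0 GHz.
for total_s in (5, 30, 120):
    mean = average_ghz(ramp_s=3, ramp_ghz=1.0, steady_ghz=3.0, total_s=total_s)
    print(f"{total_s:4d} s run: average {mean:.2f} GHz")
```

Under these assumed numbers the same ramp-up pulls a short run's average far below the steady value, while a run of tens of seconds is barely affected.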

 

There are three pieces of information needed to understand the average frequency:

  1. The elapsed wall-clock time for the interval
    • This can be obtained from the Time Stamp Counter (TSC), which increments at the *nominal* processor frequency.
    • The TSC can be read in user mode using the RDTSCP instruction.
  2. The fraction of the wall-clock time that the core was "active" during the interval.
    • The "fixed-function performance counter 2" (CPU_CLK_Unhalted.REF) increments at the same rate as the TSC, but only while the core is active.
    • The change in CPU_CLK_Unhalted.REF divided by the change in TSC is therefore the fraction of time that the core was active.
  3. The number of active core clocks during the interval
    • The "fixed-function performance counter 1" (CPU_CLK_Unhalted.CORE) increments once for each core clock while the core is active.
    • The change in CPU_CLK_Unhalted.CORE divided by the change in CPU_CLK_Unhalted.REF is therefore the ratio of the core frequency (while active) to the reference (nominal) frequency.
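The three quantities above combine with simple arithmetic. The following Python sketch shows one way to do it. The function name `frequency_summary`, its parameter names, and the sample deltas are illustrative assumptions, not values from a real measurement, and the nominal frequency has to be supplied by the caller.

```python
def frequency_summary(d_tsc, d_ref, d_core, nominal_hz):
    """Turn counter deltas from one interval into time and frequency figures.

    d_tsc:      change in the TSC over the interval
    d_ref:      change in CPU_CLK_Unhalted.REF (fixed-function counter 2)
    d_core:     change in CPU_CLK_Unhalted.CORE (fixed-function counter 1)
    nominal_hz: nominal (TSC) frequency in Hz, supplied by the caller
    """
    if d_tsc <= 0 or d_ref <= 0 or nominal_hz <= 0:
        raise ValueError("need a non-empty interval with some active time")
    elapsed_s = d_tsc / nominal_hz        # 1. elapsed wall-clock time
    active_fraction = d_ref / d_tsc       # 2. share of time the core was active
    freq_ratio = d_core / d_ref           # 3. core clock relative to nominal
    active_hz = nominal_hz * freq_ratio   # average frequency while active
    return {
        "elapsed_s": elapsed_s,
        "active_fraction": active_fraction,
        "freq_ratio": freq_ratio,
        "active_hz": active_hz,
    }


# Made-up deltas for a hypothetical 2.4 GHz part: 10 s elapsed, half of it active.
summary = frequency_summary(
    d_tsc=24_000_000_000,
    d_ref=12_000_000_000,
    d_core=15_000_000_000,
    nominal_hz=2_400_000_000,
)
for name, value in summary.items():
    print(f"{name}: {value:.4g}")
```

Note that `active_hz` is the average frequency only while the core was running; idle time does not enter into it, which is why the active fraction is reported separately.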

I prefer to get these values directly from the hardware, using functions from https://github.com/jdmccalpin/low-overhead-timers. (This approach requires that the workload be "pinned" to execute on a single logical processor so that the difference in cycle counts will make sense.)
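For readers who want to experiment without compiling anything, a rough alternative is to sample the same counters through the Linux "msr" driver. This is a different technique from the user-mode reads in the repository above, with much higher overhead per read. The Python sketch below assumes that the msr kernel module is loaded, that the script runs as root, that the fixed-function counters are already enabled, and that the fixed counters are 48 bits wide. The helper names `read_msr`, `wrapped_delta`, and `measure_interval` are hypothetical.

```python
import os
import struct
import time

# Architectural MSR addresses (Intel SDM).
MSR_TSC = 0x10          # IA32_TIME_STAMP_COUNTER
MSR_FIXED_CTR1 = 0x30A  # IA32_FIXED_CTR1: unhalted core cycles
MSR_FIXED_CTR2 = 0x30B  # IA32_FIXED_CTR2: unhalted reference cycles

# Assumed counter widths in bits: 64 for the TSC, 48 for the fixed counters.
WIDTHS = {MSR_TSC: 64, MSR_FIXED_CTR1: 48, MSR_FIXED_CTR2: 48}
NAMES = {MSR_TSC: "d_tsc", MSR_FIXED_CTR1: "d_core", MSR_FIXED_CTR2: "d_ref"}


def read_msr(cpu, reg, path_template="/dev/cpu/{cpu}/msr"):
    """Read one 64-bit MSR; the file offset selects the register."""
    fd = os.open(path_template.format(cpu=cpu), os.O_RDONLY)
    try:
        raw = os.pread(fd, 8, reg)
    finally:
        os.close(fd)
    if len(raw) != 8:
        raise OSError(f"short read from MSR {reg:#x} on cpu {cpu}")
    return struct.unpack("<Q", raw)[0]


def wrapped_delta(before, after, width_bits):
    """Difference of two counter readings, allowing for one wrap-around."""
    return (after - before) & ((1 << width_bits) - 1)


def measure_interval(cpu, seconds):
    """Sample the three counters on one logical CPU before and after a sleep."""
    before = {reg: read_msr(cpu, reg) for reg in WIDTHS}
    time.sleep(seconds)
    after = {reg: read_msr(cpu, reg) for reg in WIDTHS}
    return {
        NAMES[reg]: wrapped_delta(before[reg], after[reg], WIDTHS[reg])
        for reg in WIDTHS
    }
```

With the workload pinned to logical CPU 3 (for example with "taskset -c 3 ./workload", where "./workload" is a placeholder), calling `measure_interval(3, 20)` from another shell would return the three deltas for a 20-second window. The three registers are not read at the same instant, so this only makes sense for intervals long enough that the read overhead is negligible.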

It is also possible to get these values from "perf stat", though it is more difficult to understand precisely what is being reported.

 

techsun
Beginner
1,369 Views

Thanks, John.

It is really helpful.
