Jordi_V_
Beginner

If the frequency is set to P_STATE 1, why is AVX-512 not running at its base frequency?

Hello,

I'm testing some SIMD instructions on a Skylake Gold 6148 @ 2.40GHz (20 cores) in a node with two sockets (40 cores). I'm loading all cores with a program that executes AVX-512 ADDs, MULs and FMADDs, and I'm also setting each core's frequency to P_STATE 1 (the base frequency, in this case 2.40GHz).

My question is: if the AVX-512 base frequency is 1.6GHz and I'm making intensive use of AVX-512 instructions, why do I measure frequencies around 2.2GHz? I expected to measure around 1.6GHz, because I set P_STATE 1, which means NO TURBO, and 2.2GHz could be considered an AVX-512 turbo frequency.

Thank you,

Jordi.

3 Replies
McCalpinJohn
Black Belt

(1) You need to make sure that HWP is not enabled (MSR 0x770) -- once it is enabled, the legacy interfaces to control frequency are ignored.  HWP is described in Chapter 14 of Volume 3 of the Intel Architectures SW Developer's Manual.
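
As a rough illustration of check (1), the sketch below reads MSR 0x770 (IA32_PM_ENABLE) through the Linux msr driver and tests bit 0, the HWP enable flag. It assumes a Linux system with the msr kernel module loaded and root privileges; the helper names read_msr and hwp_enabled are made up for this example.

```python
import os
import struct

IA32_PM_ENABLE = 0x770  # MSR named in the reply above; bit 0 is the HWP enable flag


def read_msr(cpu: int, address: int) -> int:
    """Read one 64-bit MSR through /dev/cpu/<cpu>/msr.

    Assumes Linux with the msr module loaded (modprobe msr) and root access.
    """
    fd = os.open(f"/dev/cpu/{cpu}/msr", os.O_RDONLY)
    try:
        # The msr device uses the file offset as the MSR address.
        raw = os.pread(fd, 8, address)
    finally:
        os.close(fd)
    return struct.unpack("<Q", raw)[0]


def hwp_enabled(pm_enable_value: int) -> bool:
    """Return True if bit 0 (HWP enable) is set in an IA32_PM_ENABLE value."""
    return bool(pm_enable_value & 0x1)


if __name__ == "__main__":
    try:
        value = read_msr(0, IA32_PM_ENABLE)
    except OSError as exc:
        print(f"could not read MSR 0x{IA32_PM_ENABLE:x} on cpu 0: {exc}")
    else:
        print("HWP enabled" if hwp_enabled(value) else "HWP not enabled")
```

The decoding is kept separate from the device access so it can be applied to values obtained some other way, for example from the rdmsr tool in msr-tools.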

(2) Assuming that HWP is not enabled, you need to check MSR_TURBO_ACTIVATION_RATIO (0x64c).  This was introduced in the Ivy Bridge core and tells the processor to assume that a request for this frequency or higher should be interpreted as a request for maximum Turbo frequency.  I have frequently seen this misconfigured by the system BIOS.
 

Jordi_V_
Beginner

Thank you for the answer John,

I forgot some details. I'm using the 'acpi-cpufreq' driver. As far as I know, it works like the 'intel_pstate' driver in passive mode, where HWP is off. Maybe my best option is to study this driver and how it manages frequencies and scaling below the set frequency (turbo is turned off).
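
For reference, one way to confirm which driver is active and whether turbo is disabled at the OS level is to read the cpufreq files in sysfs. This is only a sketch: it assumes a Linux system, the helper name read_sysfs is made up, and some of the files exist only for one driver or the other, so a missing file is reported instead of being treated as an error.

```python
from pathlib import Path
from typing import Optional

CPU_SYSFS = Path("/sys/devices/system/cpu")


def read_sysfs(path: Path) -> Optional[str]:
    """Return the stripped contents of a sysfs file, or None if it cannot be read."""
    try:
        return path.read_text().strip()
    except OSError:
        return None


if __name__ == "__main__":
    settings = {
        "scaling_driver (cpu0)": CPU_SYSFS / "cpu0/cpufreq/scaling_driver",
        "scaling_governor (cpu0)": CPU_SYSFS / "cpu0/cpufreq/scaling_governor",
        # Present with acpi-cpufreq: 0 means boost (turbo) is disabled.
        "acpi-cpufreq boost": CPU_SYSFS / "cpufreq/boost",
        # Present with intel_pstate: status is active, passive or off.
        "intel_pstate status": CPU_SYSFS / "intel_pstate/status",
        # Present with intel_pstate: 1 means turbo is disabled.
        "intel_pstate no_turbo": CPU_SYSFS / "intel_pstate/no_turbo",
    }
    for label, path in settings.items():
        value = read_sysfs(path)
        print(f"{label}: {value if value is not None else 'not available'}")
```

These files only show what the OS requested; they do not show how the hardware interprets the request.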

Thank you. 

McCalpinJohn
Black Belt

My recollection is that the "acpi-cpufreq" driver does not allow requests for frequencies between P1 and P0, but it should allow requests for P1.  This is where the MSR_TURBO_ACTIVATION_RATIO (0x64c) comes into play.  I have seen multiple systems that programmed this incorrectly.

Example:

  • Nominal frequency (P1) is 2.6 GHz, so the max non-Turbo ratio (bits 15:8 of MSR_PLATFORM_INFO (MSR 0xCE)) is 0x1a (26 decimal).
  • MSR_TURBO_ACTIVATION_RATIO (MSR 0x64c) bits 7:0 is programmed to 0x19 (25 decimal).
  • Any request for a ratio higher than the value in MSR_TURBO_ACTIVATION_RATIO bits 7:0 is interpreted as a request for maximum Turbo frequency, so the request for P1 is incorrectly interpreted as a request for P0.
    • Fortunately, the MSR_TURBO_ACTIVATION_RATIO register was not locked on any of the platforms where it was incorrectly configured, so I could simply re-write it with the correct value.
    • The correct value is the same as the max non-Turbo ratio from bits 15:8 of MSR_PLATFORM_INFO

Other platforms set MSR_TURBO_ACTIVATION_RATIO bits 7:0 to 0xff, so this mechanism is never triggered.
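
The comparison in the example above can be sketched in a few lines. The bit fields are the ones given in the example (bits 15:8 of MSR_PLATFORM_INFO and bits 7:0 of MSR_TURBO_ACTIVATION_RATIO); the function names are made up for illustration, and the raw values are constructed by hand here instead of being read from hardware.

```python
MSR_PLATFORM_INFO = 0xCE
MSR_TURBO_ACTIVATION_RATIO = 0x64C


def max_non_turbo_ratio(platform_info: int) -> int:
    """Bits 15:8 of MSR_PLATFORM_INFO: the max non-Turbo (P1) ratio."""
    return (platform_info >> 8) & 0xFF


def turbo_activation_ratio(tar: int) -> int:
    """Bits 7:0 of MSR_TURBO_ACTIVATION_RATIO."""
    return tar & 0xFF


def p1_request_becomes_turbo(platform_info: int, tar: int) -> bool:
    """True if a request for P1 exceeds the activation ratio.

    Follows the rule described above: a request for a ratio higher than
    bits 7:0 of MSR_TURBO_ACTIVATION_RATIO is treated as a request for
    maximum Turbo.
    """
    return max_non_turbo_ratio(platform_info) > turbo_activation_ratio(tar)


if __name__ == "__main__":
    platform_info = 0x1A << 8  # P1 ratio 0x1a (26), as in the example above
    cases = {
        "misconfigured (0x19)": 0x19,
        "matches P1 (0x1a)": 0x1A,
        "mechanism unused (0xff)": 0xFF,
    }
    for label, tar in cases.items():
        promoted = p1_request_becomes_turbo(platform_info, tar)
        print(f"{label}: P1 request treated as Turbo = {promoted}")
```

On a real system the two raw values would come from reading MSR 0xCE and MSR 0x64c on each socket, for example through /dev/cpu/N/msr with root privileges.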

The MSR_TURBO_ACTIVATION_RATIO MSR is not included in the Xeon Scalable processor (Skylake Xeon) section of Volume 4 of the Intel SWDM, but the register is readable on my systems.  I have not checked to see if it actually does anything.  I have a small SKX test cluster with HWP enabled, while the larger SKX cluster has the frequencies pinned to the maximum values by the BIOS (so no frequency controls are supported).
