Hello,
I'm testing some SIMD instructions on a Xeon Gold 6148 (Skylake) @ 2.40GHz (20 cores) in a node with two sockets (40 cores total). I'm loading every core with a program that executes AVX-512 ADDs, MULs and FMADDs, and I have also set each core's frequency to P_STATE 1 (the base frequency, 2.40GHz in this case).
My question is: if the AVX-512 base frequency is 1.6GHz and I'm making intensive use of AVX-512 instructions, why do I measure frequencies around 2.2GHz? I expected to measure around 1.6GHz, because I set P_STATE 1, which means NO TURBO, and 2.2GHz looks like an AVX-512 turbo frequency.
Thank you,
Jordi.
- Tags:
- Intel® Advanced Vector Extensions (Intel® AVX)
- Intel® Streaming SIMD Extensions
- Parallel Computing
(1) You need to make sure that HWP is not enabled (MSR 0x770) -- once it is enabled, the legacy interfaces to control frequency are ignored. HWP is described in Chapter 14 of Volume 3 of the Intel Architectures SW Developer's Manual.
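As a rough illustration of check (1), the sketch below reads MSR 0x770 (IA32_PM_ENABLE) through the Linux `msr` driver and tests bit 0, which the SDM documents as the HWP enable bit. It assumes a Linux system with the `msr` kernel module loaded and root privileges; the helper names `read_msr` and `hwp_enabled` are made up for this example. On a processor without HWP support the read is expected to fail with an I/O error.

```python
import os
import struct

IA32_PM_ENABLE = 0x770  # bit 0 = HWP enable, per the Intel SDM


def read_msr(cpu: int, reg: int) -> int:
    """Read a 64-bit MSR via /dev/cpu/<cpu>/msr (needs root and `modprobe msr`)."""
    fd = os.open(f"/dev/cpu/{cpu}/msr", os.O_RDONLY)
    try:
        # The msr driver uses the file offset as the register number.
        return struct.unpack("<Q", os.pread(fd, 8, reg))[0]
    finally:
        os.close(fd)


def hwp_enabled(pm_enable_value: int) -> bool:
    """Return True if bit 0 of an IA32_PM_ENABLE value is set."""
    return bool(pm_enable_value & 1)


if __name__ == "__main__":
    # Assumption: checking logical CPU 0 is enough for a quick look;
    # on a two-socket node, also check one CPU on the second socket.
    value = read_msr(0, IA32_PM_ENABLE)
    print(f"IA32_PM_ENABLE = {value:#x}, HWP enabled: {hwp_enabled(value)}")
```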
(2) Assuming that HWP is not enabled, you need to check MSR_TURBO_ACTIVATION_RATIO (0x64c). This was introduced in the Ivy Bridge core and tells the processor to assume that a request for this frequency or higher should be interpreted as a request for maximum Turbo frequency. I have frequently seen this misconfigured by the system BIOS.
Thank you for the answer John,
I forgot some details. I'm using the 'acpi-cpufreq' driver. As far as I know, it works like the 'intel_pstate' driver in passive mode, where HWP is off. Maybe my best option is to study this driver and how it manages its frequencies and scaling under the set frequency (with turbo turned off).
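For anyone wanting to confirm which driver is actually in charge, a minimal sketch follows. It assumes the standard Linux cpufreq sysfs layout (`cpuN/cpufreq/scaling_driver`, `cpuN/cpufreq/scaling_governor`, and the global `cpufreq/boost` switch that acpi-cpufreq exposes); the function names are invented for this example, and files that do not exist on a given system are reported as `None`.

```python
from pathlib import Path


def read_sysfs(path):
    """Return the stripped contents of a sysfs file, or None if unreadable."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        return None


def cpufreq_summary(root="/sys/devices/system/cpu", cpu=0):
    """Collect the driver, governor and boost setting for one logical CPU."""
    base = Path(root)
    policy = base / f"cpu{cpu}" / "cpufreq"
    return {
        "driver": read_sysfs(policy / "scaling_driver"),
        "governor": read_sysfs(policy / "scaling_governor"),
        # Global boost switch; absent with some drivers.
        "boost": read_sysfs(base / "cpufreq" / "boost"),
    }


if __name__ == "__main__":
    for key, value in cpufreq_summary().items():
        print(f"{key}: {value}")
```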
Thank you.
My recollection is that the "acpi-cpufreq" driver does not allow requests for frequencies between P1 and P0, but it should allow requests for P1. This is where the MSR_TURBO_ACTIVATION_RATIO (0x64c) comes into play. I have seen multiple systems that programmed this incorrectly.
Example:
- Nominal frequency (P1) is 2.6 GHz, so the max non-Turbo ratio (bits 15:8 of MSR_PLATFORM_INFO (MSR 0xCE)) is 0x1a (26 decimal).
- MSR_TURBO_ACTIVATION_RATIO (MSR 0x64c) bits 7:0 is programmed to 0x19 (25 decimal).
- Any request for a ratio higher than the value in MSR_TURBO_ACTIVATION_RATIO bits 7:0 is interpreted as a request for maximum Turbo frequency, so the request for P1 is incorrectly interpreted as a request for P0.
- Fortunately, the MSR_TURBO_ACTIVATION_RATIO register was not locked on any of the platforms where it was incorrectly configured, so I could simply re-write it with the correct value.
- The correct value is the same as the max non-Turbo ratio from bits 15:8 of MSR_PLATFORM_INFO
Other platforms set MSR_TURBO_ACTIVATION_RATIO bits 7:0 to 0xff, so this mechanism is never triggered.
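The comparison described above can be sketched as follows. This is only an illustration under stated assumptions: a Linux system with the `msr` module loaded and root access, the field positions given in the example (bits 15:8 of MSR 0xCE, bits 7:0 of MSR 0x64c), and bit 31 of MSR 0x64c as the lock bit as listed in the SDM for client parts. The helper names are hypothetical, and whether the register has any effect on a given processor is not something this code can tell you.

```python
import os
import struct

MSR_PLATFORM_INFO = 0xCE
MSR_TURBO_ACTIVATION_RATIO = 0x64C


def read_msr(cpu: int, reg: int) -> int:
    """Read a 64-bit MSR via /dev/cpu/<cpu>/msr (needs root and `modprobe msr`)."""
    fd = os.open(f"/dev/cpu/{cpu}/msr", os.O_RDONLY)
    try:
        return struct.unpack("<Q", os.pread(fd, 8, reg))[0]
    finally:
        os.close(fd)


def max_non_turbo_ratio(platform_info: int) -> int:
    """Bits 15:8 of MSR_PLATFORM_INFO: the P1 ratio in units of 100 MHz."""
    return (platform_info >> 8) & 0xFF


def turbo_activation_ratio(tar: int) -> int:
    """Bits 7:0 of MSR_TURBO_ACTIVATION_RATIO."""
    return tar & 0xFF


def tar_locked(tar: int) -> bool:
    """Bit 31 of MSR_TURBO_ACTIVATION_RATIO (assumed lock bit)."""
    return bool((tar >> 31) & 1)


def p1_treated_as_turbo(platform_info: int, tar: int) -> bool:
    """True if a request for P1 exceeds the activation ratio and so
    would be interpreted as a request for maximum Turbo."""
    return max_non_turbo_ratio(platform_info) > turbo_activation_ratio(tar)


if __name__ == "__main__":
    info = read_msr(0, MSR_PLATFORM_INFO)
    tar = read_msr(0, MSR_TURBO_ACTIVATION_RATIO)
    print(f"max non-Turbo ratio: {max_non_turbo_ratio(info)}")
    print(f"turbo activation ratio: {turbo_activation_ratio(tar)}")
    print(f"locked: {tar_locked(tar)}")
    print(f"P1 treated as Turbo request: {p1_treated_as_turbo(info, tar)}")
```

With the values from the example (0x1a in bits 15:8 of MSR_PLATFORM_INFO and 0x19 in MSR_TURBO_ACTIVATION_RATIO), `p1_treated_as_turbo` reports the misconfiguration; with 0x1a or 0xff in the activation ratio it does not.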
The MSR_TURBO_ACTIVATION_RATIO MSR is not included in the Xeon Scalable processor (Skylake Xeon) section of Volume 4 of the Intel SWDM, but the register is readable on my systems. I have not checked to see if it actually does anything. I have a small SKX test cluster with HWP enabled, while the larger SKX cluster has the frequencies pinned to the maximum values by the BIOS (so no frequency controls are supported).
