Software Tuning, Performance Optimization & Platform Monitoring
Discussion regarding monitoring and software tuning methodologies, Performance Monitoring Unit (PMU) of Intel microprocessors, and platform updating.

Disabling Opportunistic Processor Performance for all Cores

Samuel_M_1
New Contributor I

In the Intel® 64 and IA-32 Architectures Software Developer’s Manual Volume 3B page 14-4, it says: 

"System software can temporarily disengage opportunistic processor performance operation by setting bit 32 of the IA32_PERF_CTL MSR (0199H), using a read-modify-write sequence on the MSR. The opportunistic processor performance operation can be re-engaged by clearing bit 32 in IA32_PERF_CTL MSR, using a read-modify-write sequence. The DISENGAGE bit in IA32_PERF_CTL is not reflected in bit 32 of the IA32_PERF_STATUS MSR (0198H), and it is not shared between logical processors in a physical package."

It is the last sentence that concerns me. How do I disable opportunistic processor performance for all processors at once?
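
For reference, the read-modify-write step the manual describes can be sketched in Python as a pure bit operation on the 64-bit register value. The function and constant names are illustrative and do not come from the SDM. Actually reading and writing the MSR needs a privileged, OS-specific interface, which this sketch leaves out.

```python
# Bit 32 of IA32_PERF_CTL (MSR 0x199) is the disengage bit quoted above.
IA32_PERF_CTL = 0x199
DISENGAGE_BIT = 1 << 32


def with_turbo_disengaged(perf_ctl: int, disengage: bool) -> int:
    """Return perf_ctl with bit 32 set or cleared; all other bits are kept."""
    if disengage:
        return perf_ctl | DISENGAGE_BIT
    return perf_ctl & ~DISENGAGE_BIT
```

The value passed in would be whatever was just read from the MSR on one logical processor, and the result is what gets written back to that same processor.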

Bernard
Valued Contributor I

Which processors are you referring to? I'm not sure whether each logical processor has its own MSR register space.

Samuel_M_1
New Contributor I

I use the following two processors:

Intel Core i5-2415M

Intel Xeon 5650

McCalpinJohn
Honored Contributor III

Recent versions of Linux are capable of disabling "Turbo" frequency boosts using the "acpi_cpufreq" module and its associated drivers.

On a RHEL6 system (2.6.32 kernel), loading the acpi_cpufreq module creates a set of special files in /sys/devices/system/cpu/cpu*/cpufreq that can be used to control the frequency. With the governor set to "userspace" (echo userspace > scaling_governor), you can then set a specific frequency from the available set (cat scaling_available_frequencies). On my Xeon E3-1270 system, the "nominal" frequency is 3.4 GHz. The available frequencies are listed as

          3401000 3400000 3200000 3000000 2800000 2600000 2400000 2200000 2000000 1800000 1600000

If I set the frequency to the highest value ("3401000"), I get 3.4 GHz plus Turbo boost, but if I set the frequency to "3400000", it seems to disable Turbo boost.
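
A minimal Python sketch of the sysfs steps above, for one CPU. It assumes the standard cpufreq layout under /sys/devices/system/cpu/cpuN/cpufreq and that the target frequency is written to scaling_setspeed (the cpufreq file used by the userspace governor, not named in the post). The function name and the `root` parameter are illustrative. Running it against the real sysfs tree requires root.

```python
from pathlib import Path

# Standard Linux cpufreq sysfs location (assumed layout).
SYSFS_CPU_ROOT = Path("/sys/devices/system/cpu")


def pin_frequency(cpu: int, khz: int, root: Path = SYSFS_CPU_ROOT) -> None:
    """Select the userspace governor on one CPU and request a fixed frequency."""
    cpufreq = root / f"cpu{cpu}" / "cpufreq"
    available = (cpufreq / "scaling_available_frequencies").read_text().split()
    if str(khz) not in available:
        raise ValueError(f"{khz} kHz is not offered on cpu{cpu}: {available}")
    (cpufreq / "scaling_governor").write_text("userspace")
    # scaling_setspeed takes the target frequency in kHz.
    (cpufreq / "scaling_setspeed").write_text(str(khz))
```

On the system described here, `pin_frequency(0, 3400000)` would correspond to the setting that appeared to inhibit Turbo on CPU 0; the call has to be repeated for every logical CPU.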

This mechanism does not use bit 32 of IA32_PERF_CTL (MSR 0x199).  Instead it sets bits 15:8 of that register to the target frequency multiplier ratio -- setting the expected value of 34 (*100 MHz) when I set the frequency to 3400000 and setting the target multiplier to the maximum value of 38 (*100 MHz) when I set the frequency to 3401000.
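
To make the field layout concrete, here is a small Python sketch that extracts and replaces bits 15:8 of an IA32_PERF_CTL value. The helper names are illustrative, and the 100 MHz base clock is the one implied by the "*100 MHz" figures above; other processor generations may use a different base.

```python
BUS_CLOCK_MHZ = 100  # base clock implied by the "*100 MHz" figures above
RATIO_SHIFT = 8
RATIO_MASK = 0xFF << RATIO_SHIFT  # bits 15:8


def target_ratio(perf_ctl: int) -> int:
    """Extract the target multiplier from bits 15:8."""
    return (perf_ctl & RATIO_MASK) >> RATIO_SHIFT


def with_target_ratio(perf_ctl: int, ratio: int) -> int:
    """Return perf_ctl with bits 15:8 replaced by ratio; other bits are kept."""
    if not 0 <= ratio <= 0xFF:
        raise ValueError("ratio must fit in 8 bits")
    return (perf_ctl & ~RATIO_MASK) | (ratio << RATIO_SHIFT)


def target_mhz(perf_ctl: int) -> int:
    """Target frequency in MHz implied by the multiplier field."""
    return target_ratio(perf_ctl) * BUS_CLOCK_MHZ
```

With these helpers, a multiplier of 34 corresponds to a field value of 0x22 in bits 15:8 and a 3400 MHz target, and a multiplier of 38 to 3800 MHz.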

Some simple tests using "perf stat a.out" show that the average frequency stays very close to 3.4 GHz when I set it to 3400000, while the frequency increases by various amounts, up to about 3.73 GHz, when I set the frequency control to 3401000.

Of course setting the frequency to 3400000 by this approach also disables *reductions* in frequency via p-states for power saving, so it is not recommended for general application, but it does seem to pin the frequency to a target value and inhibit Turbo boost.   This also works to pin the frequency to any of the other (lower) frequencies, which provides a convenient mechanism to set up performance sensitivity experiments.

I think this also works on the "Westmere" Xeon 5600 processors, but I don't have acpi_cpufreq installed on my nodes to test this.

To get back to your original question -- any settings will have to be done one processor at a time, whichever mechanism is used.   This can be combined into a single script or single call to a kernel module, but doing it all "at once" is not well defined on a system with multiple cores.
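
As one possible shape for such a script, the Python sketch below applies the bit-32 read-modify-write to each logical CPU in turn. It assumes Linux with the msr driver loaded (the /dev/cpu/N/msr devices, where the file offset is the MSR address) and root privileges, so it does not apply to other operating systems. The function name and the `path_template` parameter are illustrative.

```python
import os
import struct

IA32_PERF_CTL = 0x199
DISENGAGE_BIT = 1 << 32


def set_disengage_on_cpus(cpus, disengage, path_template="/dev/cpu/{cpu}/msr"):
    """Read-modify-write bit 32 of IA32_PERF_CTL on each listed logical CPU."""
    for cpu in cpus:
        fd = os.open(path_template.format(cpu=cpu), os.O_RDWR)
        try:
            # The msr device interprets the file offset as the MSR address.
            (value,) = struct.unpack("<Q", os.pread(fd, 8, IA32_PERF_CTL))
            if disengage:
                value |= DISENGAGE_BIT
            else:
                value &= ~DISENGAGE_BIT
            os.pwrite(fd, struct.pack("<Q", value), IA32_PERF_CTL)
        finally:
            os.close(fd)
```

The CPUs are still visited one after another, so there is a short window in which some logical processors have the bit set and others do not.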

Patrick_F_Intel1
Employee

Hello Samuel,

You have to set the bit on each logical cpu (so you have to do this operation over all the cpus). There is no global 'disable' switch.

But you usually can disable it in the BIOS.

And, at one time, if you were using Windows with the 'balanced' or a low-power performance setting, Windows disabled turbo mode. But you'd have to check whether this is happening on your system.

Pat

Samuel_M_1
New Contributor I

Thank you all for your very useful replies. For now, as I am on Mac OS X, I am using the mp_rendezvous_no_intrs function in a kernel extension to write IA32_PERF_CTL on all cores at once. 

A further question, if anyone has time: Does disabling opportunistic processor performance affect Enhanced SpeedStep, or just TurboBoost?

Patrick_F_Intel1
Employee

As far as I know, setting the bit just affects turbo mode.

Pat

Bernard
Valued Contributor I

Hi Pat,

Does each logical processor have its own MSR register space?
Thanks in advance.

Patrick_F_Intel1
Employee

The SDM Vol 3, table 35-11 indicates the scope of the IA32_PERF_CTL MSR is per thread (that is, per logical cpu).

Pat

Bernard
Valued Contributor I

Thanks Pat.
