A somewhat philosophical question has come up.
Calculation, confirmed by a short benchmark, shows that an E5-2680 (for example), rated at 2.7 GHz and 130 W, draws at most 154 W in Turbo Boost mode at 3.1 GHz with 8 active cores.
Assume, for example, that the cooling system can dissipate 160 W of heat from the processor indefinitely.
Does the processor in this Turbo Boost mode draw 154 W indefinitely (i.e., stay above the nominal 130 W TDP all the time), or does it exceed the nominal TDP only for a short time?
If only for a short time, how does Turbo Boost work at the lower power level? Is the processor paused for very short intervals (which would give us compute jitter)?
If it can draw 154 W indefinitely, does that mean the processor, its electrical circuits, and its power subsystem are designed for 160 W rather than 130 W?
Why, then, has the manufacturer not said so?
Besides your opinions, which I would be interested to hear, references to official sources would be very welcome.
Or whom should I contact with this question?
Thanks for your time.
>>>Does the processor in this Turbo Boost mode draw 154 W indefinitely (i.e., stay above the nominal 130 W TDP all the time), or does it exceed the nominal TDP only for a short time?>>>
I suppose the CPU spends only short periods in Turbo Boost mode, and that this is probably tied closely to the instantaneous energy consumption of the various units. Probably some kind of algorithm manages it. For example, the CPU could lower its power consumption while fetching data and then, once the data is available in the caches, raise its frequency for a short period to run some floating-point loop.
I only know that if the cooling is poor, Turbo Boost works in bursts because heat accumulates.
If these systems (cooling and power delivery) can handle the load, then the time spent above TDP should not be limited (if I understand correctly). However, with adequate cooling in place, it seems to me that what I see is the effect of some setting (motherboard or MSR).
My PSU is rated for more than 2 kW, and the processor is cooled to 35 C. Nevertheless, after 10 seconds above the nominal TDP (~150 W), the Sandy Bridge E5-2680 returns to the nominal TDP (130 W).
Is it possible that motherboard settings or the MSR configuration affect how long the processor stays in a mode where power is above the nominal TDP?
I know that some datacenters water-cool their server processors, and those run on standard motherboards, so running in Turbo Boost mode 24x7 is probably possible.
I tried writing MSR_RAPL_PKG_POWER_LIMIT directly and I tried the power_gov utility, but neither helped. Any ideas?
Sorry for the long story.
The discussion in this "Hot Chips" presentation seems fairly clear:
The chips are allowed to draw 1.2 to 1.3 times the nominal TDP for short periods of time, but the long-term average power will not be allowed to exceed the TDP. Slides 14 and 19 of that presentation give sample values of 20-60 seconds for the sliding window used to determine what "long-term" means.
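To make the sliding-window idea concrete, here is a minimal Python sketch of a power budget governed by a running average. It is not Intel's algorithm: the exponentially weighted average, the 20-second time constant, the 20 W idle power, and the function names are all assumptions chosen for illustration. Only the 130 W TDP and 154 W turbo figures come from this thread.

```python
import math

def simulate_turbo(tdp_w=130.0, turbo_w=154.0, idle_w=20.0,
                   tau_s=20.0, step_s=1.0, busy_s=120.0):
    """Per-step package power for a load that starts after a long idle period.

    Assumption: the controller tracks an exponentially weighted moving
    average of power with time constant tau_s and never lets it exceed tdp_w.
    """
    alpha = math.exp(-step_s / tau_s)   # weight kept by the old average
    avg = idle_w                        # running average after the idle period
    trace = []
    for _ in range(int(busy_s / step_s)):
        # Largest power this step that leaves the new average at or below TDP.
        budget = (tdp_w - alpha * avg) / (1.0 - alpha)
        power = min(turbo_w, budget)
        avg = alpha * avg + (1.0 - alpha) * power
        trace.append(power)
    return trace

def burst_seconds(trace, turbo_w=154.0, step_s=1.0):
    """Length of the initial run of steps spent at full turbo power."""
    steps = 0
    for power in trace:
        if power != turbo_w:
            break
        steps += 1
    return steps * step_s

if __name__ == "__main__":
    trace = simulate_turbo()
    print("burst above TDP lasts about", burst_seconds(trace), "s")
    print("steady-state power:", round(trace[-1], 1), "W")
```

In this toy model the package runs at 154 W for a few tens of seconds while the average climbs from the idle level, then settles at 130 W. A longer time constant or a cooler starting average stretches the burst, which is the qualitative behavior described in the presentation.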
The maximum turbo frequency is controlled by many variables -- number of active cores, instantaneous current/power/temperature, time-averaged power, and probably others. The Xeon E5-2600 Family Uncore Performance Monitoring Guide has very interesting material in the section on the uncore "PCU" (Power Control Unit). Reading this gave me the impression that the algorithms used to control maximum (and minimum) p-states are hard-coded into the PCU and cannot be modified by the user.
It does seem clear that the average Turbo boost is a function of the effectiveness of the cooling system. The Stampede system at TACC uses a chilled water system to deliver ~46 F water to in-row-chillers interleaved with the server racks. Inlet air temperature is typically 64 F and high air flow rates keep the processors quite cool. I have not done systematic measurements, but I usually see RAPL reported temperatures in the 33 C range. This cooling system allows the Xeon E5-2680 processors (nominally 2.7 GHz) to run at the maximum (all-core-active) Turbo speed of 3.1 GHz almost all the time. I have seen 3.1 GHz maintained even while running DGEMM on all cores, and DGEMM is usually considered to be close to the high end of power-hungry kernels.
If you have the ability to adjust both the airflow rate and the intake air temperature in your cooling system, then you could monitor the PCU counters and see whether the p-state is limited by temperature, power, or current, as the cooling rate and ambient temperature are varied.
It does not look like it is possible to override the maximum Turbo ratio as a function of the number of cores active, but it is possible to maintain that maximum ratio most of the time if the cooling is good enough.
I have seen the "Hot Chips" presentation. A few facts confuse me:
1) Some companies claim that they can significantly increase the time window for power limit 2 (I have not tested this on their servers), although I have not seen this feature in Intel's documentation.
2) ASUS, in the manual for its overclocking motherboards, cites an Intel recommendation that turbo mode be supported for at least 10 seconds (http://www.manualslib.com/manual/324973/Asus-P8h77-I.html?page=57, "Short duration power limit").
Maybe I need to cool the processor to a lower temperature. I will try looking at the uncore PCU counters.
Thanks for the interesting answer.
I was reviewing the RAPL configuration of my systems (Xeon E5-2680) and realized that the documentation in Section 14.9 of Volume 3 of the Intel Arch SW Developer's Manual is not as clear as I had originally thought....
The default configuration on the TACC systems is a 130 Watt Package Power Limit #1 with a 44 millisecond time window. The limit is enabled but the "package clamping limit" is not. The Package Power Limit #2 is 156 Watts, with a 2.9 millisecond time window. As for limit #1, limit #2 is "enabled", but the corresponding "package clamping limit" is not.
I think that the power limits and windows mean that the PCU will control p-states so that the power averaged over a 44 millisecond sliding window remains at or below 130 Watts, but allows power averaged over a 2.9 millisecond sliding window to bump up to 156 Watts.
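For anyone who wants to check these fields on their own system, here is a hedged Python sketch that decodes the package power limit register (0x610) using the units register (0x606). The bit layout follows my reading of Section 14.9 of the SDM and should be verified against the current manual. The function names are hypothetical, and the raw values in the example are constructed for illustration, not read from any real machine.

```python
import os
import struct

MSR_RAPL_POWER_UNIT = 0x606
MSR_PKG_POWER_LIMIT = 0x610

def read_msr(cpu: int, reg: int) -> int:
    """Read one 64-bit MSR via the Linux msr driver (needs root and 'modprobe msr')."""
    fd = os.open(f"/dev/cpu/{cpu}/msr", os.O_RDONLY)
    try:
        return struct.unpack("<Q", os.pread(fd, 8, reg))[0]
    finally:
        os.close(fd)

def decode_pkg_power_limit(limit_msr: int, unit_msr: int) -> dict:
    """Split the raw register into PL1/PL2 limits, enable/clamp bits and windows."""
    power_unit_w = 1.0 / (1 << (unit_msr & 0xF))           # bits 3:0
    time_unit_s = 1.0 / (1 << ((unit_msr >> 16) & 0xF))    # bits 19:16

    def decode_half(half: int) -> dict:
        y = (half >> 17) & 0x1F   # window exponent
        z = (half >> 22) & 0x3    # window fraction, in quarters
        return {
            "limit_w": (half & 0x7FFF) * power_unit_w,
            "enabled": bool((half >> 15) & 1),
            "clamping": bool((half >> 16) & 1),
            "window_s": (1 << y) * (1.0 + z / 4.0) * time_unit_s,
        }

    return {
        "pl1": decode_half(limit_msr & 0xFFFFFFFF),
        "pl2": decode_half((limit_msr >> 32) & 0x7FFFFFFF),
        "locked": bool((limit_msr >> 63) & 1),
    }

# Illustrative raw values (assumed, not measured): 1/8 W and 1/1024 s units,
# PL1 = 130 W with Y=10, Z=0; PL2 = 156 W with Y=1, Z=2; both enabled, no clamping.
EXAMPLE_UNITS = 0x000A1003
EXAMPLE_LIMIT = (
    (1040 | (1 << 15) | (10 << 17))
    | ((1248 | (1 << 15) | (1 << 17) | (2 << 22)) << 32)
)

if __name__ == "__main__":
    print(decode_pkg_power_limit(EXAMPLE_LIMIT, EXAMPLE_UNITS))
```

On a live Linux system the two arguments would come from read_msr(0, MSR_PKG_POWER_LIMIT) and read_msr(0, MSR_RAPL_POWER_UNIT). Under the assumed encoding, a PL2 window field of Y=1, Z=2 with a 1/1024 s time unit decodes to roughly 2.9 milliseconds.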
The "enable limit" parameters seem straightforward -- they simply mean that these limits are enforced.
The "package clamping limit" parameters are a bit more confusing. They are disabled on my system, but it looks like if they are enabled they allow the PCU to drop processor p-states to below the requested p-states in order to pull the time-averaged power back down to within the limits. If this interpretation is correct, then enabling these bits would allow the PCU to push frequencies a bit higher at the beginning of "busy" times, at the cost of cutting frequencies below nominal if the demand is still high as the end of a time window is reached. Really understanding this would require digging in to the details of the time averaging algorithm, but I don't think that these have been published in detail.
Hello, and thank you for your interest and time.
Turbo Boost above the nominal TDP is observed only with Linpack. NPB, STREAM, and other applications stay within the nominal TDP at the turbo frequency.
I also tried the RAPL settings, as you did, and I see roughly the same picture. The clamping bit is off, but with Linpack, Turbo Boost runs above the nominal TDP continuously for 10 seconds.
If I understand correctly, the RAPL registers let you tell the processor what upper limit you want on its power. The clamping enable bit is not particularly clear. If I understand correctly, it lets the OS know that the CPU is operating in power limit 2.
I tried lowering power limit 2 to 145 W, and the time spent above TDP increased somewhat.
On different types of servers with the same CPU, I see different temperatures on entering and leaving Turbo Boost mode.
I think temperature is not the main reason the CPU returns to the nominal TDP and leaves Turbo Boost mode. I previously thought that the motherboard affects power delivery to the CPU, but I cannot prove it yet.
Possibly the rate of temperature increase affects the exit from Turbo Boost mode. Unfortunately, I have not found the algorithm or the conditions for entering and leaving Turbo Boost mode.
Sorry for the long post.
Hello all. Me again.
There has been some progress on this topic, but two problems that I cannot solve have come up. I have to bother you again :)
Quite by chance, I found a way to change how long the CPU stays in turbo mode. Funny as it sounds, it is a BIOS option called "long duration". I was able to raise it to 70 seconds.
As I understand it, this BIOS option configures MSR register 0x610 and modifies the "Power Limit 1 time window" field [page 56, http://www.intel.com/content/dam/www/public/us/en/documents/datasheets/xeon-e5-1600-2600-vol-1-datasheet.pdf].
I can change the time spent in turbo mode by changing the power limit 1 time window. This is confirmed by testing and by turbostat statistics.
The trouble, however, is that at runtime I cannot raise the duration above the value set in the BIOS "long duration" item. I can reduce it, but not increase it.
The new value is stored in register 0x610, but the exit from turbo occurs at the limit set in the BIOS, not at the one written to 0x610.
What could be causing this, and how can I get around it?
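For reference, here is a small Python sketch of how a target duration might map onto the 7-bit "Power Limit 1 time window" field of register 0x610. It assumes the encoding window = 2^Y * (1 + Z/4) * time_unit from the SDM and a time unit of 1/1024 s (a common value, but it should be read from MSR 0x606 on the actual system). The helper names are hypothetical, and the sketch does not explain or work around the BIOS ceiling described above.

```python
def window_seconds(field: int, time_unit_s: float = 2.0 ** -10) -> float:
    """Decode a 7-bit window field: Y in bits 4:0, Z in bits 6:5 (assumed layout)."""
    y = field & 0x1F
    z = (field >> 5) & 0x3
    return (1 << y) * (1.0 + z / 4.0) * time_unit_s

def encode_window(target_s: float, time_unit_s: float = 2.0 ** -10) -> int:
    """Pick the 7-bit field whose decoded window is closest to target_s."""
    return min(range(128),
               key=lambda f: abs(window_seconds(f, time_unit_s) - target_s))

def with_pl1_window(limit_msr: int, field: int) -> int:
    """Return limit_msr with bits 23:17 (PL1 time window) replaced by field."""
    return (limit_msr & ~(0x7F << 17)) | ((field & 0x7F) << 17)

if __name__ == "__main__":
    field = encode_window(70.0)
    print("field:", field, "decodes to", window_seconds(field), "s")
```

Under these assumptions the field cannot represent 70 seconds exactly; the nearest encodable value is 64 seconds (Y=16, Z=0), with 80 seconds as the next step up. Comparing the raw field before and after a BIOS change is one way to see which value the firmware actually programmed.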
The second question concerns monitoring the state of the processor.
I tried monitoring the PCU events (FREQ_MAX_OS_CYCLE, FREQ_MAX_POWER_CYCLE, FREQ_MAX_LIMIT_THERMAL_CYCLE, FREQ_MAX_CURRENT_CYCLE) [section 2.6, http://www.intel.com/content/dam/www/public/us/en/documents/design-guides/xeon-e5-2600-uncore-guide.pdf]. I see that when the processor leaves turbo, the count of cycles limited by power increases, along with a small percentage of cycles limited by the OS. Meanwhile, the count of cycles limited by temperature does not increase (maximum temperature 86 C).
How should this be interpreted? It looks to me like Turbo Boost is being limited by power.
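One way to put numbers on this observation is to turn two snapshots of the PCU counters into per-cause fractions. The Python sketch below assumes the counter values have already been collected into dictionaries by whatever tool is in use. The CLOCKTICKS key for the PCU clock count, the function names, and the sample numbers are assumptions for illustration, and counter wraparound is ignored.

```python
LIMIT_EVENTS = (
    "FREQ_MAX_OS_CYCLE",
    "FREQ_MAX_POWER_CYCLE",
    "FREQ_MAX_LIMIT_THERMAL_CYCLE",
    "FREQ_MAX_CURRENT_CYCLE",
)

def limit_breakdown(before: dict, after: dict, ticks_key: str = "CLOCKTICKS") -> dict:
    """Fraction of PCU cycles in the interval attributed to each limiter."""
    ticks = after[ticks_key] - before[ticks_key]
    if ticks <= 0:
        raise ValueError("no PCU cycles elapsed between the two snapshots")
    return {name: (after[name] - before[name]) / ticks for name in LIMIT_EVENTS}

def dominant_limiter(breakdown: dict) -> str:
    """Name of the event that accounts for the largest share of cycles."""
    return max(breakdown, key=breakdown.get)

if __name__ == "__main__":
    # Invented sample values, not measurements.
    before = {"CLOCKTICKS": 0, **{name: 0 for name in LIMIT_EVENTS}}
    after = {
        "CLOCKTICKS": 800_000_000,
        "FREQ_MAX_OS_CYCLE": 40_000_000,
        "FREQ_MAX_POWER_CYCLE": 600_000_000,
        "FREQ_MAX_LIMIT_THERMAL_CYCLE": 0,
        "FREQ_MAX_CURRENT_CYCLE": 0,
    }
    shares = limit_breakdown(before, after)
    print(shares)
    print("dominant limiter:", dominant_limiter(shares))
```

Taking one snapshot pair while the processor is still in turbo and another just after it drops out makes it easier to see which limiter's share jumps at the transition.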
Is it possible to determine from the MSRs the reason for a transition between the P1 and P0 states and back?
Sorry for the long post. Thanks for your time.