Hi,
I'd like to know whether there is a set of control instructions for programmers to set Xeon Phi in different energy/power states (C0, C1, C3, C6)? Thanks!
Hi Vincent, This covers an area that I have not directly talked about, namely how the OS implements its power management policy. The short answer is that there are no instructions or libraries in user space that control power state transitions. There are some management tools (e.g. micsmc) that let you change parts of the power management policy, such as enabling or disabling P-states, but that is about as far as you can go. If you want a more long-winded answer, continue reading. Regards
Now here is the longer answer: There are machine control instructions that put the processor into the various P-states and C-states, but they are not accessible to user processes and are buried deep in the kernel. A common source of confusion is how power and idle states are actually implemented.
Let us take a look at C-states first. An idle state (C-state) is not entered so much as it is used. Here is what I mean. Work can only be done in C0, the run state. Nothing can be done in an idle state, by definition. So when we say a processor is operating in C3, what is really happening is that it is transitioning between C0 and C3 as work arrives and completes. Here is an example that hopefully makes this clearer: processor in C3 (idle) => interrupt comes in indicating that some work is to be done => processor enters C0 (run) and does whatever it needs to do => OS sends the processor back into C3 (idle). Put simply, what is happening is C3 => C0 => C3 => C0 => C3 and so on.
So what does it mean to descend from C3 into C6? It means that the processor starts oscillating between C0 and C6 instead. The difference between the two (C0<=>C3 versus C0<=>C6) is the amount of power saved by the idle state and the latency required to get from the idle state back to the run state (C0). As you can see, the OS does not want a user program mucking about in the idle state domain.
Now let us look at P-states. Whereas C-states are confusing and complicated, P-states are downright dangerous. Given that processors can be overclocked for short periods of time, an incorrect selection of a P-state can fry your processor. Once again, the OS does not want to give this power to a user domain program.
Does this answer your question? Or have I missed the mark?
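As a side note on the C0 <=> Cn oscillation described above: although a user program cannot force a C-state, on a stock Linux host it can usually observe how often each idle state has been used. The sketch below is only an illustration under that assumption. It reads the standard Linux cpuidle sysfs files (`name`, `latency`, `usage`, `time`) for one CPU; whether the coprocessor's own OS exposes the same files is not something this thread confirms, and the helper name `read_idle_states` is made up for the example.

```python
from pathlib import Path


def read_idle_states(cpuidle_dir):
    """Return one dict per idle state found under a cpuidle sysfs directory.

    Assumes the standard Linux layout: stateN/name, stateN/latency (exit
    latency in microseconds), stateN/usage (number of times the state was
    entered) and stateN/time (total microseconds spent in the state).
    """
    state_dirs = sorted(
        Path(cpuidle_dir).glob("state[0-9]*"),
        key=lambda p: int(p.name[len("state"):]),
    )
    states = []
    for state_dir in state_dirs:
        def field(name):
            return (state_dir / name).read_text().strip()

        states.append(
            {
                "name": field("name"),
                "exit_latency_us": int(field("latency")),
                "entries": int(field("usage")),
                "time_us": int(field("time")),
            }
        )
    return states


if __name__ == "__main__":
    # Path for CPU 0 on a typical Linux host; prints nothing if it is absent.
    for s in read_idle_states("/sys/devices/system/cpu/cpu0/cpuidle"):
        print(
            f"{s['name']}: entered {s['entries']} times, "
            f"{s['time_us']} us total, exit latency {s['exit_latency_us']} us"
        )
```

Each entry counted there is one idle => C0 => idle round trip of the kind described above, and the exit latency column is the cost of getting back to C0 from that state.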
We run our Xeon Phi Coprocessors in a fixed p-state configuration, so I can't comment on how they work there.
But... on the mainline Xeon processors the hardware will happily ignore p-state requests for higher frequencies (lower p-state numbers) than it is willing to provide. When requesting maximum performance, the Linux acpi-cpufreq infrastructure will request the maximum supported core multiplier ratio (e.g. 35 on the Xeon E5-2680) and the hardware will decide what the actual multiplier ratio will be. When all cores are operational (C0 or C1 state), the hardware limits the actual multiplier of each core to 31, while with fewer operational cores the actual multiplier can reach slightly higher values. On my Xeon E5-2680s, testing with various benchmarks showed that the actual multiplier (based on average frequency) matched the documented maximum Turbo Boost in every case but one: with 4 cores active I consistently got an average frequency that was one step lower than the documented hardware maximum. It appears to take a fairly aggressive cooling system to consistently get the maximum frequency boosts.
The important thing is that in all of these cases the requested multiplier was 35 -- the OS left the detailed decision-making to the hardware. This can be checked easily enough on mainline processors by reading MSR 0x198 (actual core multiplier ratio in bits 15:8) and MSR 0x199 (requested core multiplier ratio in bits 15:8).
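For anyone who wants to try that check, here is a minimal sketch, assuming a Linux host with the `msr` kernel module loaded and root access to `/dev/cpu/N/msr`. The register numbers and the bits 15:8 field come from the post above; the constant names are the usual Intel names for those registers, and the helper names `ratio_field` and `read_msr` are invented for the example. It is meant for mainline Xeons, not as a statement about the coprocessor.

```python
import os
import struct

IA32_PERF_STATUS = 0x198  # actual core multiplier ratio in bits 15:8
IA32_PERF_CTL = 0x199     # requested core multiplier ratio in bits 15:8


def ratio_field(msr_value):
    """Extract bits 15:8 (the core multiplier ratio) from a 64-bit MSR value."""
    return (msr_value >> 8) & 0xFF


def read_msr(cpu, register):
    """Read one 64-bit MSR through the Linux msr driver.

    The file offset selects the register. Needs root and `modprobe msr`.
    """
    fd = os.open(f"/dev/cpu/{cpu}/msr", os.O_RDONLY)
    try:
        raw = os.pread(fd, 8, register)
    finally:
        os.close(fd)
    return struct.unpack("<Q", raw)[0]


if __name__ == "__main__":
    try:
        requested = ratio_field(read_msr(0, IA32_PERF_CTL))
        actual = ratio_field(read_msr(0, IA32_PERF_STATUS))
        print(f"cpu0: requested ratio {requested}, actual ratio {actual}")
    except OSError as err:
        print(f"could not read MSRs (root and the msr module are needed): {err}")
```

The status register is an instantaneous reading, so a single read may differ from a ratio derived from average frequency over a benchmark run.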
Thanks. It answered my question.
