As I understand it, RAPL lets you set a power limit for the whole package. Is there any explanation of how this is achieved? Is each core assigned the same P-state? Is there a description of this anywhere?
I'm not necessarily asking about the internal implementation, but I couldn't find how each core contributes to the overall power consumption of the package. What I am looking for is an algorithm or scheme that explains how each core's power limit is determined in order to meet the overall package power limit.
The Intel SDM vol 3 talks around this topic. See http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-... section 14 for the RAPL (Running Average Power Limit) interface and MSRs. All the cores are on one power plane and the GT graphics unit is on another. You can limit the power used by the cores and by the GPU, and you can set a preference for which one has priority when both the CPU and GPU want to run. In a crude test I ran that tried to load up both the CPU and GPU, the CPU frequency was reduced to stay within the overall package thermal limits.
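To make the RAPL MSR interface a bit more concrete, here is a rough Python sketch of how the package power limit and energy counter could be decoded. It only covers the package-level knobs and says nothing about how the hardware splits the budget between cores. The MSR addresses and bit layouts are my reading of the SDM and should be checked against the manual for your CPU. The helper names (`decode_units`, `decode_pkg_limit1`, `average_power_w`, `read_msr`) are made up for illustration, and `read_msr` assumes Linux with the `msr` kernel module loaded and root access.

```python
import os
from dataclasses import dataclass

# MSR addresses as I read them in the SDM; verify for your CPU model.
MSR_RAPL_POWER_UNIT = 0x606
MSR_PKG_POWER_LIMIT = 0x610
MSR_PKG_ENERGY_STATUS = 0x611


@dataclass
class RaplUnits:
    power_w: float   # watts per count
    energy_j: float  # joules per count
    time_s: float    # seconds per count


@dataclass
class PowerLimit:
    watts: float
    enabled: bool
    clamp: bool
    window_s: float


def decode_units(raw: int) -> RaplUnits:
    """Each unit is 1 / 2^n, with n taken from a bit field of the unit MSR."""
    return RaplUnits(
        power_w=1.0 / (1 << (raw & 0xF)),            # bits 3:0
        energy_j=1.0 / (1 << ((raw >> 8) & 0x1F)),   # bits 12:8
        time_s=1.0 / (1 << ((raw >> 16) & 0xF)),     # bits 19:16
    )


def decode_pkg_limit1(raw: int, units: RaplUnits) -> PowerLimit:
    """Decode power limit #1 (the low 24 bits of the package limit MSR)."""
    y = (raw >> 17) & 0x1F  # bits 21:17
    z = (raw >> 22) & 0x3   # bits 23:22
    return PowerLimit(
        watts=(raw & 0x7FFF) * units.power_w,        # bits 14:0
        enabled=bool(raw & (1 << 15)),
        clamp=bool(raw & (1 << 16)),
        window_s=(1 << y) * (1.0 + z / 4.0) * units.time_s,
    )


def average_power_w(e0: int, e1: int, dt_s: float, units: RaplUnits) -> float:
    """Average power between two energy-counter samples taken dt_s apart.

    The counter is 32 bits wide and wraps, so the delta is taken modulo 2^32.
    """
    delta = (e1 - e0) & 0xFFFFFFFF
    return delta * units.energy_j / dt_s


def read_msr(cpu: int, reg: int) -> int:
    """Read one 64-bit MSR through the Linux msr driver (needs root)."""
    fd = os.open(f"/dev/cpu/{cpu}/msr", os.O_RDONLY)
    try:
        return int.from_bytes(os.pread(fd, 8, reg), "little")
    finally:
        os.close(fd)
```

On a real machine you would feed `read_msr(0, MSR_RAPL_POWER_UNIT)` and `read_msr(0, MSR_PKG_POWER_LIMIT)` into the decoders. With illustrative raw values (not measured on any particular part) of `0xA0E03` for the unit register and `0x1582F8` for the limit register, the decode gives a 95 W limit, enabled and clamped, over a 1 second window.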
There is a nice presentation at hotchips.org on Haswell which talks about power management. See http://www.hotchips.org/wp-content/uploads/hc_archives/hc25/HC25.80-Processors2-epub/HC25.27.820-Has...
From the above presentation, it looks like Haswell can set an independent voltage for each core (and probably an independent frequency for each core as well). The ring/LLC has its own voltage, and so does the GPU. So there are many variables that can affect power usage.
Please also read this paper about the power management of Sandy Bridge: http://www.hotchips.org/wp-content/uploads/hc_archives/hc23/HC23.19.9-Desktop-CPUs/HC23.19.921.SandyBridge_Power_10-Rotem-Intel.pdf