Constant energy per instruction, per uop or per cycle?
I'm doing a research study in response to papers such as cseweb.ucsd.edu/users/swanson/papers/Asplos2010CCores.pdf, to determine whether the energy-intensive code in real-world applications and games is as stable as in the benchmarks and utilities they've studied. (My guess is that it probably isn't.)
To determine which code is energy-intensive, I'm doing some profiling with VTune on my Core i7 2600K. (To determine how stable it is, I'm combining with gcov and svn blame to get the average age of each line of code executed.)
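To make the age metric concrete, here is a minimal sketch of the execution-weighted average I have in mind. It assumes the gcov counts and svn blame dates have already been joined into `(exec_count, age_days)` pairs, one per source line; the function name and the sample values are hypothetical, not real measurements.

```python
def mean_executed_age(lines):
    """Return the execution-weighted mean age in days.

    `lines` is a list of (exec_count, age_days) pairs, one per source line
    (hypothetical input: exec_count from gcov, age_days from svn blame).
    Lines that never ran have exec_count == 0 and contribute nothing.
    """
    total_executions = sum(count for count, _ in lines)
    if total_executions == 0:
        return None  # nothing was executed, so the average is undefined
    weighted_sum = sum(count * age for count, age in lines)
    return weighted_sum / total_executions


# Made-up sample: a hot recent line, a cold ancient line, a warm older line.
sample = [(30, 20.0), (0, 5000.0), (10, 100.0)]
print(mean_executed_age(sample))
```

Weighting by execution count rather than averaging over distinct lines means that old code which never runs doesn't drag the average up.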
Among the hardware events that can be measured at the function level, for which one is it safest to assume constant energy per event, if clock speed is held constant? (I'm not counting the idle-power baseline or cache/memory/I/O access, since C-cores wouldn't do anything about those anyway.) Constant per instruction? Per cycle? Per uop? A linear combination of multiple metrics?
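By "linear combination" I mean something like the sketch below: given per-function counts of instructions, uops and cycles, plus some per-function energy figure, fit one weight per event by ordinary least squares. This assumes such an energy figure is available at all, which is part of what I'm unsure about. The function names are my own, and the numbers are synthetic, generated from made-up weights purely to show that the fit runs.

```python
def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_energy_model(counts, energy):
    """Least-squares weights w such that energy[i] ~ sum_j w[j] * counts[i][j].

    Solves the normal equations (A^T A) w = A^T e, with no intercept term
    because the idle-power baseline is deliberately excluded.
    """
    k = len(counts[0])
    ata = [[sum(row[i] * row[j] for row in counts) for j in range(k)]
           for i in range(k)]
    ate = [sum(row[i] * e for row, e in zip(counts, energy)) for i in range(k)]
    return solve(ata, ate)


# Synthetic data only: one row per function, columns are
# (instructions, uops, cycles) in arbitrary units.
counts = [
    [1.0, 1.3, 0.9],
    [2.0, 2.1, 3.0],
    [0.5, 0.9, 0.4],
    [3.0, 3.2, 2.0],
]
assumed_weights = [0.2, 0.5, 1.0]  # made up, used only to generate `energy`
energy = [sum(w * c for w, c in zip(assumed_weights, row)) for row in counts]

weights = fit_energy_model(counts, energy)
print(weights)
```

If a single event really did have constant energy, its weight would dominate and the others would come out near zero; large residuals would suggest that none of these counters is a good proxy on its own.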
Also, I think I'll probably have to use Callgrind instruction counts to model 64-bit x86 CPUs (Intel or not) that aren't similar enough to Sandy Bridge, since I don't have any other CPUs to profile on. Is there a better way to model them in VTune?
EDIT: I'm on Linux, so integrating the Intel Energy Checker SDK with VTune isn't an option, according to the former's FAQ.