Is there an energy profiling tool to monitor CPU power consumption on Linux?

Iulia_S_
Beginner

Hello,

Can you please tell me if there is any energy profiling tool to monitor CPU power consumption on Linux desktops? Or is there any other way I can obtain the CPU power consumption? I have used powertop so far, but it requires running on battery power to display the report, so it does not work for my workstation.

Kind regards,
Iulia

Joel_L_Intel
Moderator

The latest Intel SoC Watch is going to be released as part of Intel System Studio 2018. That should happen soon, within a few weeks.

Thomas_G_4
New Contributor II

You can use perf (if the kernel supports it), LIKWID or, if available, the files below /sys/devices/virtual/powercap/intel-rapl/ . There are probably also some other tools.
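For the powercap files, a minimal sketch of turning the energy counter into watts could look like the following. It assumes the first zone is called intel-rapl:0 (the index can differ per system) and that the energy_uj and max_energy_range_uj files are readable; on some kernels that may need root. The function names are only illustrative.

```python
import time
from pathlib import Path

# Assumed zone path; check which intel-rapl:N directories exist on your machine.
ZONE = Path("/sys/devices/virtual/powercap/intel-rapl/intel-rapl:0")


def average_watts(start_uj, end_uj, seconds, max_range_uj):
    """Average power between two energy_uj readings, allowing for one counter wrap."""
    delta = end_uj - start_uj
    if delta < 0:
        # The counter wrapped around between the two readings.
        delta += max_range_uj
    return delta / 1e6 / seconds


def read_uj(name):
    """Read one integer value (in microjoules) from the zone directory."""
    return int((ZONE / name).read_text())


def sample(interval=1.0):
    """Sleep for `interval` seconds and return the average power in watts."""
    max_range = read_uj("max_energy_range_uj")
    start = read_uj("energy_uj")
    time.sleep(interval)
    return average_watts(start, read_uj("energy_uj"), interval, max_range)
```

Calling sample() in a loop gives one reading per interval; the files only hold accumulated energy, so power always has to be derived from two readings.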

Iulia_S_
Beginner

Thank you both for the answers.
I tried likwid command from here:
$ sudo likwid-perfctr -c S0:0-3 -g ENERGY -m ./bench/likwid-bench  -i 50 -t stream_avx  -w S0:1GB:4
but the result was:
Illegal instruction (core dumped)
Failed to execute command: ./bench/likwid-bench -i 50 -t stream_avx -w S0:1GB:4
Marker API result file does not exist. This may happen if the application has not called LIKWID_MARKER_CLOSE.

I am using likwid 4.2.1.

I tried also:
$ sudo likwid-powermeter -i
but it won't display the power information as here.

Can you please help?

Thomas_G_4
New Contributor II

Illegal instruction sounds like your architecture does not support AVX (try stream instead of stream_avx). Which CPU architecture is it? If you don't know, try likwid-topology and look at the header.

CPU name:	Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz
CPU type:	Intel Core Haswell processor


With this information it is easy to say whether your system supports AVX and power readings. The first Intel CPU supporting AVX and power readings is Intel SandyBridge.
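If you prefer to check directly, the kernel lists the CPU feature flags in /proc/cpuinfo. A small sketch (the helper names are made up for illustration, and the "flags" line is what x86 Linux kernels print):

```python
def cpu_flags(cpuinfo_text):
    """Return the feature flags from the first 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            return set(value.split())
    return set()


def supports(flag, path="/proc/cpuinfo"):
    """True if the running CPU advertises the given flag, e.g. 'avx'."""
    with open(path) as f:
        return flag in cpu_flags(f.read())
```

supports("avx") returning False would explain the illegal instruction with stream_avx.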

Commonly, you don't need to run LIKWID with sudo. Are you using a package provided by your distribution, or did you build it yourself?
If you used the standard make && sudo make install way, you should be able to run it as a regular user:
$ likwid-perfctr -c 0 -g ENERGY hostname
(The -m switch on the command line activates the MarkerAPI, some simple code instrumentation calls, but you have to activate the support for the MarkerAPI for likwid-bench in config.mk before compilation)

Does likwid-powermeter print anything at all?



 

Travis_D_
New Contributor II

You can use turbostat which is written by Intel folks, is part of Linux and shows you the PkgWatt and RAMWatt with the --debug option:

GFX%C0	CPUGFX%	Pkg%pc2	Pkg%pc3	Pkg%pc6	Pkg%pc7	Pkg%pc8	Pkg%pc9	Pk%pc10	PkgWatt	RAMWatt	PKG_%	RAM_%
1.42	0.33	9.86	0.32	1.33	0.00	62.44	0.00	0.00	1.22	0.97	0.00	0.00

 

Iulia_S_
Beginner

Travis D. wrote:

You can use turbostat which is written by Intel folks, is part of Linux and shows you the PkgWatt and RAMWatt with the --debug option:

GFX%C0	CPUGFX%	Pkg%pc2	Pkg%pc3	Pkg%pc6	Pkg%pc7	Pkg%pc8	Pkg%pc9	Pk%pc10	PkgWatt	RAMWatt	PKG_%	RAM_%
1.42	0.33	9.86	0.32	1.33	0.00	62.44	0.00	0.00	1.22	0.97	0.00	0.00

Hello,
I tried to run turbostat command:
$ sudo turbostat --debug
which failed with the output:
turbostat: msr 3 offset 0x1aa read failed: Input/output error

I found a similar problem here, but I couldn't apply the patch with the fix. Can you please help with that?
What is the meaning of PkgWatt? The power consumption over all selected cpus?

Iulia_S_
Beginner

Thomas R. wrote:

Illegal instruction sounds like your architecture does not support AVX (try stream instead of stream_avx). Which CPU architecture is it? If you don't know, try likwid-topology and look at the header.

CPU name:	Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz
CPU type:	Intel Core Haswell processor

With this information it is easy to say whether your system supports AVX and power readings. The first Intel CPU supporting AVX and power readings is Intel SandyBridge.

Commonly, you don't need to run LIKWID with sudo. Are you using a package provided by your distribution, or did you build it yourself?
If you used the standard make && sudo make install way, you should be able to run it as a regular user:
$ likwid-perfctr -c 0 -g ENERGY hostname
(The -m switch on the command line activates the MarkerAPI, some simple code instrumentation calls, but you have to activate the support for the MarkerAPI for likwid-bench in config.mk before compilation)

Does likwid-powermeter print anything at all?

 

I built the package myself. I set FORTRAN_INTERFACE and INSTRUMENT_BENCH to true in config.mk, then ran:
$ make
$ sudo make install
Is there anything else I need to do to activate the MarkerAPI?

I think I activated likwid-bench and it accepts also stream_avx because I get the following output:
LIKWID MICRO BENCHMARK
Test: stream_avx

My CPU type is Intel Atom (Silvermont) processor.

likwid-powermeter prints something.
The following command works:
$ likwid-perfctr -c 0 -g ENERGY hostname
but the values are constant. Is there any way to generate a report for the cpu power consumption for 15 minutes with a sample rate of 1 second?

Many thanks,
Iulia

 

 

Thomas_G_4
New Contributor II

I have no clue about turbostat, but the patch seems to tell it that a specific register is not available on Silvermont cores, only on more recent Atom chips (Goldmont). LIKWID does not use this register at all. To apply the patch, get the kernel sources, unpack them, go into the folder and run patch -p1 < turbostat.patch . You can download the patch from the page you linked (left menu, at the bottom).

The RAPL domain names are not intuitive: PKG is the energy of cores plus uncore, PP0 is cores only, PP1 is an uncore device (on client parts typically the integrated graphics), and DRAM is the energy consumed by the memory DIMMs.
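To see which of these domains a given machine actually exposes, the powercap sysfs tree can be listed. This is only a sketch: it assumes the usual layout where each package directory intel-rapl:N contains a name file (e.g. package-0) and sub-zones intel-rapl:N:M with their own name files (e.g. core, uncore, dram). list_zones is a made-up helper name.

```python
from pathlib import Path


def list_zones(root="/sys/devices/virtual/powercap/intel-rapl"):
    """Map each RAPL zone name to the path of its energy_uj file."""
    zones = {}
    for package in sorted(Path(root).glob("intel-rapl:*")):
        pkg_name = (package / "name").read_text().strip()
        zones[pkg_name] = package / "energy_uj"
        # Sub-zones are prefixed with the package name so that two sockets
        # with identically named sub-zones do not overwrite each other.
        for sub in sorted(package.glob("intel-rapl:*")):
            sub_name = (sub / "name").read_text().strip()
            zones[f"{pkg_name}/{sub_name}"] = sub / "energy_uj"
    return zones
```

Domains the CPU does not implement simply do not show up in the result.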

Silvermont chips do not support AVX, so the stream_avx benchmark won't work but you can use stream_sse.

To run for 15 minutes with one sample per second: likwid-perfctr -c 0 -g ENERGY -t 1s sleep 900
You will get one sample per line written to stderr, so to redirect it into a file, append 2>filename.log.

McCalpinJohn
Honored Contributor III

I don't see MSR 0x1aa defined for the Silvermont processor in Volume 4 of the Intel Architecture SW Developer's Manual (document 335592-064, October 2017), but it does appear starting in the Goldmont processor.

Section 2.4 of Volume 4 of the SWDM discusses the MSRs for the various processors with Silvermont cores.  Tables 2-6 and 2-7 provide the basic information, while Tables 2-8 and 2-10 show some differences in the behavior of the RAPL functionality for different Silvermont-based processors.

The encoding of the RAPL energy units in Intel processors is weird, but there are some fairly easy-to-use example codes on the interwebs.  One possible candidate for building your own code is https://github.com/deater/uarch-configure/blob/master/rapl-read/rapl-read.c
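As a rough illustration of that encoding: the energy status unit sits in bits 12:8 of MSR_RAPL_POWER_UNIT (0x606). On most processors the unit is 1/2^ESU joules; as far as I can tell from the Silvermont tables, those parts instead use 2^ESU microjoules, so treat the silvermont branch below as an assumption to verify against the SDM for your exact model. The function names are illustrative, and read_msr needs root plus the msr kernel module.

```python
import os
import struct

MSR_RAPL_POWER_UNIT = 0x606


def energy_unit_joules(power_unit_msr, silvermont=False):
    """Decode the energy status unit (bits 12:8) into joules per counter tick."""
    esu = (power_unit_msr >> 8) & 0x1F
    if silvermont:
        # Assumed Silvermont encoding: 2^ESU microjoules.
        return (1 << esu) * 1e-6
    # Common encoding: 1 / 2^ESU joules.
    return 1.0 / (1 << esu)


def read_msr(register, cpu=0):
    """Read a 64-bit MSR via /dev/cpu/<cpu>/msr (root and 'modprobe msr' required)."""
    fd = os.open(f"/dev/cpu/{cpu}/msr", os.O_RDONLY)
    try:
        return struct.unpack("<Q", os.pread(fd, 8, register))[0]
    finally:
        os.close(fd)
```

The raw energy status counters are then multiplied by this unit to get joules.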

 

Travis_D_
New Contributor II

You can also try to build Intel PCM and then use pcm-power which gives a crapload of power info if it works on your CPU (it doesn't on my Skylake (?), too new I guess).

Iulia_S_
Beginner

Thomas R. wrote:

To run for 15 minutes with one sample per second: likwid-perfctr -c 0 -g ENERGY -t 1s sleep 900

The above works, but I don't get to see the table header to figure out what the columns represent. I want to get the power consumption of each core, but if I select several cores, not just core 0, I get a division by zero error.

Can you please help me display the table header (and tell me which column represents the power consumption of the core) and explain the division by zero error?

Many thanks! Regards!

Iulia_S_
Beginner

Travis D. wrote:

You can also try to build Intel PCM and then use pcm-power which gives a crapload of power info if it works on your CPU (it doesn't on my Skylake (?), too new I guess).

I tried to run the following:
./pcm-power.x -p 0 -m -1 -- /bin/sleep 5
and the output is:  
Processor Counter Monitor  ($Format:%ci ID=%h$)  Power Monitoring Utility
Error: NMI watchdog is enabled. This consumes one hw-PMU counter
 to disable NMI watchdog please run under root: echo 0 > /proc/sys/kernel/nmi_watchdog
 or to disable it permanently: echo 'kernel.nmi_watchdog=0' >> /etc/sysctl.conf
Unsupported processor model (28).

I tried to execute the second echo command with sudo and changed the permissions of the /etc/sysctl.conf file. Then I reran Intel PCM, but the output is the same.

Does this mean Intel PCM doesn't work on my CPU as well?

Thank you. Looking forward to your answer.

Travis_D_
New Contributor II

Yes, it isn't supported on your CPU. See:

https://github.com/opcm/pcm/issues/60

You can still get power info from RAPL from some of the other PCM tools or turbostat.

 

Iulia_S_
Beginner

Travis D. wrote:

Yes, it isn't supported on your CPU. See:

https://github.com/opcm/pcm/issues/60

You can still get power info from RAPL from some of the other PCM tools or turbostat.

I tried to run:
./pcm-power.x -p 0 -m -1 -- /bin/sleep 5

on an Ivy Bridge processor and it still doesn't work. Anyway, on Ivy Bridge I get a different output (compared to Sandy Bridge):

 Processor Counter Monitor  ($Format:%ci ID=%h$)

 Power Monitoring Utility
Error: NMI watchdog is enabled. This consumes one hw-PMU counter
 to disable NMI watchdog please run under root: echo 0 > /proc/sys/kernel/nmi_watchdog
 or to disable it permanently: echo 'kernel.nmi_watchdog=0' >> /etc/sysctl.conf
You need to be root and loaded 'msr' Linux kernel module to execute the program. You may load the 'msr' module with 'modprobe msr'.

So, I ran the commands indicated in the output of the pcm-power command:
sudo modprobe msr
sudo chmod 777 /etc/sysctl.conf
sudo echo 'kernel.nmi_watchdog=0' >> /etc/sysctl.conf

and then I ran the pcm-power command again, but its output is the same.

Can you please help or recommend another PCM tool to get the cpu power consumption?
Thank you.

Update on turbostat:

turbostat works for me on Ivy Bridge but, after reading the specification, I am not sure of the meaning of the PkgWatt column.
1. Does the PkgWatt column mean the power consumption of a single CPU or the overall CPU power consumption? As you can see in the output below, the last four columns have no records except for CPU 0. If PkgWatt refers to a single CPU, why is there no record for the other CPUs besides (CPU 0, core 0) in the PkgWatt and CorWatt columns?
2. What is the meaning of the CorWatt column? Is it the power consumed by all cores of a CPU, or the power consumed by a specific core? If it is the power of a specific core, why is there no record in the last four columns for every core?
3. I see there are some cores missing in the output below. Does this mean they are idle and do not consume active power? Does PkgWatt refer only to active power, or does it include idle power as well? If it represents both active and idle power, how can I get power measurements for the missing cores as well?

           Core     CPU Avg_MHz   Busy% Bzy_MHz TSC_MHz     SMI  CPU%c1  CPU%c3  CPU%c6  CPU%c7 CoreTmp  PkgTmp PkgWatt CorWatt GFXWatt

              -       -     493   12.64    3898    3498       0   12.64    0.00    0.00   74.72      47      47   21.62   13.74    0.00
              0       0       4    0.11    3894    3498       0   99.89    0.00    0.00    0.00      47      47   21.62   13.74    0.00
              0       4    3897   99.98    3898    3498       0    0.02
              1       1       7    0.17    3887    3498       0    0.04    0.00    0.00   99.79      32
              1       5       0    0.00    3885    3498       0    0.21
              2       2      29    0.76    3895    3498       0    0.10    0.01    0.01   99.13      32
              2       6       2    0.06    3896    3498       0    0.80
              3       3       1    0.02    3832    3498       0    0.03    0.00    0.00   99.95      28
              3       7       0    0.00    3879    3498       0    0.04
 
Travis_D_
New Contributor II

About the NMI thing: if you do it the /etc/sysctl.conf way, you have to restart. Also, those instructions might not work for you depending on your distro (AFAIK, the systemd changes affected this; look up how to change kernel boot options for your distro. I did it in my GRUB configuration).

BTW, chmod 777ing your sysctl.conf file is probably a huge security problem, don't do that...

Just change it temporarily with:

sudo su -
echo 0 > /proc/sys/kernel/nmi_watchdog
exit

OR

echo 0 | sudo tee /proc/sys/kernel/nmi_watchdog

Those are two different ways of doing the same thing.

About turbostat: PkgWatt is "package watts". Package refers to the entire CPU, i.e., what you buy in a box or what you see when you look at your motherboard. Most systems have only one, but a "dual socket" system would have two, etc. So it should include the power consumption of all cores and also of parts that don't belong to any core, like the memory controller (but not the memory itself) and the L3 cache. That's why you see only one line: it's for the whole CPU package and you only have one of those.

CorWatt is the power for a particular core on your CPU. You see it only every other row because you have hyperthreading on, and this is measured at the physical core level, so 2 logical cores correspond to 1 physical core and share the measurement.

Iulia_S_
Beginner

Travis D. wrote:

CorWatt is the power for a particular core on your CPU. You see it only every other row because you have hyperthreading on, and this is measured at the physical core level, so 2 logical cores correspond to 1 physical core and share the measurement.



Thank you. Anyway, I got some new measurements but the output still confuses me. For instance, can you please indicate the value of CorWatt for logical CPUs 1 and 13 of the physical core 1 from measurements below?

usec Package Core CPU Avg_MHz Busy% Bzy_MHz TSC_MHz IRQ SMI C1 C1E C3 C6 C1% C1E% C3% C6% CPU%c1 CPU%c3 CPU%c6 CPU%c7 CoreTmp PkgTmp Pkg%pc2 Pkg%pc3 Pkg%pc6 PkgWatt CorWatt RAMWatt PKG_% RAM_%

    0 - - - 520 16.79 3090 2605 3338 0 6 0 2 665 0.03 0.00 0.00 83.28 1.53 0.00 81.68 0.00 43 43 0.00 0.00 0.00 70.88 46.74 8.58 0.00 0.00
 3277 0 0 0 3076 99.20 3088 2610 307 0 0 0 0 0 0.00 0.00 0.00 0.00 0.80 0.00 0.00 0.00 37 37 0.00 0.00 0.00 35.42 23.36 4.99 0.00 0.00
 3990 0 0 12 3076 99.20 3088 2610 325 0 0 0 0 0 0.00 0.00 0.00 0.00 0.80
   95 0 1 1 22 0.72 2979 2610 311 0 4 0 0 301 0.70 0.00 0.00 99.06 8.97 0.00 90.31 0.00 35
   87 0 1 13 2 0.07 2968 2610 9 0 0 0 0 11 0.00 0.00 0.00 100.37 9.62
  121 0 2 2 1 0.03 2901 2610 9 0 0 0 0 10 0.00 0.00 0.00 100.37 0.35 0.00 99.62 0.00 34
   89 0 2 14 1 0.03 2909 2610 10 0 0 0 0 12 0.00 0.00 0.00 100.37 0.35
  115 0 3 3 1 0.03 2905 2610 10 0 0 0 0 12 0.00 0.00 0.00 100.37 0.35 0.00 99.62 0.00 32
   88 0 3 15 1 0.03 2900 2610 9 0 0 0 0 11 0.00 0.00 0.00 100.37 0.35
  112 0 4 4 1 0.04 2910 2610 9 0 0 0 0 10 0.00 0.00 0.00 100.37 2.14 0.00 97.82 0.00 34
   90 0 4 16 5 0.17 2962 2610 71 0 2 0 0 72 0.01 0.00 0.00 100.24 2.01
  114 0 5 5 1 0.03 2900 2610 11 0 0 0 0 13 0.00 0.00 0.00 100.36 0.49 0.00 99.48 0.00 30
   89 0 5 17 1 0.04 2900 2610 12 0 0 0 0 15 0.00 0.00 0.00 100.36 0.48
 6988 1 0 6 3090 99.86 3094 2600 286 0 0 0 0 0 0.00 0.00 0.00 0.00 0.14 0.00 0.00 0.00 43 43 0.00 0.00 0.00 35.46 23.38 3.59 0.00 0.00
 3977 1 0 18 3090 99.86 3095 2600 259 0 0 0 0 0 0.00 0.00 0.00 0.00 0.14
  120 1 1 7 1 0.04 2944 2600 8 0 0 0 0 12 0.00 0.00 0.00 99.98 0.69 0.00 99.27 0.00 41
   98 1 1 19 6 0.20 2907 2600 6 0 0 0 0 19 0.00 0.00 0.00 99.81 0.53
  125 1 2 8 10 0.32 2945 2600 10 0 0 0 1 29 0.00 0.00 0.01 99.67 0.84 0.00 98.84 0.00 35
   88 1 2 20 3 0.09 2904 2600 6 0 0 0 0 19 0.00 0.00 0.00 99.92 1.06
  113 1 3 9 23 0.77 2967 2600 26 0 0 0 1 35 0.00 0.00 0.04 99.19 2.73 0.01 96.48 0.00 36
   90 1 3 21 53 1.79 2990 2600 1605 0 0 0 0 28 0.00 0.00 0.00 98.22 1.72
  114 1 4 10 4 0.14 2937 2600 8 0 0 0 0 13 0.00 0.00 0.00 99.87 0.76 0.00 99.10 0.00 33
   90 1 4 22 5 0.15 2922 2600 18 0 0 0 0 25 0.00 0.00 0.00 99.85 0.75
  112 1 5 11 1 0.03 2901 2600 6 0 0 0 0 8 0.00 0.00 0.00 99.98 0.39 0.00 99.58 0.00 35
   91 1 5 23 3 0.09 2961 2600 7 0 0 0 0 10 0.00 0.00 0.00 99.91 0.32

Travis_D_
New Contributor II

It's hard to tell because the formatting is messed up now (not a monospaced font), but it seems that CorWatt is tracked per package and not per core, so it's only available once for each of your packages. It is probably the total power used by all the cores on the package, whereas PkgWatt would also include the uncore.

Based on that I get 35.42W, 23.36W, 4.99W for PkgWatt, CorWatt, RAMWatt respectively for package 0 (just count fields backwards from the end of the line). Those values seem reasonable to me.
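If you'd rather not count by hand, here is a small sketch that pairs a full row with the header names from your output (including the leading usec column). Rows with fewer fields than the header carry no package-level values, so they are skipped. package_power is just an illustrative name.

```python
# Header copied from the turbostat output above, including the leading "usec".
HEADER = (
    "usec Package Core CPU Avg_MHz Busy% Bzy_MHz TSC_MHz IRQ SMI C1 C1E C3 C6 "
    "C1% C1E% C3% C6% CPU%c1 CPU%c3 CPU%c6 CPU%c7 CoreTmp PkgTmp "
    "Pkg%pc2 Pkg%pc3 Pkg%pc6 PkgWatt CorWatt RAMWatt PKG_% RAM_%"
).split()

# First row of package 0 from the output above.
ROW = (
    "3277 0 0 0 3076 99.20 3088 2610 307 0 0 0 0 0 0.00 0.00 0.00 0.00 0.80 "
    "0.00 0.00 0.00 37 37 0.00 0.00 0.00 35.42 23.36 4.99 0.00 0.00"
)


def package_power(row):
    """Return PkgWatt, CorWatt and RAMWatt for a full row, or None for a short row."""
    fields = row.split()
    if len(fields) != len(HEADER):
        return None  # short row: no package-level columns on this line
    named = dict(zip(HEADER, fields))
    return {key: float(named[key]) for key in ("PkgWatt", "CorWatt", "RAMWatt")}
```

For ROW this gives PkgWatt 35.42, CorWatt 23.36 and RAMWatt 4.99, the same numbers as counting from the end of the line.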

 

The nice thing about turbostat is that the code is available, so you can modify it as you wish, including only printing the fields you want:

https://github.com/torvalds/linux/blob/master/tools/power/x86/turbostat/turbostat.c

I had no issue compiling it locally.

You can also check exactly what MSR or RAPL registers the code is reading for the various fields and then look them up in the Intel SDM to get an exact idea of what's going on.

grayxu
Novice

s-tui: Terminal-based CPU stress and monitoring utility
