Intel Xeon Phi Per Core Data

Hi All,

Is it possible to log the following per-core data for the Intel Xeon Phi 7210, either through sysfs or by reading a specific MSR?

  • Per Core Power
  • Per Core Temperature
  • Per Core Utilization

Thanks.

Chetan Arvind Patil

Accepted Solutions
  • Per core Power -- no
  • Per core Temperature -- almost
    • The MSR that normally provides this information at core scope is IA32_THERM_STATUS (19Ch).
    • The MSR tables in the Intel Architecture SW Developer's manual say that this has "module" scope on Xeon Phi x200, where "module" refers to the pair of cores sharing an L2 cache.  In most documentation this would be referred to as "tile" scope.
  • Per Core Utilization -- yes
    • It should be possible to measure this in three different ways, but the third one looks broken on Xeon Phi x200.
    • All three require that you read the Time-Stamp Counter at the beginning and end of the measurement interval.
    • In addition to the TSC, you need to read any of these three registers at the beginning and end of the measurement interval:
      1. Fixed-function counter 2 (IA32_FIXED_CTR2, MSR 30Bh).
      2. One of the programmable counters measuring the CPU_CLK_UNHALTED.REF (Event 3Ch, Umask 01h).
      3. The IA32_MPERF MSR (E7h).
    • For the first two options, you can get the overall utilization (any thread active) by setting the AnyThread bit in the appropriate counter register -- then you only need to read the TSC and performance counter register using one thread on any physical core.
      1. For Fixed-function counter 2, you need to set bit 10 of IA32_FIXED_CTR_CTRL (38Dh) (in addition to bits 8-9).  Bit 34 of IA32_PERF_GLOBAL_CTRL (38Fh) must also be set.
      2. For a programmable counter, you need to set bit 21 of the IA32_PERFEVTSEL register you are using (either 186h or 187h).
    • For the third option (MPERF), the only option is thread scope, so you would need to read the MPERF register separately for each logical processor in the system.
      • You still only need to read the TSC once per physical core.
      • Unfortunately, the APERF and MPERF registers appear to be broken on KNL.
        • On every other processor I have tested, MPERF increments at the same rate as TSC on a 100% busy core.
        • On Xeon Phi x200, both MPERF and APERF increment at slightly less than 1/1000th of the rate that I expect.
          • All of the tests I have done show consistent results -- the counting appears correct, but with the wrong scaling factor.
          • The ratio of delta(APERF)/delta(MPERF) matches the observed Turbo boost.
          • Since the scaling factor is not documented, you can't rely on a computation that involves subtracting a scaled MPERF value from the delta TSC value.
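
As a rough illustration of the bit settings listed above, the control-register values could be assembled as follows. The enable and AnyThread bit positions are the ones given in the list; the USR (bit 16), OS (bit 17) and EN (bit 22) fields of IA32_PERFEVTSEL follow the general layout in the SDM. The helper names are made up for this sketch, and it only computes values: actually writing them to the MSRs needs root access and is left out.

```python
# Sketch: build the MSR values described in the list above.
# Helper names are hypothetical; nothing here touches hardware.

IA32_PERFEVTSEL0 = 0x186       # first programmable counter's event select
IA32_FIXED_CTR_CTRL = 0x38D
IA32_PERF_GLOBAL_CTRL = 0x38F


def fixed_ctr2_ctrl_bits(any_thread: bool = True) -> int:
    """Bits to OR into IA32_FIXED_CTR_CTRL for fixed-function counter 2."""
    bits = (1 << 8) | (1 << 9)      # bits 8-9: count in OS and user mode
    if any_thread:
        bits |= 1 << 10             # bit 10: AnyThread for fixed counter 2
    return bits


def global_ctrl_fixed2_bit() -> int:
    """Bit 34 of IA32_PERF_GLOBAL_CTRL enables fixed-function counter 2."""
    return 1 << 34


def perfevtsel_ref_cycles(any_thread: bool = True) -> int:
    """IA32_PERFEVTSELx value for CPU_CLK_UNHALTED.REF (Event 3Ch, Umask 01h)."""
    value = 0x3C | (0x01 << 8)      # event select in bits 7:0, umask in bits 15:8
    value |= (1 << 16) | (1 << 17)  # USR and OS
    value |= 1 << 22                # EN
    if any_thread:
        value |= 1 << 21            # AnyThread
    return value
```

With AnyThread set, these come out as 0x700 for the fixed-counter control bits, 0x400000000 for the global enable bit, and 0x63013c for the event select.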
"Dr. Bandwidth"


Hi John,

Thanks for the detailed information. If not per core, I guess per-package power is possible?

  • Below is the log I get after running turbostat. From it, PkgWatt and RAMWatt appear to be values that turbostat reads from specific MSRs. Is my understanding correct?
  • Are you aware of any turbostat user guide that explains what each column header means? I can guess that Bzy_MHz is the frequency, but I am not sure what CPU%c1 stands for.

Thanks.

  CPU Avg_MHz   %Busy Bzy_MHz TSC_MHz     SMI  CPU%c1  CPU%c6 CoreTmp  PkgTmp Pkg%pc3 Pkg%pc6 PkgWatt RAMWatt   PKG_%   RAM_%
   -       4    0.27    1286    1324       0    0.24   99.49      24      31    0.39   77.82   49.36    2.47    0.00    0.00
   0      12    0.89    1289    1358       0    0.25   98.86      24      31    0.38   75.88   49.36    2.47    0.00    0.00
  64       2    0.15    1240    1358       0    0.99
 128       3    0.25    1273    1358       0    0.89
 192       3    0.26    1274    1359       0    0.89
   1       2    0.19    1210    1359       0    0.21   99.60      24
  65       2    0.15    1221    1358       0    0.24
 129       3    0.26    1254    1358       0    0.14
 193       4    0.27    1252    1358       0    0.12
   2       2    0.19    1239    1358       0    1.04   98.77      21
  66      13    0.94    1289    1358       0    0.29
 130       2    0.17    1252    1359       0    1.07
 194       5    0.34    1276    1359       0    0.90
   3       3    0.24    1220    1357       0    0.25   99.51      21
  67       2    0.17    1214    1357       0    0.32
 131       4    0.30    1253    1357       0    0.19
 195       4    0.29    1256    1357       0    0.19
   4       4    0.27    1246    1357       0    0.14   99.59      21
  68       2    0.15    1244    1357       0    0.26
 132       2    0.17    1261    1357       0    0.25
 196       3    0.24    1274    1357       0    0.17
   5       4    0.27    1254    1357       0    0.16   99.57      20
  69       2    0.16    1228    1357       0    0.27
 133       2    0.16    1239    1357       0    0.27
 197       4    0.27    1262    1357       0    0.17
   6       3    0.24    1259    1357       0    0.29   99.47      22
  70       2    0.16    1242    1357       0    0.37
 134       3    0.24    1269    1356       0    0.18
 198       3    0.25    1270    1356       0    0.17
   7       4    0.28    1251    1356       0    0.19   99.52      22
  71       2    0.18    1231    1356       0    0.30
 135       4    0.31    1267    1356       0    0.16
 199       4    0.33    1267    1356       0    0.14
   8       3    0.26    1268    1354       0    0.23   99.50      22
  72       3    0.26    1268    1354       0    0.24
 136       3    0.25    1276    1352       0    0.13
 200       3    0.25    1276    1352       0    0.14
   9       2    0.16    1223    1352       0    0.30   99.53      22
  73       3    0.22    1254    1352       0    0.24
 137       3    0.26    1264    1352       0    0.21
 201       3    0.23    1259    1351       0    0.13
  10       2    0.16    1245    1351       0    0.27   99.57      23
  74       2    0.14    1247    1351       0    0.29
 138       3    0.25    1276    1351       0    0.18
 202       2    0.16    1258    1351       0    0.27
  11       5    0.39    1267    1349       0    0.13   99.48      23
  75       4    0.28    1266    1349       0    0.24
 139       5    0.36    1272    1349       0    0.16
 203       2    0.15    1240    1349       0    0.37
  12       3    0.23    1262    1346       0    0.36   99.41      19
  76       3    0.25    1272    1345       0    0.26
 140       2    0.15    1258    1343       0    0.24
 204       3    0.21    1270    1343       0    0.18
  13       2    0.17    1220    1343       0    0.32   99.51      19
  77       4    0.28    1264    1343       0    0.21
 141       2    0.17    1247    1343       0    0.32
 205       3    0.24    1258    1342       0    0.13
  14       2    0.15    1243    1342       0    0.63   99.22      22
  78       2    0.14    1249    1340       0    0.55
 142       4    0.34    1281    1340       0    0.35
 206       3    0.25    1274    1339       0    0.36
  15       3    0.25    1250    1338       0    0.29   99.46      22
  79       4    0.28    1262    1337       0    0.20
 143       4    0.31    1266    1337       0    0.17
 207       4    0.31    1271    1337       0    0.17
  16       2    0.14    1241    1337       0    0.30   99.55      22
  80       3    0.24    1273    1337       0    0.21
 144       3    0.14    2417    1337       0    0.31
 208       3    0.22    1271    1336       0    0.11
  17       4    0.29    1258    1336       0    0.21   99.49      22
  81       3    0.21    1255    1336       0    0.30
 145       4    0.28    1271    1335       0    0.15
 209       4    0.28    1262    1335       0    0.15
  18       3    0.24    1264    1335       0    0.36   99.40      21
  82       3    0.26    1276    1334       0    0.26
 146       2    0.15    1259    1334       0    0.37
 210       3    0.23    1273    1332       0    0.17
  19       2    0.19    1234    1332       0    0.32   99.49      22
  83       3    0.24    1256    1332       0    0.27
 147       4    0.19    2005    1332       0    0.32
 211       3    0.24    1261    1331       0    0.16
  20       4    0.33    1264    1331       0    0.18   99.49      21
  84       5    0.36    1277    1331       0    0.14
 148       4    0.34    1274    1331       0    0.17
 212       5    0.17    2891    1331       0    0.34
  21       3    0.22    1216    1328       0    0.19   99.59      21
  85       3    0.26    1242    1328       0    0.15
 149       3    0.24    1241    1328       0    0.17
 213       3    0.21    1191    1328       0    0.20
  22       2    0.17    1247    1327       0    0.23   99.61      21
  86       3    0.26    1274    1327       0    0.13
 150       2    0.14    1249    1327       0    0.25
 214       3    0.26    1275    1327       0    0.14
  23       2    0.16    1224    1327       0    0.27   99.57      21
  87       3    0.27    1266    1327       0    0.16
 151       4    0.32    1269    1327       0    0.11
 215       4    0.31    1271    1327       0    0.12
  24       3    0.16    1992    1327       0    0.30   99.54      22
  88       2    0.14    1256    1326       0    0.23
 152       3    0.26    1275    1325       0    0.10
 216       3    0.25    1274    1326       0    0.11
  25       2    0.15    1215    1326       0    0.23   99.62      22
  89       2    0.14    1231    1326       0    0.24
 153       3    0.26    1262    1326       0    0.12
 217       3    0.26    1261    1326       0    0.12
  26       3    0.15    2002    1326       0    0.48   99.36      21
  90       2    0.14    1255    1324       0    0.41
 154       3    0.21    1269    1323       0    0.24
 218       3    0.20    1266    1322       0    0.17
  27       4    0.32    1260    1322       0    0.18   99.49      21
  91       2    0.14    1267    1322       0    0.36
 155       4    0.31    1269    1322       0    0.19
 219       4    0.32    1271    1322       0    0.18
  28       6    0.46    1255    1322       0    0.25   99.29      21
  92       5    0.37    1278    1322       0    0.34
 156       5    0.38    1280    1322       0    0.34
 220       5    0.37    1277    1322       0    0.35
  29       3    0.21    1226    1322       0    0.26   99.53      21
  93       2    0.17    1235    1322       0    0.30
 157       4    0.28    1260    1322       0    0.19
 221       3    0.27    1256    1322       0    0.20
  30       2    0.16    1247    1322       0    0.36   99.48      23
  94       5    0.35    1281    1322       0    0.17
 158       4    0.34    1282    1322       0    0.18
 222       5    0.15    3099    1322       0    0.38
  31       2    0.18    1231    1320       0    0.22   99.59      22
  95       2    0.17    1246    1319       0    0.24
 159       4    0.27    1265    1319       0    0.13
 223       4    0.28    1266    1319       0    0.13
  32       2    0.18    1249    1319       0    0.35   99.48      22
  96       4    0.34    1281    1319       0    0.18
 160       4    0.34    1282    1319       0    0.18
 224       5    0.35    1281    1319       0    0.17
  33       5    0.40    1269    1319       0    0.13   99.47      21
  97       5    0.39    1275    1319       0    0.14
 161       5    0.39    1275    1319       0    0.14
 225       5    0.39    1274    1319       0    0.14
  34       2    0.17    1249    1320       0    0.33   99.50      21
  98       5    0.36    1284    1320       0    0.14
 162       5    0.36    1281    1320       0    0.15
 226       5    0.36    1283    1320       0    0.15
  35       5    0.38    1268    1320       0    0.11   99.51      21
  99       2    0.17    1241    1320       0    0.32
 163       5    0.37    1275    1320       0    0.13
 227       5    0.37    1273    1320       0    0.13
  36       5    0.42    1280    1317       0    0.14   99.44      21
 100       5    0.39    1284    1317       0    0.17
 164       5    0.39    1285    1317       0    0.17
 228       5    0.38    1284    1317       0    0.18
  37       4    0.30    1256    1317       0    0.19   99.50      21
 101       2    0.16    1237    1316       0    0.25
 165       4    0.28    1265    1316       0    0.13
 229       4    0.28    1265    1316       0    0.13
  38       5    0.41    1279    1316       0    0.14   99.44      21
 102       5    0.39    1285    1316       0    0.16
 166       5    0.38    1284    1316       0    0.17
 230       5    0.40    1263    1316       0    0.16
  39       3    0.27    1254    1316       0    0.35   99.39      21
 103       2    0.16    1239    1315       0    0.37
 167       2    0.17    1246    1315       0    0.35
 231       4    0.29    1266    1313       0    0.15
  40       2    0.19    1253    1312       0    0.32   99.49      21
 104       5    0.35    1283    1312       0    0.15
 168       5    0.36    1282    1312       0    0.15
 232       5    0.35    1281    1312       0    0.16
  41       2    0.19    1237    1312       0    0.24   99.57      21
 105       2    0.17    1246    1312       0    0.26
 169       4    0.28    1262    1312       0    0.15
 233       3    0.27    1267    1312       0    0.16
  42       2    0.17    1244    1311       0    0.30   99.53      23
 106       2    0.14    1308    1311       0    0.33
 170       3    0.26    1273    1311       0    0.20
 234       3    0.27    1274    1311       0    0.19
  43       5    0.38    1269    1311       0    0.13   99.49      23
 107       2    0.15    1240    1311       0    0.35
 171       5    0.37    1275    1311       0    0.13
 235       5    0.37    1277    1311       0    0.14
  44       2    0.19    1255    1311       0    0.34   99.47      21
 108       5    0.35    1282    1311       0    0.15
 172       5    0.36    1281    1311       0    0.15
 236       5    0.35    1283    1311       0    0.16
  45       4    0.28    1257    1311       0    0.21   99.51      21
 109       2    0.17    1242    1310       0    0.23
 173       4    0.28    1268    1310       0    0.12
 237       4    0.29    1268    1310       0    0.11
  46       2    0.17    1244    1310       0    0.33   99.51      21
 110       5    0.35    1281    1310       0    0.14
 174       4    0.35    1280    1310       0    0.15
 238       4    0.34    1282    1310       0    0.15
  47       2    0.17    1227    1310       0    0.28   99.55      21
 111       2    0.16    1242    1310       0    0.28
 175       4    0.28    1267    1310       0    0.16
 239       3    0.26    1266    1310       0    0.18
  48       3    0.23    1262    1310       0    0.33   99.44      23
 112       4    0.34    1278    1310       0    0.22
 176       4    0.33    1281    1310       0    0.23
 240       4    0.34    1281    1310       0    0.22
  49       5    0.36    1263    1307       0    0.13   99.51      23
 113       2    0.16    1238    1307       0    0.33
 177       5    0.35    1272    1307       0    0.14
 241       4    0.35    1272    1307       0    0.14
  50       4    0.29    1259    1307       0    0.33   99.38      20
 114       5    0.36    1278    1307       0    0.25
 178       5    0.36    1278    1307       0    0.25
 242       5    0.36    1275    1307       0    0.25
  51       5    0.40    1254    1307       0    0.13   99.47      20
 115       2    0.19    1225    1307       0    0.34
 179       5    0.39    1264    1307       0    0.14
 243       5    0.38    1264    1307       0    0.14
  52       2    0.20    1240    1307       0    0.54   99.26      24
 116       7    0.58    1285    1307       0    0.16
 180       5    0.37    1277    1307       0    0.37
 244       5    0.37    1278    1307       0    0.37
  53       3    0.21    1221    1307       0    0.31   99.49      24
 117       5    0.39    1270    1307       0    0.13
 181       5    0.39    1271    1307       0    0.12
 245       5    0.19    2625    1307       0    0.32
  54       2    0.19    1246    1304       0    0.27   99.54      21
 118       4    0.33    1276    1305       0    0.14
 182       4    0.29    1277    1305       0    0.17
 246       4    0.33    1277    1305       0    0.14
  55       3    0.26    1248    1305       0    0.28   99.46      22
 119       3    0.27    1259    1304       0    0.20
 183       4    0.33    1269    1304       0    0.13
 247       4    0.32    1267    1304       0    0.14
  56       3    0.23    1252    1304       0    0.32   99.45      24
 120       4    0.29    1273    1304       0    0.26
 184       4    0.28    1271    1304       0    0.27
 248       2    0.20    1260    1304       0    0.35
  57       2    0.20    1217    1304       0    0.27   99.53      24
 121       2    0.20    1236    1303       0    0.27
 185       4    0.31    1256    1303       0    0.16
 249       2    0.18    1230    1302       0    0.16
  58       2    0.17    1237    1302       0    0.24   99.59      22
 122       2    0.15    1250    1302       0    0.26
 186       3    0.26    1273    1302       0    0.14
 250       3    0.25    1274    1302       0    0.15
  59       4    0.32    1255    1302       0    0.17   99.52      22
 123       2    0.17    1241    1302       0    0.32
 187       4    0.33    1268    1302       0    0.16
 251       4    0.34    1270    1302       0    0.15
  60       3    0.21    1260    1302       0    0.31   99.48      22
 124       4    0.32    1276    1302       0    0.19
 188       4    0.29    1272    1302       0    0.23
 252       4    0.32    1273    1302       0    0.20
  61       2    0.18    1211    1302       0    0.25   99.57      22
 125       2    0.17    1231    1302       0    0.26
 189       4    0.28    1259    1302       0    0.15
 253       4    0.28    1254    1302       0    0.15
  62       3    0.21    1240    1302       0    0.31   99.47      23
 126       4    0.35    1275    1302       0    0.18
 190       4    0.35    1275    1302       0    0.18
 254       4    0.35    1274    1302       0    0.17
  63       3    0.26    1223    1302       0    0.33   99.41      23
 127       2    0.17    1226    1302       0    0.41
 191       4    0.31    1257    1302       0    0.27
 255       6    0.33    1856    1302       0    0.26
Chetan Arvind Patil

I have not looked at the source code for the "turbostat" utility, but most of the columns are straightforward.

The %Busy is usually computed by looking at the elapsed "Reference Cycles Not Halted" counter divided by the elapsed TSC cycles.

Adding in the "Actual Cycles Not Halted" counter enables the computation of the average frequency while not halted.
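
A minimal sketch of these two calculations, assuming the counters are read through the Linux `msr` driver (`/dev/cpu/N/msr`, which needs the `msr` module loaded and root access). The function names are invented for illustration, and the 48-bit default counter width is an assumption: the real width is reported by CPUID leaf 0AH, and the TSC itself is 64 bits wide.

```python
import os
import struct


def read_msr(cpu: int, reg: int) -> int:
    """Read one 64-bit MSR on a logical CPU via /dev/cpu/<cpu>/msr (root only)."""
    fd = os.open(f"/dev/cpu/{cpu}/msr", os.O_RDONLY)
    try:
        return struct.unpack("<Q", os.pread(fd, 8, reg))[0]
    finally:
        os.close(fd)


def counter_delta(start: int, end: int, width_bits: int = 48) -> int:
    """Difference between two counter reads, tolerating a single wraparound."""
    return (end - start) % (1 << width_bits)


def busy_fraction(d_ref_unhalted: int, d_tsc: int) -> float:
    """Fraction of the interval spent not halted: delta(ref cycles) / delta(TSC)."""
    if d_tsc <= 0:
        raise ValueError("TSC delta must be positive")
    return d_ref_unhalted / d_tsc


def busy_mhz(d_core_unhalted: int, d_ref_unhalted: int, tsc_mhz: float) -> float:
    """Average frequency while not halted, scaled from the TSC frequency."""
    if d_ref_unhalted == 0:
        return 0.0  # the core was halted for the whole interval
    return tsc_mhz * d_core_unhalted / d_ref_unhalted
```

For example, 250 unhalted reference cycles over 1000 TSC cycles gives a busy fraction of 0.25, which turbostat would print as 25.00 in the %Busy column.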

SMI is "System Management Interrupts".  This is the mechanism that a BIOS uses to monitor a processor, but many machines don't use it after the processor is booted.

CPU%c1 is the fraction of time that this core is in the C1 idle state.   There is some discussion of these states in Volume 3 of the Intel Architecture Software Developer's Manual (document 325384), particularly in Section 8.10 "Management of Block and Idle Conditions", and in Chapter 14 "Power and Thermal Management" (especially Section 14.6).

CPU%c6 is analogous to CPU%c1, but refers to a "deeper" idle state -- lower power consumption, but longer wake-up latency.

Pkg%pc3 and Pkg%pc6 report the amount of time in two of the "Package C-states".  These are reduced-power states that can only be entered when no cores are active.

PkgWatt and RAMWatt come from the RAPL system, described in Section 14.9 of Volume 3 of the Software Developer's Manual.

"Dr. Bandwidth"

Hi John,

Thank you.

Regarding RAPL: PAPI has a powercap feature, and a few papers have validated RAPL data against PAPI's powercap data. Do these two systems collect data from the same registers?

If yes, then what is the novelty of using PAPI's powercap, if RAPL can give the same data?

Thanks.

Chetan Arvind Patil

McCalpin, John wrote:

  • Per Core Utilization -- yes
    • It should be possible to measure this in three different ways, but the third one looks broken on Xeon Phi x200.
    • All three require that you read the Time-Stamp Counter at the beginning and end of the measurement interval.
    • In addition to the TSC, you need to read any of these three registers at the beginning and end of the measurement interval:
      1. Fixed-function counter 2 (IA32_FIXED_CTR2, MSR 30Bh).
      2. One of the programmable counters measuring the CPU_CLK_UNHALTED.REF (Event 3Ch, Umask 01h).
      3. The IA32_MPERF MSR (E7h).

After looking into the turbostat code [1], it seems that it uses different MSRs for utilization. The following is a snippet of the utilization code. I am also not sure why only the c1, c3, c6 and c7 C-states are present. I thought c0 is the default state, in which the core runs with the maximum resources possible. Any suggestions?

Can you also please point me to any other reference code that reads core utilization using MSRs?

[1] Turbostat source: https://github.com/torvalds/linux/blob/master/tools/power/x86/turbostat/turbostat.c

	if (DO_BIC(BIC_CPU_c1) && use_c1_residency_msr) {
		if (get_msr(cpu, MSR_CORE_C1_RES, &t->c1))
			return -6;
	}

	if (DO_BIC(BIC_CPU_c3) && !do_slm_cstates && !do_knl_cstates) {
		if (get_msr(cpu, MSR_CORE_C3_RESIDENCY, &c->c3))
			return -6;
	}

	if (DO_BIC(BIC_CPU_c6) && !do_knl_cstates) {
		if (get_msr(cpu, MSR_CORE_C6_RESIDENCY, &c->c6))
			return -7;
	} else if (do_knl_cstates) {
		if (get_msr(cpu, MSR_KNL_CORE_C6_RESIDENCY, &c->c6))
			return -7;
	}

	if (DO_BIC(BIC_CPU_c7))
		if (get_msr(cpu, MSR_CORE_C7_RESIDENCY, &c->c7))
			return -8;

 

Chetan Arvind Patil

The code quoted above is for splitting up the idle (non-C0) time into different categories.  The definition of "utilization" that I have been assuming is C0 (active) time / wall time -- without looking at the details of the various idle categories. 

I don't know if there is any guarantee that the MSR_CORE_C*_RESIDENCY registers capture *all* of the cycles that are not in C0, so it seems safer to measure C0 time by "reference cycles not halted" rather than by subtracting the sum of the other C-states from the elapsed time.
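
To make the difference concrete, here is a small sketch that computes the C0 fraction both ways and reports the gap between them. It assumes the residency counters tick in the same units as the TSC, which is an assumption to check for each MSR, and the function names are hypothetical.

```python
from typing import Iterable


def c0_fraction_direct(d_ref_unhalted: int, d_tsc: int) -> float:
    """C0 fraction from the 'reference cycles not halted' counter."""
    return d_ref_unhalted / d_tsc


def c0_fraction_by_subtraction(d_tsc: int, idle_residency_deltas: Iterable[int]) -> float:
    """C0 fraction inferred as 'everything not counted in an idle C-state'."""
    return (d_tsc - sum(idle_residency_deltas)) / d_tsc


def unaccounted_fraction(d_ref_unhalted: int, d_tsc: int,
                         idle_residency_deltas: Iterable[int]) -> float:
    """Share of the interval that neither the C0 nor the idle counters captured."""
    return (c0_fraction_by_subtraction(d_tsc, idle_residency_deltas)
            - c0_fraction_direct(d_ref_unhalted, d_tsc))
```

If the residency registers miss some idle cycles, the subtraction method reports them as active time, so a clearly positive gap suggests the direct measurement is the one to trust.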

The different C-states are primarily important if you are looking at power consumption on systems that are mostly idle, or if you are concerned about core "wake-up" times -- higher-numbered C-states use less power, but take longer to "wake up" (shift to C0/Active).

"Dr. Bandwidth"

Hi John,

McCalpin, John wrote:

The code quoted above is for splitting up the idle (non-C0) time into different categories.  The definition of "utilization" that I have been assuming is C0 (active) time / wall time -- without looking at the details of the various idle categories. 

Yes. This is also what I am looking for: C0, when the CPU is fully utilised. If I am correct, ARM architectures also use the same formula you described above when the ondemand governor is running. I am stuck on how to capture this C0-state utilization correctly.

Maybe this is where turbostat captures C0 as %Busy: https://github.com/torvalds/linux/blob/master/tools/power/x86/turbostat/turbostat.c#L834

if (DO_BIC(BIC_Busy))
		outp += sprintf(outp, "%s%.2f", (printed++ ? delim : ""), 100.0 * t->mperf/tsc);

Thanks.

Chetan Arvind Patil

The "mperf/tsc" term is the same as my "third option" in note #2 above.   This should be a correct measure of the fraction of the time that the core is active (C0), but the mperf values on my KNL processors are about 1000x smaller than I expect them to be, so I don't use this. 

More precisely, on my KNL processors, the ratio of APERF to MPERF appears correct for computing the average frequency, but MPERF increments at 1/1024th of the rate of the TSC and APERF increments at 1/1024th of the rate of the CPU_CYCLES_UNHALTED.CORE counter.

Given this discrepancy, computing "fraction of time active" as "delta mperf / delta tsc" will be completely wrong -- with a maximum value of 1/1024 if the processor is actually active 100% of the time.

 

"Dr. Bandwidth"

Hi John,

If the mperf values are smaller than expected, then I am not sure why the %Busy values in my analysis match what I expect for the type of workload I am running. I also validated this by varying the number of threads with the KMP_HW_SUBSET environment variable to change how the workload is mapped.

Thanks.

Chetan Arvind Patil

Turbostat includes a magic factor of 1024 for the MPERF and APERF counts in KNL.   Look for "get_aperf_mperf_multiplier" in the turbostat source code (line 3822).  That code returns "1" unless the processor is a KNL, in which case it returns 1024, and all MPERF and APERF deltas are multiplied by this value -- e.g., lines 1585-1586.

Intel's documentation makes it clear that MPERF increments at a rate *proportional to* the TSC, and that the constant of proportionality is neither guaranteed nor meaningful.  

I have no idea where the Linux folks found documentation of the factor of 1024 -- I have a lot of KNL documents and I don't see it anywhere obvious in these....
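
A small sketch of what that scaling does to the "delta mperf / delta tsc" calculation. The value 1024 is the one attributed to turbostat above, not a documented constant, and the function name is invented here.

```python
# Multiplier that turbostat is described as applying on KNL; undocumented,
# so treat it as an empirical value rather than an architectural guarantee.
KNL_APERF_MPERF_MULTIPLIER = 1024


def busy_from_mperf(d_mperf: int, d_tsc: int, multiplier: int = 1) -> float:
    """Fraction of time active, computed as multiplier * delta(MPERF) / delta(TSC)."""
    if d_tsc <= 0:
        raise ValueError("TSC delta must be positive")
    return multiplier * d_mperf / d_tsc
```

On a fully busy KNL core where MPERF advances by 1000 while the TSC advances by 1,024,000, the unscaled result is 1/1024, while passing `KNL_APERF_MPERF_MULTIPLIER` gives 1.0.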

"Dr. Bandwidth"

Hi John,

McCalpin, John wrote:

I have no idea where the Linux folks found documentation of the factor of 1024 -- I have a lot of KNL documents and I don't see it anywhere obvious in these....

That's because the code is written and maintained by an Intel engineer, who can surely get more accurate architecture details internally than the public KNL documents provide. Hence, I am considering this code a reliable reference.

Thanks.

Chetan Arvind Patil

Hi John,

Can I not log per-core CPU utilization with perf, similar to how the EDC and MC counters are logged, as I shared in other threads on this forum?

I will log basic counters and do post-processing to get per-core utilization. In Volume 1 of Intel's PMU documentation for Xeon Phi, I haven't found counters related to MPERF and APERF yet.

Thanks.

Chetan Arvind Patil

APERF and MPERF are not considered "performance monitoring" facilities, so they are not documented in the PMU guide for Xeon Phi.   They are documented along with the other MSRs in Volume 4 of the Intel Architectures SW Developer's Manual (document 335592).  The entry for Xeon Phi makes no comment about the unusual scaling factor, but if you only compute ratios of APERF to MPERF that factor disappears.
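
A quick sketch of why the factor drops out of the ratio. It assumes `base_mhz` is the nominal (TSC) frequency that MPERF tracks, and the function name is made up for this example.

```python
def avg_active_mhz(d_aperf: int, d_mperf: int, base_mhz: float) -> float:
    """Average frequency while active: base frequency * delta(APERF) / delta(MPERF)."""
    if d_mperf == 0:
        raise ValueError("MPERF did not advance; the core was idle for the interval")
    return base_mhz * d_aperf / d_mperf
```

Because both deltas carry the same scaling, `avg_active_mhz(1400, 1300, 1300.0)` and `avg_active_mhz(1400 * 1024, 1300 * 1024, 1300.0)` give the same answer.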

"Dr. Bandwidth"

I wanted to check a couple of related questions.

On my Xeon Phi 7210, I've looked at the values in

/sys/class/powercap/intel-rapl/intel-rapl:0/energy_uj

/sys/class/powercap/intel-rapl/intel-rapl:0/intel-rapl:0:0/energy_uj

I noted two things:

(a) Both files have monotonically increasing values (until they reach the integer maximum and reset) IRRESPECTIVE of workload. This is not what I would expect.

(b) I don't know what they are measuring. I had thought socket and maybe "just the cores but nothing more", but given the integer wraparound it's not clear whether either is higher than the other.

 

Advice welcome.

I do not have root/sudo access, so I cannot try other metrics.

 

yours, michael

 


example output of

#!/bin/bash

# Report the CPU model(s) on this node.
echo 'node info'
grep 'model name' /proc/cpuinfo | sort | uniq -c

POWERCAPDIR=/sys/class/powercap
FILE=energy_uj

# Package-level RAPL zone (intel-rapl:0) and its subzone intel-rapl:0:0.
SOCKET="${POWERCAPDIR}/intel-rapl/intel-rapl:0/${FILE}"
SUBSOCK="${POWERCAPDIR}/intel-rapl/intel-rapl:0/intel-rapl:0:0/${FILE}"

ls -l "${SOCKET}"; cat "${SOCKET}"
ls -l "${SUBSOCK}"; cat "${SUBSOCK}"

# Print a timestamp and both energy counters once per second.
SEC=10
echo "reporting energy consumed to date for next $SEC seconds"
for k in $(seq 1 "$SEC"); do
  D=$(date +%s.%N)
  SOCK=$(cat "$SOCKET")
  SUB=$(cat "$SUBSOCK")
  echo "$D $SOCK $SUB"
  sleep 1
done

is:

node info
    256 model name      : Intel(R) Xeon Phi(TM) CPU 7210 @ 1.30GHz
-rw-r--r-- 1 root root 4096 Jul 24 14:56 /sys/class/powercap/intel-rapl/intel-rapl:0/energy_uj
98731464531
-rw-r--r-- 1 root root 4096 Jul 24 14:56 /sys/class/powercap/intel-rapl/intel-rapl:0/intel-rapl:0:0/energy_uj
35078540020
reporting energy consumed to date for next 10 seconds
1532440709.313969521 98733719409 35078650562
1532440710.333325570 98815099449 35083995372
1532440711.352328266 98896390074 35089267064
1532440712.370166839 98977634250 35094539153
1532440713.388462036 99058875070 35099809345
1532440714.406850755 99140118331 35105081205
1532440715.425177592 99221438619 35110366468
1532440716.443695284 99302658382 35115625659
1532440717.462236434 99383915925 35120916139
1532440718.480648183 99465147163 35126175973
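
Since `energy_uj` is a cumulative energy counter in microjoules, steady growth is expected even on an idle node; the workload shows up in how fast the value grows. A minimal sketch of turning two samples into average power follows. The function name is invented, and the wraparound handling assumes the range is taken from the zone's `max_energy_range_uj` file.

```python
from typing import Optional


def average_power_watts(t0: float, e0_uj: int, t1: float, e1_uj: int,
                        max_range_uj: Optional[int] = None) -> float:
    """Average power between two (timestamp in s, energy_uj) samples."""
    if t1 <= t0:
        raise ValueError("samples must be in increasing time order")
    d_e = e1_uj - e0_uj
    if d_e < 0:
        # The counter wrapped once; add back its range if we know it.
        if max_range_uj is None:
            raise ValueError("counter wrapped and max_range_uj was not given")
        d_e += max_range_uj
    return (d_e / 1e6) / (t1 - t0)  # microjoules -> joules, then J/s = W


# First two samples from the output above: (time, intel-rapl:0, intel-rapl:0:0).
first = (1532440709.313969521, 98733719409, 35078650562)
second = (1532440710.333325570, 98815099449, 35083995372)

pkg_watts = average_power_watts(first[0], first[1], second[0], second[1])
sub_watts = average_power_watts(first[0], first[2], second[0], second[2])
```

For these two samples the result is roughly 79.8 W for `intel-rapl:0` and roughly 5.2 W for `intel-rapl:0:0`. The `name` file in each zone directory should say which domain the zone covers, and both files are readable without root in the listing above.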

 
