Intel Performance Counter Monitor - Can't access PCI configuration space

Prabhu_T_
Beginner

Hi,

I'm trying to run the latest PCM 2.5 on a machine with an Intel(R) Xeon(R) CPU E5-2670 (Sandy Bridge), but PCM prints the message below and does not show any memory read/write traffic in the report.

Can not access SNB-EP (Jaketown) PCI configuration space. Access to uncore counters (memory and QPI bandwidth) is disabled.
You must be root to access these SNB-EP counters in PCM

Can you please tell me what I am missing here?

I'm running PCM as root. I also looked through the Dell BIOS options for anything related to performance counters or PCI, but could not find any setting to enable or disable.

Thanks, Prabhu
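
The message names two preconditions, root privileges and access to the PCI configuration space. A rough way to check both from a shell is sketched below. This is a sketch under assumptions: the helper names is_root and count_pci_configs are invented for this example, the default directory is the standard Linux sysfs PCI tree, and PCM itself may reach the configuration space through a different interface, so a clean result here does not prove that PCM can see the uncore devices.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; they are not part of PCM.

# Succeeds (exit status 0) only when the effective user ID is 0.
is_root() {
    [ "$(id -u)" -eq 0 ]
}

# Print "<readable> <total>" for the files matching <dir>/<device>/config.
# <dir> defaults to the Linux sysfs PCI device tree (an assumption about
# the platform, not something stated in the thread).
count_pci_configs() {
    dir=${1:-/sys/bus/pci/devices}
    readable=0
    total=0
    for cfg in "$dir"/*/config; do
        # An unmatched glob stays literal, so skip paths that do not exist.
        [ -e "$cfg" ] || continue
        total=$((total + 1))
        if [ -r "$cfg" ]; then
            readable=$((readable + 1))
        fi
    done
    echo "$readable $total"
}
```

Running "is_root && count_pci_configs" as the same user that starts pcm.x prints the two counts only when that user is root. If the readable count is lower than the total, something other than the user ID is restricting access.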

Roman_D_Intel
Employee

Prabhu,

thanks for your cooperation and the data. I have prepared a new patch (debug version) that you can apply on top of the old one. Could you please do me a favor and attach this debug output to your reply? This will help us to debug the problem better.

[bash]

patch < patch2.txt
make clean
make
./pcm.x "sleep 1"

[/bash]
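
When collecting the output for attaching, it can help to capture both output streams, since the thread does not say whether the patch writes its debug messages to stdout or stderr. The sketch below does that with a small wrapper. The function name capture_all and the log file name in the usage note are assumptions made for this example.

```shell
#!/bin/sh
# Hypothetical helper for illustration; it is not part of PCM.

# Run a command and write its stdout and stderr to one log file.
# Usage: capture_all <logfile> <command> [args...]
# The exit status is that of the command itself.
capture_all() {
    log=$1
    shift
    "$@" > "$log" 2>&1
}
```

For example, capture_all pcm_debug.log ./pcm.x "sleep 1" would leave everything PCM printed in pcm_debug.log, ready to attach.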

Thanks,

Roman

Prabhu_T_
Beginner

Roman,

Please find attached the PCM output after applying the new patch (patch2). I could not find any of your debug info in the output. I looked at cpucounters.cpp and saw the debug-related code added in the function initSocket2Bus. Is !socket2bus.empty() at line 2630 always returning true, so that the code below it is never exercised?

======

~/PCM2.5 # ./pcm.x "sleep 1"

Intel(r) Performance Counter Monitor V2.5 (2013-06-04 11:44:11 +0200 ID=23faaad)

Copyright (c) 2009-2012 Intel Corporation

Num logical cores: 16
Num sockets: 1
Threads per core: 2
Core PMU (perfmon) version: 3
Number of core PMU generic (programmable) counters: 4
Width of generic (programmable) counters: 48 bits
Number of core PMU fixed counters: 3
Width of fixed counters: 48 bits
Nominal core frequency: 2600000000 Hz
Package thermal spec power: 115 Watt; Package minimum power: 51 Watt; Package maximum power: 180 Watt;
Can not access SNB-EP (Jaketown) PCI configuration space. Access to uncore counters (memory and QPI bandwidth) is disabled.
You must be root to access these SNB-EP counters in PCM.
Using PCM on your system might have a performance impact as per http://software.intel.com/en-us/articles/performance-impact-when-sampling-certain-llc-events-on-snb-ep-with-vtune
You can avoid the performance impact by using the option --noJKTWA, however the cache metrics might be wrong then.

Detected Intel(R) Xeon(R) CPU E5-2670 0 @ 2.60GHz "Intel(r) microarchitecture codename Sandy Bridge-EP/Jaketown"

Executing "sleep 1" command:

Exit code: 0


EXEC : instructions per nominal CPU cycle
IPC : instructions per CPU cycle
FREQ : relation to nominal CPU frequency='unhalted clock ticks'/'invariant timer ticks' (includes Intel Turbo Boost)
AFREQ : relation to nominal CPU frequency while in active state (not in power-saving C state)='unhalted clock ticks'/'invariant timer ticks while in C0-state' (includes Intel Turbo Boost)
L3MISS: L3 cache misses
L2MISS: L2 cache misses (including other core's L2 cache *hits*)
L3HIT : L3 cache hit ratio (0.00-1.00)
L2HIT : L2 cache hit ratio (0.00-1.00)
L3CLK : ratio of CPU cycles lost due to L3 cache misses (0.00-1.00), in some cases could be >1.0 due to a higher memory latency
L2CLK : ratio of CPU cycles lost due to missing L2 cache but still hitting L3 cache (0.00-1.00)
READ : bytes read from memory controller (in GBytes)
WRITE : bytes written to memory controller (in GBytes)
TEMP : Temperature reading in 1 degree Celsius relative to the TjMax temperature (thermal headroom): 0 corresponds to the max temperature


 Core (SKT) | EXEC | IPC  | FREQ  | AFREQ | L3MISS | L2MISS | L3HIT | L2HIT | L3CLK | L2CLK  | READ  | WRITE | TEMP

   0    0      0.00   0.28    0.00    1.15        2     8001    1.00    0.30    0.00     0.08     N/A     N/A    58
   1    0      0.00   0.46    0.00    1.15        4     7871    1.00    0.19    0.00     0.18     N/A     N/A    59
   2    0      0.00   0.48    0.00    1.15        1     8317    1.00    0.16    0.00     0.19     N/A     N/A    66
   3    0      0.00   0.48    0.00    1.15       11     8962    1.00    0.17    0.00     0.19     N/A     N/A    57
   4    0      0.00   0.44    0.00    1.15        2     8309    1.00    0.18    0.00     0.17     N/A     N/A    59
   5    0      0.00   0.43    0.00    1.15       15     8041    1.00    0.17    0.00     0.16     N/A     N/A    56
   6    0      0.00   0.43    0.00    1.15       37     9184    1.00    0.21    0.00     0.16     N/A     N/A    57
   7    0      0.00   0.56    0.00    1.15     1568     19 K    0.92    0.40    0.05     0.15     N/A     N/A    54
   8    0      0.00   0.28    0.00    1.15        4     1317    1.00    0.60    0.00     0.03     N/A     N/A    58
   9    0      0.00   0.40    0.00    1.15        1      897    1.00    0.67    0.00     0.03     N/A     N/A    59
  10    0      0.00   0.46    0.00    1.15        4     3019    1.00    0.41    0.00     0.08     N/A     N/A    66
  11    0      0.00   0.40    0.00    1.15        7     1288    0.99    0.59    0.00     0.05     N/A     N/A    57
  12    0      0.00   0.39    0.00    1.15        2     1822    1.00    0.52    0.00     0.06     N/A     N/A    59
  13    0      0.00   0.33    0.00    1.15        0     1207    1.00    0.54    0.00     0.03     N/A     N/A    56
  14    0      0.00   0.31    0.00    1.15        0     2155    1.00    0.43    0.00     0.05     N/A     N/A    57
  15    0      0.00   0.52    0.00    1.15      535     5644    0.91    0.47    0.04     0.09     N/A     N/A    54
-------------------------------------------------------------------------------------------------------------------
 SKT    0      0.00   0.43    0.00    1.15     2193     95 K    0.98    0.32    0.01     0.11    0.00    0.00    54
-------------------------------------------------------------------------------------------------------------------
TOTAL   *      0.00   0.43    0.00    1.15     2193     95 K    0.98    0.32    0.01     0.11    0.00    0.00   N/A

Instructions retired: 14 M ; Active cycles: 33 M ; Time (TSC): 2613 Mticks ; C0 (active,non-halted) core residency: 0.07 %

C1 core residency: 99.93 %; C3 core residency: 0.00 %; C6 core residency: 0.00 %; C7 core residency: 0.00 %
C2 package residency: 0.00 %; C3 package residency: 0.00 %; C6 package residency: 0.00 %; C7 package residency: 0.00 %

PHYSICAL CORE IPC : 0.85 => corresponds to 21.29 % utilization for cores in active state
Instructions per nominal CPU cycle: 0.00 => corresponds to 0.02 % core utilization over time interval
----------------------------------------------------------------------------------------------

----------------------------------------------------------------------------------------------
SKT 0 package consumed 45.71 Joules
----------------------------------------------------------------------------------------------
TOTAL: 45.71 Joules

----------------------------------------------------------------------------------------------
SKT 0 DIMMs consumed 9.78 Joules
----------------------------------------------------------------------------------------------
TOTAL: 9.78 Joules
Cleaning up
Using PCM on your system might have a performance impact as per http://software.intel.com/en-us/articles/performance-impact-when-sampling-certain-llc-events-on-snb-ep-with-vtune
You can avoid the performance impact by using the option --noJKTWA, however the cache metrics might be wrong then.

====

Thanks, Prabhu

Roman_D_Intel
Employee

Prabhu,

initSocket2Bus() is called from the JKT_Uncore_Pci::JKT_Uncore_Pci constructor, which throws an exception on error and prints debug output before throwing. I don't see that output either.

Is there a chance that you did not recompile the pcm.x binary after patching? Please run:

[bash]

make clean
make

[/bash]
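
A quick way to rule out a stale binary is to compare modification times: after a successful rebuild, pcm.x should be newer than the patched source file. The sketch below wraps that comparison. The function name is_newer_than is an assumption made for this example, and the -nt test operator is a widely supported shell extension (bash, dash, ksh) rather than something every shell guarantees.

```shell
#!/bin/sh
# Hypothetical helper for illustration; it is not part of PCM.

# Succeed when <binary> exists and was modified more recently than <source>.
# Usage: is_newer_than <binary> <source>
is_newer_than() {
    [ -e "$1" ] && [ "$1" -nt "$2" ]
}
```

For example, is_newer_than pcm.x cpucounters.cpp || echo "pcm.x is older than the patched source" flags a binary that was not rebuilt after the patch was applied.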

Thanks,

Roman

Roman_D_Intel
Employee

Also, out of curiosity, what type of Dell system do you have? Is it a "DELL WORKSTATION T7600"?

Prabhu_T_
Beginner

Roman Dementiev (Intel) wrote:

Prabhu,

initSocket2Bus() is called from the JKT_Uncore_Pci::JKT_Uncore_Pci constructor, which throws an exception on error and prints debug output before throwing. I don't see that output either.

Is there a chance that you did not recompile the pcm.x binary after patching? Please run:

make clean
make

Thanks,

Roman

I have recompiled pcm.x; the output I sent earlier is from the new pcm.x built with the debug patch.

Thanks, Prabhu

Prabhu_T_
Beginner

Roman Dementiev (Intel) wrote:

Also, out of curiosity, what type of Dell system do you have? Is it a "DELL WORKSTATION T7600"?

The machine is a Dell - External OEMR XL R720

Thanks, Prabhu

Roman_D_Intel
Employee

Prabhu,

You wrote: "I saw the debug related code added in function initSocket2Bus. Is !socket2bus.empty() at line 2630 returning always true? And the code below is not exercised?"

Since you are running a single socket, you can remove the line "if(!socket2bus.empty()) return;". However, I don't expect this to help.

Did you see any error messages when patching? Could you please attach your patched cpucounters.cpp to your reply for me to check?
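
Besides watching the terminal for messages, one way to check for a failed patch is to look for reject files, which patch normally writes next to the target when a hunk cannot be applied. The sketch below lists them. The function name find_rejects is an assumption made for this example. With GNU patch, running patch --dry-run < patch2.txt beforehand also reports problems without touching any file.

```shell
#!/bin/sh
# Hypothetical helper for illustration; it is not part of PCM.

# List reject files (*.rej) under <dir>, one path per line.
# <dir> defaults to the current directory. No output means none were found.
find_rejects() {
    find "${1:-.}" -name '*.rej' -type f
}
```

Running find_rejects in the PCM source directory should print nothing if every hunk of patch2.txt applied cleanly.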

Thanks,

Roman

Prabhu_T_
Beginner

Roman,

I did not see any errors when patching cpucounters.cpp with patch2.txt. Please find attached cpucounters.cpp with the patch applied.

Thanks,

Prabhu

Roman_D_Intel
Employee

Prabhu,

we have just released new Intel PCM 2.5.1 (www.intel.com/software/pcm). Could you please try it and attach its output to your reply?

Thanks,

Roman

Prabhu_T_
Beginner

Hi Roman,

The new Intel PCM 2.6 also does not show the memory bandwidth. It reports the error below. I'm running as root.

Can not access Jaketown/Ivytown PCI configuration space. Access to uncore counters (memory and QPI bandwidth) is disabled.
You must be root to access these Jaketown/Ivytown counters in PCM.

 Thanks, Prabhu

Roman_D_Intel
Employee

Prabhu,

Thank you for letting us know. Could you please post the full output from version 2.6?

Roman

Prabhu_T_
Beginner

Roman,

Please find the full output:

 

Intel(r) Performance Counter Monitor V2.6 (2013-11-04 13:43:31 +0100 ID=db05e43)

Copyright (c) 2009-2013 Intel Corporation

Number of physical cores: 8

Number of logical cores: 16

Threads (logical cores) per physical core: 2

Num sockets: 1

Core PMU (perfmon) version: 3

Number of core PMU generic (programmable) counters: 4

Width of generic (programmable) counters: 48 bits

Number of core PMU fixed counters: 3

Width of fixed counters: 48 bits

Nominal core frequency: 2600000000 Hz

Package thermal spec power: 115 Watt; Package minimum power: 51 Watt; Package maximum power: 180 Watt;

Can not access Jaketown/Ivytown PCI configuration space. Access to uncore counters (memory and QPI bandwidth) is disabled.

You must be root to access these Jaketown/Ivytown counters in PCM.

Using PCM on your system might have a performance impact as per http://software.intel.com/en-us/articles/performance-impact-when-sampling-certain-llc-events-on-snb-ep-with-vtune

You can avoid the performance impact by using the option --noJKTWA, however the cache metrics might be wrong then.

Detected Intel(R) Xeon(R) CPU E5-2670 0 @ 2.60GHz "Intel(r) microarchitecture codename Sandy Bridge-EP/Jaketown"

EXEC  : instructions per nominal CPU cycle

IPC   : instructions per CPU cycle

FREQ  : relation to nominal CPU frequency='unhalted clock ticks'/'invariant timer ticks' (includes Intel Turbo Boost)

AFREQ : relation to nominal CPU frequency while in active state (not in power-saving C state)='unhalted clock ticks'/'invariant timer ticks while in C0-state'  (includes Intel Turbo Boost)

L3MISS: L3 cache misses

L2MISS: L2 cache misses (including other core's L2 cache *hits*)

L3HIT : L3 cache hit ratio (0.00-1.00)

L2HIT : L2 cache hit ratio (0.00-1.00)

L3CLK : ratio of CPU cycles lost due to L3 cache misses (0.00-1.00), in some cases could be >1.0 due to a higher memory latency

L2CLK : ratio of CPU cycles lost due to missing L2 cache but still hitting L3 cache (0.00-1.00)

READ  : bytes read from memory controller (in GBytes)

WRITE : bytes written to memory controller (in GBytes)

TEMP  : Temperature reading in 1 degree Celsius relative to the TjMax temperature (thermal headroom): 0 corresponds to the max temperature

 

Core (SKT) | EXEC | IPC  | FREQ  | AFREQ | L3MISS | L2MISS | L3HIT | L2HIT | L3CLK | L2CLK  | READ  | WRITE | TEMP

-------------------------------------------------------------------------------------------------------------------

TOTAL  *     0.00   0.44   0.01    1.15      48 K    802 K    0.94    0.47    0.04    0.14    0.00    0.00     N/A

Instructions retired:  100 M ; Active cycles:  228 M ; Time (TSC): 2604 Mticks ; C0 (active,non-halted) core residency: 0.47 %

C1 core residency: 99.53 %; C3 core residency: 0.00 %; C6 core residency: 0.00 %; C7 core residency: 0.00 %;

C2 package residency: 0.00 %; C3 package residency: 0.00 %; C6 package residency: 0.00 %; C7 package residency: 0.00 %;

PHYSICAL CORE IPC                 : 0.88 => corresponds to 21.91 % utilization for cores in active state

Instructions per nominal CPU cycle: 0.00 => corresponds to 0.12 % core utilization over time interval

----------------------------------------------------------------------------------------------

EXEC  : instructions per nominal CPU cycle

IPC   : instructions per CPU cycle

FREQ  : relation to nominal CPU frequency='unhalted clock ticks'/'invariant timer ticks' (includes Intel Turbo Boost)

AFREQ : relation to nominal CPU frequency while in active state (not in power-saving C state)='unhalted clock ticks'/'invariant timer ticks while in C0-state'  (includes Intel Turbo Boost)

L3MISS: L3 cache misses

L2MISS: L2 cache misses (including other core's L2 cache *hits*)

L3HIT : L3 cache hit ratio (0.00-1.00)

L2HIT : L2 cache hit ratio (0.00-1.00)

L3CLK : ratio of CPU cycles lost due to L3 cache misses (0.00-1.00), in some cases could be >1.0 due to a higher memory latency

L2CLK : ratio of CPU cycles lost due to missing L2 cache but still hitting L3 cache (0.00-1.00)

READ  : bytes read from memory controller (in GBytes)

WRITE : bytes written to memory controller (in GBytes)

TEMP  : Temperature reading in 1 degree Celsius relative to the TjMax temperature (thermal headroom): 0 corresponds to the max temperature

 

Core (SKT) | EXEC | IPC  | FREQ  | AFREQ | L3MISS | L2MISS | L3HIT | L2HIT | L3CLK | L2CLK  | READ  | WRITE | TEMP

-------------------------------------------------------------------------------------------------------------------

TOTAL  *     0.00   0.48   0.01    1.15      71 K    997 K    0.93    0.51    0.04    0.12    0.00    0.00     N/A

Instructions retired:  144 M ; Active cycles:  302 M ; Time (TSC): 2610 Mticks ; C0 (active,non-halted) core residency: 0.63 %

C1 core residency: 99.37 %; C3 core residency: 0.00 %; C6 core residency: 0.00 %; C7 core residency: 0.00 %;

C2 package residency: 0.00 %; C3 package residency: 0.00 %; C6 package residency: 0.00 %; C7 package residency: 0.00 %;

PHYSICAL CORE IPC                 : 0.95 => corresponds to 23.81 % utilization for cores in active state

Instructions per nominal CPU cycle: 0.01 => corresponds to 0.17 % core utilization over time interval

----------------------------------------------------------------------------------------------

EXEC  : instructions per nominal CPU cycle

IPC   : instructions per CPU cycle

FREQ  : relation to nominal CPU frequency='unhalted clock ticks'/'invariant timer ticks' (includes Intel Turbo Boost)

AFREQ : relation to nominal CPU frequency while in active state (not in power-saving C state)='unhalted clock ticks'/'invariant timer ticks while in C0-state'  (includes Intel Turbo Boost)

L3MISS: L3 cache misses

L2MISS: L2 cache misses (including other core's L2 cache *hits*)

L3HIT : L3 cache hit ratio (0.00-1.00)

L2HIT : L2 cache hit ratio (0.00-1.00)

L3CLK : ratio of CPU cycles lost due to L3 cache misses (0.00-1.00), in some cases could be >1.0 due to a higher memory latency

L2CLK : ratio of CPU cycles lost due to missing L2 cache but still hitting L3 cache (0.00-1.00)

READ  : bytes read from memory controller (in GBytes)

WRITE : bytes written to memory controller (in GBytes)

TEMP  : Temperature reading in 1 degree Celsius relative to the TjMax temperature (thermal headroom): 0 corresponds to the max temperature

 

Core (SKT) | EXEC | IPC  | FREQ  | AFREQ | L3MISS | L2MISS | L3HIT | L2HIT | L3CLK | L2CLK  | READ  | WRITE | TEMP

-------------------------------------------------------------------------------------------------------------------

TOTAL  *     0.00   0.45   0.01    1.15      46 K    826 K    0.94    0.49    0.03    0.13    0.00    0.00     N/A

Instructions retired:  108 M ; Active cycles:  243 M ; Time (TSC): 2610 Mticks ; C0 (active,non-halted) core residency: 0.51 %

C1 core residency: 99.49 %; C3 core residency: 0.00 %; C6 core residency: 0.00 %; C7 core residency: 0.00 %;

C2 package residency: 0.00 %; C3 package residency: 0.00 %; C6 package residency: 0.00 %; C7 package residency: 0.00 %;

PHYSICAL CORE IPC                 : 0.89 => corresponds to 22.34 % utilization for cores in active state

Instructions per nominal CPU cycle: 0.01 => corresponds to 0.13 % core utilization over time interval

----------------------------------------------------------------------------------------------

EXEC  : instructions per nominal CPU cycle

IPC   : instructions per CPU cycle

FREQ  : relation to nominal CPU frequency='unhalted clock ticks'/'invariant timer ticks' (includes Intel Turbo Boost)

AFREQ : relation to nominal CPU frequency while in active state (not in power-saving C state)='unhalted clock ticks'/'invariant timer ticks while in C0-state'  (includes Intel Turbo Boost)

L3MISS: L3 cache misses

L2MISS: L2 cache misses (including other core's L2 cache *hits*)

L3HIT : L3 cache hit ratio (0.00-1.00)

L2HIT : L2 cache hit ratio (0.00-1.00)

L3CLK : ratio of CPU cycles lost due to L3 cache misses (0.00-1.00), in some cases could be >1.0 due to a higher memory latency

L2CLK : ratio of CPU cycles lost due to missing L2 cache but still hitting L3 cache (0.00-1.00)

READ  : bytes read from memory controller (in GBytes)

WRITE : bytes written to memory controller (in GBytes)

TEMP  : Temperature reading in 1 degree Celsius relative to the TjMax temperature (thermal headroom): 0 corresponds to the max temperature

 

Core (SKT) | EXEC | IPC  | FREQ  | AFREQ | L3MISS | L2MISS | L3HIT | L2HIT | L3CLK | L2CLK  | READ  | WRITE | TEMP

-------------------------------------------------------------------------------------------------------------------

TOTAL  *     0.00   0.46   0.01    1.15      54 K    885 K    0.94    0.49    0.04    0.13    0.00    0.00     N/A

Instructions retired:  117 M ; Active cycles:  257 M ; Time (TSC): 2610 Mticks ; C0 (active,non-halted) core residency: 0.53 %

C1 core residency: 99.47 %; C3 core residency: 0.00 %; C6 core residency: 0.00 %; C7 core residency: 0.00 %;

C2 package residency: 0.00 %; C3 package residency: 0.00 %; C6 package residency: 0.00 %; C7 package residency: 0.00 %;

PHYSICAL CORE IPC                 : 0.91 => corresponds to 22.81 % utilization for cores in active state

Instructions per nominal CPU cycle: 0.01 => corresponds to 0.14 % core utilization over time interval

----------------------------------------------------------------------------------------------

EXEC  : instructions per nominal CPU cycle

IPC   : instructions per CPU cycle

FREQ  : relation to nominal CPU frequency='unhalted clock ticks'/'invariant timer ticks' (includes Intel Turbo Boost)

AFREQ : relation to nominal CPU frequency while in active state (not in power-saving C state)='unhalted clock ticks'/'invariant timer ticks while in C0-state'  (includes Intel Turbo Boost)

L3MISS: L3 cache misses

L2MISS: L2 cache misses (including other core's L2 cache *hits*)

L3HIT : L3 cache hit ratio (0.00-1.00)

L2HIT : L2 cache hit ratio (0.00-1.00)

L3CLK : ratio of CPU cycles lost due to L3 cache misses (0.00-1.00), in some cases could be >1.0 due to a higher memory latency

L2CLK : ratio of CPU cycles lost due to missing L2 cache but still hitting L3 cache (0.00-1.00)

READ  : bytes read from memory controller (in GBytes)

WRITE : bytes written to memory controller (in GBytes)

TEMP  : Temperature reading in 1 degree Celsius relative to the TjMax temperature (thermal headroom): 0 corresponds to the max temperature

 

Core (SKT) | EXEC | IPC  | FREQ  | AFREQ | L3MISS | L2MISS | L3HIT | L2HIT | L3CLK | L2CLK  | READ  | WRITE | TEMP

-------------------------------------------------------------------------------------------------------------------

TOTAL  *     0.00   0.47   0.01    1.15      61 K    965 K    0.94    0.51    0.04    0.12    0.00    0.00     N/A

Instructions retired:  135 M ; Active cycles:  288 M ; Time (TSC): 2610 Mticks ; C0 (active,non-halted) core residency: 0.60 %

C1 core residency: 99.40 %; C3 core residency: 0.00 %; C6 core residency: 0.00 %; C7 core residency: 0.00 %;

C2 package residency: 0.00 %; C3 package residency: 0.00 %; C6 package residency: 0.00 %; C7 package residency: 0.00 %;

PHYSICAL CORE IPC                 : 0.94 => corresponds to 23.45 % utilization for cores in active state

Instructions per nominal CPU cycle: 0.01 => corresponds to 0.16 % core utilization over time interval

----------------------------------------------------------------------------------------------

EXEC  : instructions per nominal CPU cycle

IPC   : instructions per CPU cycle

FREQ  : relation to nominal CPU frequency='unhalted clock ticks'/'invariant timer ticks' (includes Intel Turbo Boost)

AFREQ : relation to nominal CPU frequency while in active state (not in power-saving C state)='unhalted clock ticks'/'invariant timer ticks while in C0-state'  (includes Intel Turbo Boost)

L3MISS: L3 cache misses

L2MISS: L2 cache misses (including other core's L2 cache *hits*)

L3HIT : L3 cache hit ratio (0.00-1.00)

L2HIT : L2 cache hit ratio (0.00-1.00)

L3CLK : ratio of CPU cycles lost due to L3 cache misses (0.00-1.00), in some cases could be >1.0 due to a higher memory latency

L2CLK : ratio of CPU cycles lost due to missing L2 cache but still hitting L3 cache (0.00-1.00)

READ  : bytes read from memory controller (in GBytes)

WRITE : bytes written to memory controller (in GBytes)

TEMP  : Temperature reading in 1 degree Celsius relative to the TjMax temperature (thermal headroom): 0 corresponds to the max temperature

 

Core (SKT) | EXEC | IPC  | FREQ  | AFREQ | L3MISS | L2MISS | L3HIT | L2HIT | L3CLK | L2CLK  | READ  | WRITE | TEMP

-------------------------------------------------------------------------------------------------------------------

TOTAL  *     0.00   0.50   0.01    1.15      83 K   1055 K    0.92    0.57    0.05    0.12    0.00    0.00     N/A

Instructions retired:  159 M ; Active cycles:  320 M ; Time (TSC): 2610 Mticks ; C0 (active,non-halted) core residency: 0.66 %

C1 core residency: 99.34 %; C3 core residency: 0.00 %; C6 core residency: 0.00 %; C7 core residency: 0.00 %;

C2 package residency: 0.00 %; C3 package residency: 0.00 %; C6 package residency: 0.00 %; C7 package residency: 0.00 %;

PHYSICAL CORE IPC                 : 1.00 => corresponds to 24.95 % utilization for cores in active state

Instructions per nominal CPU cycle: 0.01 => corresponds to 0.19 % core utilization over time interval

----------------------------------------------------------------------------------------------

EXEC  : instructions per nominal CPU cycle

IPC   : instructions per CPU cycle

FREQ  : relation to nominal CPU frequency='unhalted clock ticks'/'invariant timer ticks' (includes Intel Turbo Boost)

AFREQ : relation to nominal CPU frequency while in active state (not in power-saving C state)='unhalted clock ticks'/'invariant timer ticks while in C0-state'  (includes Intel Turbo Boost)

L3MISS: L3 cache misses

L2MISS: L2 cache misses (including other core's L2 cache *hits*)

L3HIT : L3 cache hit ratio (0.00-1.00)

L2HIT : L2 cache hit ratio (0.00-1.00)

L3CLK : ratio of CPU cycles lost due to L3 cache misses (0.00-1.00), in some cases could be >1.0 due to a higher memory latency

L2CLK : ratio of CPU cycles lost due to missing L2 cache but still hitting L3 cache (0.00-1.00)

READ  : bytes read from memory controller (in GBytes)

WRITE : bytes written to memory controller (in GBytes)

TEMP  : Temperature reading in 1 degree Celsius relative to the TjMax temperature (thermal headroom): 0 corresponds to the max temperature

 

Core (SKT) | EXEC | IPC  | FREQ  | AFREQ | L3MISS | L2MISS | L3HIT | L2HIT | L3CLK | L2CLK  | READ  | WRITE | TEMP

-------------------------------------------------------------------------------------------------------------------

TOTAL  *     0.00   0.52   0.01    1.15     106 K   1105 K    0.90    0.65    0.06    0.12    0.00    0.00     N/A

Instructions retired:  167 M ; Active cycles:  325 M ; Time (TSC): 2610 Mticks ; C0 (active,non-halted) core residency: 0.67 %

C1 core residency: 99.33 %; C3 core residency: 0.00 %; C6 core residency: 0.00 %; C7 core residency: 0.00 %;

C2 package residency: 0.00 %; C3 package residency: 0.00 %; C6 package residency: 0.00 %; C7 package residency: 0.00 %;

PHYSICAL CORE IPC                 : 1.03 => corresponds to 25.75 % utilization for cores in active state

Instructions per nominal CPU cycle: 0.01 => corresponds to 0.20 % core utilization over time interval

----------------------------------------------------------------------------------------------

EXEC  : instructions per nominal CPU cycle

IPC   : instructions per CPU cycle

FREQ  : relation to nominal CPU frequency='unhalted clock ticks'/'invariant timer ticks' (includes Intel Turbo Boost)

AFREQ : relation to nominal CPU frequency while in active state (not in power-saving C state)='unhalted clock ticks'/'invariant timer ticks while in C0-state'  (includes Intel Turbo Boost)

L3MISS: L3 cache misses

L2MISS: L2 cache misses (including other core's L2 cache *hits*)

L3HIT : L3 cache hit ratio (0.00-1.00)

L2HIT : L2 cache hit ratio (0.00-1.00)

L3CLK : ratio of CPU cycles lost due to L3 cache misses (0.00-1.00), in some cases could be >1.0 due to a higher memory latency

L2CLK : ratio of CPU cycles lost due to missing L2 cache but still hitting L3 cache (0.00-1.00)

READ  : bytes read from memory controller (in GBytes)

WRITE : bytes written to memory controller (in GBytes)

TEMP  : Temperature reading in 1 degree Celsius relative to the TjMax temperature (thermal headroom): 0 corresponds to the max temperature

 

Core (SKT) | EXEC | IPC  | FREQ  | AFREQ | L3MISS | L2MISS | L3HIT | L2HIT | L3CLK | L2CLK  | READ  | WRITE | TEMP

-------------------------------------------------------------------------------------------------------------------

TOTAL  *     0.00   0.49   0.01    1.15     102 K   1259 K    0.92    0.64    0.05    0.12    0.00    0.00     N/A

Instructions retired:  180 M ; Active cycles:  370 M ; Time (TSC): 2610 Mticks ; C0 (active,non-halted) core residency: 0.77 %

C1 core residency: 99.23 %; C3 core residency: 0.00 %; C6 core residency: 0.00 %; C7 core residency: 0.00 %;

C2 package residency: 0.00 %; C3 package residency: 0.00 %; C6 package residency: 0.00 %; C7 package residency: 0.00 %;

PHYSICAL CORE IPC                 : 0.98 => corresponds to 24.42 % utilization for cores in active state

Instructions per nominal CPU cycle: 0.01 => corresponds to 0.22 % core utilization over time interval

----------------------------------------------------------------------------------------------

Cleaning up

Using PCM on your system might have a performance impact as per http://software.intel.com/en-us/articles/performance-impact-when-sampling-certain-llc-events-on-snb-ep-with-vtune

You can avoid the performance impact by using the option --noJKTWA, however the cache metrics might be wrong then.
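
With this much repeated output, the symptom is easier to see by pulling only the READ and WRITE columns out of each system-wide TOTAL row. The sketch below does that with awk. The function name pcm_total_bandwidth is an assumption made for this example, and the column positions are taken from the output format shown in this thread, so other PCM versions may need different offsets.

```shell
#!/bin/sh
# Hypothetical helper for illustration; it is not part of PCM.

# Print the READ and WRITE values of every "TOTAL  *" row, one pair per line.
# Input comes from the named files, or from stdin when none are given.
# Fields are counted from the end of the line because values such as "48 K"
# in the miss columns make the number of fields vary from row to row.
pcm_total_bandwidth() {
    awk '$1 == "TOTAL" && $2 == "*" { print $(NF-2), $(NF-1) }' "$@"
}
```

Piping the output above through pcm_total_bandwidth prints "0.00 0.00" for every interval, which matches the report that no memory bandwidth is shown.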
