Hi,
I'm trying to run the latest PCM 2.5 on a machine with an Intel(R) Xeon(R) CPU E5-2670 (Sandy Bridge), but PCM prints the message below and does not show any memory read/write traffic in the report.
Can not access SNB-EP (Jaketown) PCI configuration space. Access to uncore counters (memory and QPI bandwidth) is disabled.
You must be root to access these SNB-EP counters in PCM
Can you please tell me what I am missing here?
I'm running PCM as root. I also looked through the Dell BIOS options for any setting related to performance counters or PCI, but could not find one to enable or disable.
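A rough sketch for double-checking both points from a shell: it prints nothing by itself, but defines a helper that lists the PCI buses the kernel exposes. The /proc/bus/pci location is an assumption about where a Linux build of PCM reads PCI configuration space, and list_pci_buses is a hypothetical helper name, not part of PCM.

```shell
# Hypothetical helper, not part of PCM.
# Prints one line per PCI bus directory found under ROOT
# (default: /proc/bus/pci, assumed here to be where the kernel
# exposes PCI configuration space).
list_pci_buses() {
    local root=${1:-/proc/bus/pci}
    local dir
    for dir in "$root"/*/; do
        # Skip the unexpanded pattern when ROOT has no subdirectories.
        [ -d "$dir" ] || continue
        basename "$dir"
    done
    return 0
}

# Example use (run manually):
#   id -u            # effective user ID; 0 means root
#   list_pci_buses   # bus directories visible to the OS
```

If `id -u` prints 0 but the list is empty, or the files under those directories cannot be opened, that might point at the OS or firmware rather than at the user's privileges.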
Thanks, Prabhu
Prabhu,
thanks for your cooperation and the data. I have prepared a new patch (debug version) that you can apply on top of the old one. Could you please do me a favor: apply it, rerun PCM as shown below, and attach the resulting debug output to your reply? This will help us debug the problem.
[bash]
patch < patch2.txt
make clean
make
./pcm.x "sleep 1"
[/bash]
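If the debug messages are written to stderr (an assumption, since the thread does not say which stream they use), they can be lost when only stdout is copied. A minimal sketch that saves both streams to one file; run_and_capture and the log file name are hypothetical, not part of PCM:

```shell
# Hypothetical helper, not part of PCM.
# Usage: run_and_capture LOGFILE COMMAND [ARGS...]
# Runs COMMAND, shows its output on the terminal, and also saves
# both stdout and stderr to LOGFILE.
run_and_capture() {
    local logfile=$1
    shift
    # 2>&1 merges stderr into stdout so that messages printed on
    # either stream end up in the same file.
    "$@" 2>&1 | tee "$logfile"
}

# Example use (run manually):
#   run_and_capture pcm_debug.txt ./pcm.x "sleep 1"
```

Note that the function returns the exit status of tee, not of the command being run.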
Thanks,
Roman
Roman,
Please find below the output of PCM after applying the new patch (patch2). I could not find any of your debug info in the output. I looked at cpucounters.cpp and saw the debug-related code added in the function initSocket2Bus. Is !socket2bus.empty() at line 2630 always returning true, so that the code below it is never exercised?
======
~/PCM2.5 # ./pcm.x "sleep 1"
Intel(r) Performance Counter Monitor V2.5 (2013-06-04 11:44:11 +0200 ID=23faaad)
Copyright (c) 2009-2012 Intel Corporation
Num logical cores: 16
Num sockets: 1
Threads per core: 2
Core PMU (perfmon) version: 3
Number of core PMU generic (programmable) counters: 4
Width of generic (programmable) counters: 48 bits
Number of core PMU fixed counters: 3
Width of fixed counters: 48 bits
Nominal core frequency: 2600000000 Hz
Package thermal spec power: 115 Watt; Package minimum power: 51 Watt; Package maximum power: 180 Watt;
Can not access SNB-EP (Jaketown) PCI configuration space. Access to uncore counters (memory and QPI bandwidth) is disabled.
You must be root to access these SNB-EP counters in PCM.
Using PCM on your system might have a performance impact as per http://software.intel.com/en-us/articles/performance-impact-when-sampling-certain-llc-events-on-snb-ep-with-vtune
You can avoid the performance impact by using the option --noJKTWA, however the cache metrics might be wrong then.
Detected Intel(R) Xeon(R) CPU E5-2670 0 @ 2.60GHz "Intel(r) microarchitecture codename Sandy Bridge-EP/Jaketown"
Executing "sleep 1" command:
Exit code: 0
EXEC : instructions per nominal CPU cycle
IPC : instructions per CPU cycle
FREQ : relation to nominal CPU frequency='unhalted clock ticks'/'invariant timer ticks' (includes Intel Turbo Boost)
AFREQ : relation to nominal CPU frequency while in active state (not in power-saving C state)='unhalted clock ticks'/'invariant timer ticks while in C0-state' (includes Intel Turbo Boost)
L3MISS: L3 cache misses
L2MISS: L2 cache misses (including other core's L2 cache *hits*)
L3HIT : L3 cache hit ratio (0.00-1.00)
L2HIT : L2 cache hit ratio (0.00-1.00)
L3CLK : ratio of CPU cycles lost due to L3 cache misses (0.00-1.00), in some cases could be >1.0 due to a higher memory latency
L2CLK : ratio of CPU cycles lost due to missing L2 cache but still hitting L3 cache (0.00-1.00)
READ : bytes read from memory controller (in GBytes)
WRITE : bytes written to memory controller (in GBytes)
TEMP : Temperature reading in 1 degree Celsius relative to the TjMax temperature (thermal headroom): 0 corresponds to the max temperature
Core (SKT) | EXEC | IPC  | FREQ | AFREQ | L3MISS | L2MISS | L3HIT | L2HIT | L3CLK | L2CLK | READ | WRITE | TEMP
0        0   0.00   0.28   0.00    1.15        2     8001    1.00    0.30    0.00    0.08    N/A     N/A     58
1        0   0.00   0.46   0.00    1.15        4     7871    1.00    0.19    0.00    0.18    N/A     N/A     59
2        0   0.00   0.48   0.00    1.15        1     8317    1.00    0.16    0.00    0.19    N/A     N/A     66
3        0   0.00   0.48   0.00    1.15       11     8962    1.00    0.17    0.00    0.19    N/A     N/A     57
4        0   0.00   0.44   0.00    1.15        2     8309    1.00    0.18    0.00    0.17    N/A     N/A     59
5        0   0.00   0.43   0.00    1.15       15     8041    1.00    0.17    0.00    0.16    N/A     N/A     56
6        0   0.00   0.43   0.00    1.15       37     9184    1.00    0.21    0.00    0.16    N/A     N/A     57
7        0   0.00   0.56   0.00    1.15     1568     19 K    0.92    0.40    0.05    0.15    N/A     N/A     54
8        0   0.00   0.28   0.00    1.15        4     1317    1.00    0.60    0.00    0.03    N/A     N/A     58
9        0   0.00   0.40   0.00    1.15        1      897    1.00    0.67    0.00    0.03    N/A     N/A     59
10       0   0.00   0.46   0.00    1.15        4     3019    1.00    0.41    0.00    0.08    N/A     N/A     66
11       0   0.00   0.40   0.00    1.15        7     1288    0.99    0.59    0.00    0.05    N/A     N/A     57
12       0   0.00   0.39   0.00    1.15        2     1822    1.00    0.52    0.00    0.06    N/A     N/A     59
13       0   0.00   0.33   0.00    1.15        0     1207    1.00    0.54    0.00    0.03    N/A     N/A     56
14       0   0.00   0.31   0.00    1.15        0     2155    1.00    0.43    0.00    0.05    N/A     N/A     57
15       0   0.00   0.52   0.00    1.15      535     5644    0.91    0.47    0.04    0.09    N/A     N/A     54
-------------------------------------------------------------------------------------------------------------------
SKT      0   0.00   0.43   0.00    1.15     2193     95 K    0.98    0.32    0.01    0.11   0.00    0.00     54
-------------------------------------------------------------------------------------------------------------------
TOTAL    *   0.00   0.43   0.00    1.15     2193     95 K    0.98    0.32    0.01    0.11   0.00    0.00    N/A
Instructions retired: 14 M ; Active cycles: 33 M ; Time (TSC): 2613 Mticks ; C0 (active,non-halted) core residency: 0.07 %
C1 core residency: 99.93 %; C3 core residency: 0.00 %; C6 core residency: 0.00 %; C7 core residency: 0.00 %
C2 package residency: 0.00 %; C3 package residency: 0.00 %; C6 package residency: 0.00 %; C7 package residency: 0.00 %
PHYSICAL CORE IPC : 0.85 => corresponds to 21.29 % utilization for cores in active state
Instructions per nominal CPU cycle: 0.00 => corresponds to 0.02 % core utilization over time interval
----------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------
SKT 0 package consumed 45.71 Joules
----------------------------------------------------------------------------------------------
TOTAL: 45.71 Joules
----------------------------------------------------------------------------------------------
SKT 0 DIMMs consumed 9.78 Joules
----------------------------------------------------------------------------------------------
TOTAL: 9.78 Joules
Cleaning up
Using PCM on your system might have a performance impact as per http://software.intel.com/en-us/articles/performance-impact-when-sampling-certain-llc-events-on-snb-ep-with-vtune
You can avoid the performance impact by using the option --noJKTWA, however the cache metrics might be wrong then.
====
Thanks, Prabhu
Prabhu,
initSocket2Bus() is called from the JKT_Uncore_Pci::JKT_Uncore_Pci constructor, which throws an exception on error and prints debug output before throwing. I don't see that output either.
Is there a chance that you did not recompile the pcm.x binary after patching? Please run:
[bash]
make clean
make
[/bash]
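One quick sanity check is to compare modification times of the binary and the patched source. A minimal sketch, assuming the binary is pcm.x and the patched file is cpucounters.cpp in the current directory; is_up_to_date is a hypothetical helper name:

```shell
# Hypothetical helper, not part of PCM.
# Usage: is_up_to_date BINARY SOURCE [SOURCE...]
# Succeeds (exit status 0) only if BINARY exists and is newer
# than every listed source file.
is_up_to_date() {
    local binary=$1
    shift
    [ -e "$binary" ] || return 1
    local src
    for src in "$@"; do
        # -nt is true when the left file has a later modification time.
        [ "$binary" -nt "$src" ] || return 1
    done
    return 0
}

# Example use (run manually):
#   is_up_to_date pcm.x cpucounters.cpp && echo "binary is newer" || echo "rebuild needed"
```

This only compares timestamps; it cannot tell whether the patch itself applied cleanly.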
Thanks,
Roman
Also, out of curiosity, what type of Dell system do you have? Is it a "DELL WORKSTATION T7600"?
Roman Dementiev (Intel) wrote:
Prabhu,
initSocket2Bus() is called from the JKT_Uncore_Pci::JKT_Uncore_Pci constructor, which throws an exception on error and prints debug output before throwing. I don't see that output either.
Is there a chance that you did not recompile the pcm.x binary after patching? Please run:
make clean
make
Thanks,
Roman
I have recompiled pcm.x; the output I sent earlier is from the new pcm.x with the debug patch applied.
Thanks, Prabhu
Roman Dementiev (Intel) wrote:
Also, out of curiosity, what type of Dell system do you have? Is it a "DELL WORKSTATION T7600"?
The machine is a Dell - External OEMR XL R720
Thanks, Prabhu
Prabhu,
You asked: "I saw the debug related code added in function initSocket2Bus. Is !socket2bus.empty() at line 2630 returning always true? And the code below is not exercised?"
Since you are running a single-socket system, you can remove the line with "if(!socket2bus.empty()) return;". However, I don't expect this to help...
Did you see any error messages when patching? Could you please attach your patched cpucounters.cpp to your reply so that I can check it?
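One way to look for a partially applied patch, assuming GNU patch was used (it writes *.rej files next to any file where a hunk failed to apply). find_rejects is a hypothetical helper name:

```shell
# Hypothetical helper, not part of PCM.
# Usage: find_rejects [DIR]
# Lists reject files (*.rej) under DIR (default: current directory).
# An empty result means no reject files were found there.
find_rejects() {
    find "${1:-.}" -name '*.rej' -type f
}

# Example use (run manually, from the PCM source directory):
#   find_rejects .
```

An empty list does not prove the patch was applied, only that no reject files were left behind.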
Thanks,
Roman
Prabhu,
we have just released the new Intel PCM 2.5.1 (www.intel.com/software/pcm). Could you please try it and attach its output to your reply?
Thanks,
Roman
Hi Roman,
The new Intel PCM 2.6 does not show the memory bandwidth either. It reports the error below, even though I'm running as root.
Can not access Jaketown/Ivytown PCI configuration space. Access to uncore counters (memory and QPI bandwidth) is disabled.
You must be root to access these Jaketown/Ivytown counters in PCM.
Thanks, Prabhu
Roman,
Please find the full output:
Intel(r) Performance Counter Monitor V2.6 (2013-11-04 13:43:31 +0100 ID=db05e43)
Copyright (c) 2009-2013 Intel Corporation
Number of physical cores: 8
Number of logical cores: 16
Threads (logical cores) per physical core: 2
Num sockets: 1
Core PMU (perfmon) version: 3
Number of core PMU generic (programmable) counters: 4
Width of generic (programmable) counters: 48 bits
Number of core PMU fixed counters: 3
Width of fixed counters: 48 bits
Nominal core frequency: 2600000000 Hz
Package thermal spec power: 115 Watt; Package minimum power: 51 Watt; Package maximum power: 180 Watt;
Can not access Jaketown/Ivytown PCI configuration space. Access to uncore counters (memory and QPI bandwidth) is disabled.
You must be root to access these Jaketown/Ivytown counters in PCM.
Using PCM on your system might have a performance impact as per http://software.intel.com/en-us/articles/performance-impact-when-sampling-certain-llc-events-on-snb-ep-with-vtune
You can avoid the performance impact by using the option --noJKTWA, however the cache metrics might be wrong then.
Detected Intel(R) Xeon(R) CPU E5-2670 0 @ 2.60GHz "Intel(r) microarchitecture codename Sandy Bridge-EP/Jaketown"
EXEC : instructions per nominal CPU cycle
IPC : instructions per CPU cycle
FREQ : relation to nominal CPU frequency='unhalted clock ticks'/'invariant timer ticks' (includes Intel Turbo Boost)
AFREQ : relation to nominal CPU frequency while in active state (not in power-saving C state)='unhalted clock ticks'/'invariant timer ticks while in C0-state' (includes Intel Turbo Boost)
L3MISS: L3 cache misses
L2MISS: L2 cache misses (including other core's L2 cache *hits*)
L3HIT : L3 cache hit ratio (0.00-1.00)
L2HIT : L2 cache hit ratio (0.00-1.00)
L3CLK : ratio of CPU cycles lost due to L3 cache misses (0.00-1.00), in some cases could be >1.0 due to a higher memory latency
L2CLK : ratio of CPU cycles lost due to missing L2 cache but still hitting L3 cache (0.00-1.00)
READ : bytes read from memory controller (in GBytes)
WRITE : bytes written to memory controller (in GBytes)
TEMP : Temperature reading in 1 degree Celsius relative to the TjMax temperature (thermal headroom): 0 corresponds to the max temperature
Core (SKT) | EXEC | IPC | FREQ | AFREQ | L3MISS | L2MISS | L3HIT | L2HIT | L3CLK | L2CLK | READ | WRITE | TEMP
-------------------------------------------------------------------------------------------------------------------
TOTAL * 0.00 0.44 0.01 1.15 48 K 802 K 0.94 0.47 0.04 0.14 0.00 0.00 N/A
Instructions retired: 100 M ; Active cycles: 228 M ; Time (TSC): 2604 Mticks ; C0 (active,non-halted) core residency: 0.47 %
C1 core residency: 99.53 %; C3 core residency: 0.00 %; C6 core residency: 0.00 %; C7 core residency: 0.00 %;
C2 package residency: 0.00 %; C3 package residency: 0.00 %; C6 package residency: 0.00 %; C7 package residency: 0.00 %;
PHYSICAL CORE IPC : 0.88 => corresponds to 21.91 % utilization for cores in active state
Instructions per nominal CPU cycle: 0.00 => corresponds to 0.12 % core utilization over time interval
----------------------------------------------------------------------------------------------
Core (SKT) | EXEC | IPC | FREQ | AFREQ | L3MISS | L2MISS | L3HIT | L2HIT | L3CLK | L2CLK | READ | WRITE | TEMP
-------------------------------------------------------------------------------------------------------------------
TOTAL * 0.00 0.48 0.01 1.15 71 K 997 K 0.93 0.51 0.04 0.12 0.00 0.00 N/A
Instructions retired: 144 M ; Active cycles: 302 M ; Time (TSC): 2610 Mticks ; C0 (active,non-halted) core residency: 0.63 %
C1 core residency: 99.37 %; C3 core residency: 0.00 %; C6 core residency: 0.00 %; C7 core residency: 0.00 %;
C2 package residency: 0.00 %; C3 package residency: 0.00 %; C6 package residency: 0.00 %; C7 package residency: 0.00 %;
PHYSICAL CORE IPC : 0.95 => corresponds to 23.81 % utilization for cores in active state
Instructions per nominal CPU cycle: 0.01 => corresponds to 0.17 % core utilization over time interval
----------------------------------------------------------------------------------------------
Core (SKT) | EXEC | IPC | FREQ | AFREQ | L3MISS | L2MISS | L3HIT | L2HIT | L3CLK | L2CLK | READ | WRITE | TEMP
-------------------------------------------------------------------------------------------------------------------
TOTAL * 0.00 0.45 0.01 1.15 46 K 826 K 0.94 0.49 0.03 0.13 0.00 0.00 N/A
Instructions retired: 108 M ; Active cycles: 243 M ; Time (TSC): 2610 Mticks ; C0 (active,non-halted) core residency: 0.51 %
C1 core residency: 99.49 %; C3 core residency: 0.00 %; C6 core residency: 0.00 %; C7 core residency: 0.00 %;
C2 package residency: 0.00 %; C3 package residency: 0.00 %; C6 package residency: 0.00 %; C7 package residency: 0.00 %;
PHYSICAL CORE IPC : 0.89 => corresponds to 22.34 % utilization for cores in active state
Instructions per nominal CPU cycle: 0.01 => corresponds to 0.13 % core utilization over time interval
----------------------------------------------------------------------------------------------
Core (SKT) | EXEC | IPC | FREQ | AFREQ | L3MISS | L2MISS | L3HIT | L2HIT | L3CLK | L2CLK | READ | WRITE | TEMP
-------------------------------------------------------------------------------------------------------------------
TOTAL * 0.00 0.46 0.01 1.15 54 K 885 K 0.94 0.49 0.04 0.13 0.00 0.00 N/A
Instructions retired: 117 M ; Active cycles: 257 M ; Time (TSC): 2610 Mticks ; C0 (active,non-halted) core residency: 0.53 %
C1 core residency: 99.47 %; C3 core residency: 0.00 %; C6 core residency: 0.00 %; C7 core residency: 0.00 %;
C2 package residency: 0.00 %; C3 package residency: 0.00 %; C6 package residency: 0.00 %; C7 package residency: 0.00 %;
PHYSICAL CORE IPC : 0.91 => corresponds to 22.81 % utilization for cores in active state
Instructions per nominal CPU cycle: 0.01 => corresponds to 0.14 % core utilization over time interval
----------------------------------------------------------------------------------------------
Core (SKT) | EXEC | IPC | FREQ | AFREQ | L3MISS | L2MISS | L3HIT | L2HIT | L3CLK | L2CLK | READ | WRITE | TEMP
-------------------------------------------------------------------------------------------------------------------
TOTAL * 0.00 0.47 0.01 1.15 61 K 965 K 0.94 0.51 0.04 0.12 0.00 0.00 N/A
Instructions retired: 135 M ; Active cycles: 288 M ; Time (TSC): 2610 Mticks ; C0 (active,non-halted) core residency: 0.60 %
C1 core residency: 99.40 %; C3 core residency: 0.00 %; C6 core residency: 0.00 %; C7 core residency: 0.00 %;
C2 package residency: 0.00 %; C3 package residency: 0.00 %; C6 package residency: 0.00 %; C7 package residency: 0.00 %;
PHYSICAL CORE IPC : 0.94 => corresponds to 23.45 % utilization for cores in active state
Instructions per nominal CPU cycle: 0.01 => corresponds to 0.16 % core utilization over time interval
----------------------------------------------------------------------------------------------
Core (SKT) | EXEC | IPC | FREQ | AFREQ | L3MISS | L2MISS | L3HIT | L2HIT | L3CLK | L2CLK | READ | WRITE | TEMP
-------------------------------------------------------------------------------------------------------------------
TOTAL * 0.00 0.50 0.01 1.15 83 K 1055 K 0.92 0.57 0.05 0.12 0.00 0.00 N/A
Instructions retired: 159 M ; Active cycles: 320 M ; Time (TSC): 2610 Mticks ; C0 (active,non-halted) core residency: 0.66 %
C1 core residency: 99.34 %; C3 core residency: 0.00 %; C6 core residency: 0.00 %; C7 core residency: 0.00 %;
C2 package residency: 0.00 %; C3 package residency: 0.00 %; C6 package residency: 0.00 %; C7 package residency: 0.00 %;
PHYSICAL CORE IPC : 1.00 => corresponds to 24.95 % utilization for cores in active state
Instructions per nominal CPU cycle: 0.01 => corresponds to 0.19 % core utilization over time interval
----------------------------------------------------------------------------------------------
Core (SKT) | EXEC | IPC | FREQ | AFREQ | L3MISS | L2MISS | L3HIT | L2HIT | L3CLK | L2CLK | READ | WRITE | TEMP
-------------------------------------------------------------------------------------------------------------------
TOTAL * 0.00 0.52 0.01 1.15 106 K 1105 K 0.90 0.65 0.06 0.12 0.00 0.00 N/A
Instructions retired: 167 M ; Active cycles: 325 M ; Time (TSC): 2610 Mticks ; C0 (active,non-halted) core residency: 0.67 %
C1 core residency: 99.33 %; C3 core residency: 0.00 %; C6 core residency: 0.00 %; C7 core residency: 0.00 %;
C2 package residency: 0.00 %; C3 package residency: 0.00 %; C6 package residency: 0.00 %; C7 package residency: 0.00 %;
PHYSICAL CORE IPC : 1.03 => corresponds to 25.75 % utilization for cores in active state
Instructions per nominal CPU cycle: 0.01 => corresponds to 0.20 % core utilization over time interval
----------------------------------------------------------------------------------------------
Core (SKT) | EXEC | IPC | FREQ | AFREQ | L3MISS | L2MISS | L3HIT | L2HIT | L3CLK | L2CLK | READ | WRITE | TEMP
-------------------------------------------------------------------------------------------------------------------
TOTAL * 0.00 0.49 0.01 1.15 102 K 1259 K 0.92 0.64 0.05 0.12 0.00 0.00 N/A
Instructions retired: 180 M ; Active cycles: 370 M ; Time (TSC): 2610 Mticks ; C0 (active,non-halted) core residency: 0.77 %
C1 core residency: 99.23 %; C3 core residency: 0.00 %; C6 core residency: 0.00 %; C7 core residency: 0.00 %;
C2 package residency: 0.00 %; C3 package residency: 0.00 %; C6 package residency: 0.00 %; C7 package residency: 0.00 %;
PHYSICAL CORE IPC : 0.98 => corresponds to 24.42 % utilization for cores in active state
Instructions per nominal CPU cycle: 0.01 => corresponds to 0.22 % core utilization over time interval
----------------------------------------------------------------------------------------------
Cleaning up
Using PCM on your system might have a performance impact as per http://software.intel.com/en-us/articles/performance-impact-when-sampling-certain-llc-events-on-snb-ep-with-vtune
You can avoid the performance impact by using the option --noJKTWA, however the cache metrics might be wrong then.
