I am trying to run pcm.x (PCM 1.7) on Red Hat 5.7 on a BL460 G6 (Intel Xeon CPU X5670 @ 2.93GHz), but it fails (while with KDE 3.5 I'm able to get it running).
Can you please advise about the following message?
(Resetting doesn't help.)
Thanks,
Ilan
./pcm.x 1 -nc -ns
Intel Performance Counter Monitor
Copyright (c) 2009-2011 Intel Corporation
Num cores: 24
Num sockets: 2
Threads per core: 2
Core PMU (perfmon) version: 3
Number of core PMU generic (programmable) counters: 4
Width of generic (programmable) counters: 48 bits
Number of core PMU fixed counters: 3
Width of fixed counters: 48 bits
Nominal core frequency: 2933333326 Hz
Access to Intel Performance Counter Monitor has denied (Performance Monitoring Unit is occupied by other application). Try to stop the application that uses PMU.
Alternatively you can try to reset PMU configuration at your own risk. Try to reset? (y/n)
y
PMU configuration has been reset. Try to rerun the program again.
Kind regards
Thomas
The only process accessing /dev/cpu/?/msr is pcm.x.
Also, using "modprobe -l | grep msr" I couldn't see any module I can correlate with msr.
Could you try to reset the PMU and run ./pcm.x 1 again? Or did you try that already?
Are any other tools or drivers that use the PMU running? This could be Linux perf, oprofile, VTune, PAPI, etc.
Best regards,
Roman
I tried resetting it before opening the call; it doesn't make a difference.
The version is Red Hat 5.7:
cat /etc/redhat-release
Red Hat Enterprise Linux Server release 5.7 (Tikanga)
uname -a
Linux illinrh602 2.6.18-274.12.1.el5 #1 SMP Tue Nov 8 21:37:35 EST 2011 x86_64 x86_64 x86_64 GNU/Linux
The only performance tool running is HP GlancePlusPAK; I made sure it was down before running pcm.x.
Question: how can I find out who is accessing the PMU?
HT is enabled on the server; should that make a difference?
An strace of pcm.x shows a successful open for all /dev/cpu/?/msr, and then the failure message appears:
10662 open("/dev/cpu/22/msr", O_RDWR) = 25
10662 open("/dev/cpu/23/msr", O_RDWR) = 26
10662 lseek(3, 206, SEEK_SET) = 206
10662 read(3, "\3\26\1\4\0\f\0\0", 8) = 8
10662 write(1, "Nominal core frequency: 29333333"..., 38) = 38
10662 futex(0x2b2b405a2000, FUTEX_WAKE, 1) = 0
10662 open("/dev/shm/sem.Intel Performance Counter Monitor instance create-destroy lock", O_RDWR|O_NOFOLLOW) = 27
10662 fstat(27, {st_mode=S_IFREG|0755, st_size=32, ...}) = 0
10662 close(27) = 0
10662 open("/dev/shm/sem.Number of running Intel Performance Counter Monitor instances", O_RDWR|O_NOFOLLOW) = 27
10662 fstat(27, {st_mode=S_IFREG|0755, st_size=32, ...}) = 0
10662 mmap(NULL, 32, PROT_READ|PROT_WRITE, MAP_SHARED, 27, 0) = 0x2b2b405a3000
10662 close(27) = 0
10662 futex(0x2b2b405a3000, FUTEX_WAKE, 1) = 0
10662 lseek(3, 911, SEEK_SET) = 911
10662 read(3, "\0\0\0\0\7\0\0\0", 8) = 8
10662 lseek(3, 390, SEEK_SET) = 390
10662 read(3, "\0\0\0\0\0\0\0\0", 8) = 8
10662 lseek(3, 391, SEEK_SET) = 391
10662 read(3, "\0\0\0\0\0\0\0\0", 8) = 8
10662 lseek(3, 392, SEEK_SET) = 392
10662 read(3, "\0\0\0\0\0\0\0\0", 8) = 8
10662 lseek(3, 393, SEEK_SET) = 393
10662 read(3, "\0\0\0\0\0\0\0\0", 8) = 8
10662 lseek(3, 909, SEEK_SET) = 909
10662 read(3, "0\0\0\0\0\0\0\0", 8) = 8
10662 futex(0x2b2b405a2000, FUTEX_WAKE, 1) = 0
10662 write(1, "Access to Intel Performance C"..., 165) = 165
10662 write(1, "Alternatively you can try to res"..., 91) = 91
10662 fstat(0, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 2), ...}) = 0
10662 mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x2b2b405a4000
10662 read(0, "y\n", 1024) = 2
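For readers following the trace: each lseek() offset on /dev/cpu/N/msr is an MSR address, and each 8-byte read() returns that register in little-endian order. The sketch below decodes those pairs. The helper names decodeMsrBytes and msrName are made up for illustration and are not part of PCM; the register names are the architectural ones from the Intel SDM.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <string>

// Assemble the 8 bytes returned by read() on /dev/cpu/N/msr into a 64-bit
// value. x86 is little-endian, so byte 0 is the least significant byte.
std::uint64_t decodeMsrBytes(const std::array<std::uint8_t, 8>& bytes) {
    std::uint64_t value = 0;
    for (std::size_t i = bytes.size(); i-- > 0;) {
        value = (value << 8) | bytes[i];
    }
    return value;
}

// Map an lseek() offset (the MSR address) to its architectural name.
// Only the addresses that appear in the strace above are covered.
std::string msrName(std::uint32_t address) {
    if (address >= 0x186 && address <= 0x189) {
        return "IA32_PERFEVTSEL" + std::to_string(address - 0x186);
    }
    switch (address) {
        case 0x0CE: return "MSR_PLATFORM_INFO";
        case 0x38D: return "IA32_FIXED_CTR_CTRL";
        case 0x38F: return "IA32_PERF_GLOBAL_CTRL";
        default:    return "unknown";
    }
}
```

Applied to the trace, offset 911 is IA32_PERF_GLOBAL_CTRL with value 0x700000000, offsets 390 to 393 are IA32_PERFEVTSEL0 to 3 (all zero), and offset 909 is IA32_FIXED_CTR_CTRL, where the byte printed as "0" is ASCII 0x30, so the register is non-zero. Offset 206 is MSR_PLATFORM_INFO; bits 15:8 of the value read there are 22, and 22 × 133333333 Hz gives the 2933333326 Hz nominal frequency printed by pcm.x (assuming the 133.33 MHz base clock of this processor generation).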
Thanks,
Roman
Module Size Used by
vxgms 316848 0
vxglm 271568 0
vxfen 300904 1
gab 273056 5 vxfen
llt 194328 5 gab
nfsd 287464 17
exportfs 38849 1 nfsd
auth_rpcgss 81889 1 nfsd
autofs4 63049 3
hidp 83521 2
freevxfs 47817 0
nfs 293145 8
nfs_acl 36673 2 nfsd,nfs
rfcomm 104681 0
l2cap 89537 10 hidp,rfcomm
bluetooth 118725 5 hidp,rfcomm,l2cap
lockd 101425 3 nfsd,nfs
sunrpc 203273 22 nfsd,auth_rpcgss,nfs,nfs_acl,lockd
be2iscsi 94173 0
ib_iser 68161 0
rdma_cm 68689 1 ib_iser
ib_cm 72809 1 rdma_cm
iw_cm 43465 1 rdma_cm
ib_sa 74953 2 rdma_cm,ib_cm
ib_mad 70757 2 ib_cm,ib_sa
ib_core 104901 6 ib_iser,rdma_cm,ib_cm,iw_cm,ib_sa,ib_mad
ib_addr 41673 1 rdma_cm
iscsi_tcp 50893 0
bnx2i 77665 0
cnic 84457 1 bnx2i
ipv6 436449 91 cnic
xfrm_nalgo 43333 1 ipv6
crypto_api 42945 1 xfrm_nalgo
uio 45777 1 cnic
cxgb3i 64849 0
libcxgbi 91597 1 cxgb3i
cxgb3 215600 1 cxgb3i
libiscsi_tcp 53573 3 iscsi_tcp,cxgb3i,libcxgbi
libiscsi2 77765 7 be2iscsi,ib_iser,iscsi_tcp,bnx2i,cxgb3i,libcxgbi,libiscsi_tcp
scsi_transport_iscsi2 73945 8 be2iscsi,ib_iser,iscsi_tcp,bnx2i,libcxgbi,libiscsi2
scsi_transport_iscsi 35017 1 scsi_transport_iscsi2
dm_round_robin 36801 1
dm_multipath 58713 2 dm_round_robin
scsi_dh 42561 1 dm_multipath
video 53197 0
backlight 39873 1 video
sbs 49921 0
power_meter 47053 0
hwmon 36553 1 power_meter
i2c_ec 38593 1 sbs
i2c_core 57537 1 i2c_ec
dell_wmi 37601 0
wmi 41985 1 dell_wmi
button 40545 0
battery 43849 0
asus_acpi 50917 0
acpi_memhotplug 40517 0
ac 38729 0
parport_pc 62313 0
lp 47121 0
parport 73165 2 parport_pc,lp
sg 70649 0
shpchp 70893 0
bnx2x 944273 0
tpm_tis 48077 0
tpm 50273 1 tpm_tis
hpilo 44497 0
i7core_edac 46793 0
8021q 58449 2 cxgb3,bnx2x
tpm_bios 40897 1 tpm
serio_raw 40517 0
edac_mc 61217 1 i7core_edac
mdio 38465 1 bnx2x
pcspkr 36289 0
dm_raid45 99785 0
dm_message 36289 1 dm_raid45
dm_region_hash 46144 1 dm_raid45
dm_mem_cache 38977 1 dm_raid45
dm_snapshot 52233 0
dm_zero 35265 0
dm_mirror 54737 0
dm_log 44993 3 dm_raid45,dm_region_hash,dm_mirror
dm_mod 102289 40 dm_multipath,dm_raid45,dm_snapshot,dm_zero,dm_mirror,dm_log
qla2xxx 1227393 40
scsi_transport_fc 83145 1 qla2xxx
ata_piix 57541 3
libata 208977 1 ata_piix
sd_mod 56513 25
scsi_mod 199385 13 be2iscsi,ib_iser,iscsi_tcp,bnx2i,libcxgbi,libiscsi2,scsi_transport_iscsi2,scsi_dh,sg,qla2xxx,scsi_transport_fc,libata,sd_mod
ext3 169297 7
jbd 94897 1 ext3
uhci_hcd 57433 0
ohci_hcd 56181 0
ehci_hcd 66381 0
From the list above I could not see any of the usual suspects that program the PMU. In general, it is not possible to find out who is using the PMU right now.
As a last try, follow these steps (in this order):
1. reboot your machine
2. stop the perf tool you mentioned
3. ./pcm.x 1 (answer y to reset PMU)
4. ./pcm.x 1 (starting again)
If this does not help, I would like to see some debug output from pcm.x. If you can, please enable all debug output in the PCM::PMUinUse() function (it is currently disabled by C++ comments //): "// std::cout" => "std::cout". Run "make clean", then "make". Then:
1. ./pcm.x 1 (answer y to reset PMU)
2. ./pcm.x 1 (starting again)
3. Post here the output of (1) and (2)
Thank you.
Most PMU tools do not do a PMU-busy check anyway, so you can disable it in the source code too. But if this unknown tool reprograms the PMU while PCM is working, the results from PCM will not be reliable.
To disable the PMU busy check, you can modify the PCM::PMUinUse() function to return false:
[cpp]
bool PCM::PMUinUse()
{
    return false; // skip the PMU busy check
    // ... (original function body, now unreachable)
}
[/cpp]
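To illustrate why a leftover register value can make a busy check fail even when no other tool is running, here is a minimal sketch. It is not PCM's actual code: the PmuControlState struct, the looksInUse function and the rule "any non-zero control register means busy" are assumptions made for this example only.

```cpp
#include <array>
#include <cstdint>

// Snapshot of the per-core control registers that the strace shows pcm.x
// reading. Struct and field names are hypothetical.
struct PmuControlState {
    std::array<std::uint64_t, 4> eventSelect; // IA32_PERFEVTSEL0..3 (0x186..0x189)
    std::uint64_t fixedCtrCtrl;               // IA32_FIXED_CTR_CTRL (0x38D)
};

// Assumed rule, NOT taken from PCM: report "in use" as soon as any control
// register holds a non-zero value.
bool looksInUse(const PmuControlState& state) {
    for (std::uint64_t reg : state.eventSelect) {
        if (reg != 0) {
            return true;
        }
    }
    return state.fixedCtrCtrl != 0;
}
```

With the values from the strace earlier in the thread (all four event-select registers zero, fixed-counter control 0x30), such a rule would report the PMU as busy although nothing is actively programming it.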
Best regards,
Roman
Below is the output with debug output enabled (reset didn't help).
Then I commented out the call to PMUinUse (the two lines in the if: the call and the return)
and it worked.
I then uncommented them again, and it is still working for me; see the output after the "***" line.
Two more questions, if I may:
- If I increase the interval, would it give me the averages over that interval?
- Is there any option to accumulate the results over a long interval and calculate averages?
Thanks,
./pcm.x 1 -ns -nc
Intel Performance Counter Monitor
Copyright (c) 2009-2011 Intel Corporation
Num cores: 24
Num sockets: 2
Threads per core: 2
Core PMU (perfmon) version: 3
Number of core PMU generic (programmable) counters: 4
Width of generic (programmable) counters: 48 bits
Number of core PMU fixed counters: 3
Width of fixed counters: 48 bits
Nominal core frequency: 2933333326 Hz
Core 0 IA32_CR_PERF_GLOBAL_CTRL is 0
Access to Intel Performance Counter Monitor has denied (Performance Monitoring Unit is occupied by other application). Try to stop the application that uses PMU.
Alternatively you can try to reset PMU configuration at your own risk. Try to reset? (y/n)
y
PMU configuration has been reset....
**********************************************************************************
Intel Performance Counter Monitor
Copyright (c) 2009-2011 Intel Corporation
Num cores: 24
Num sockets: 2
Threads per core: 2
Core PMU (perfmon) version: 3
Number of core PMU generic (programmable) counters: 4
Width of generic (programmable) counters: 48 bits
Number of core PMU fixed counters: 3
Width of fixed counters: 48 bits
Nominal core frequency: 2933333326 Hz
Core 0 IA32_CR_PERF_GLOBAL_CTRL is 700000000
Core 1 IA32_CR_PERF_GLOBAL_CTRL is 700000000
Core 2 IA32_CR_PERF_GLOBAL_CTRL is 700000000
Core 3 IA32_CR_PERF_GLOBAL_CTRL is 700000000
Core 4 IA32_CR_PERF_GLOBAL_CTRL is 700000000
Core 5 IA32_CR_PERF_GLOBAL_CTRL is 700000000
Core 6 IA32_CR_PERF_GLOBAL_CTRL is 700000000
Core 7 IA32_CR_PERF_GLOBAL_CTRL is 700000000
Core 8 IA32_CR_PERF_GLOBAL_CTRL is 700000000
Core 9 IA32_CR_PERF_GLOBAL_CTRL is 700000000
Core 10 IA32_CR_PERF_GLOBAL_CTRL is 700000000
Core 11 IA32_CR_PERF_GLOBAL_CTRL is 700000000
Core 12 IA32_CR_PERF_GLOBAL_CTRL is 700000000
Core 13 IA32_CR_PERF_GLOBAL_CTRL is 700000000
Core 14 IA32_CR_PERF_GLOBAL_CTRL is 700000000
Core 15 IA32_CR_PERF_GLOBAL_CTRL is 700000000
Core 16 IA32_CR_PERF_GLOBAL_CTRL is 700000000
Core 17 IA32_CR_PERF_GLOBAL_CTRL is 700000000
Core 18 IA32_CR_PERF_GLOBAL_CTRL is 700000000
Core 19 IA32_CR_PERF_GLOBAL_CTRL is 700000000
Core 20 IA32_CR_PERF_GLOBAL_CTRL is 700000000
Core 21 IA32_CR_PERF_GLOBAL_CTRL is 700000000
Core 22 IA32_CR_PERF_GLOBAL_CTRL is 700000000
Core 23 IA32_CR_PERF_GLOBAL_CTRL is 700000000
EXEC : instructions per nominal CPU cycle
IPC : instructions per CPU cycle
FREQ : relation to nominal CPU frequency='unhalted clock ticks'/'invariant timer ticks' (includes Intel Turbo Boost)
AFREQ : relation to nominal CPU frequency while in active state (not in power-saving C state)='unhalted clock ticks'/'invariant timer ticks while in C0-state' (includes Intel Turbo Boost)
L3MISS: L3 cache misses
L2MISS: L2 cache misses (including other core's L2 cache *hits*)
L3HIT : L3 cache hit ratio (0.00-1.00)
L2HIT : L2 cache hit ratio (0.00-1.00)
L3CLK : ratio of CPU cycles lost due to L3 cache misses (0.00-1.00), in some cases could be >1.0 due to a higher memory latency
L2CLK : ratio of CPU cycles lost due to missing L2 cache but still hitting L3 cache (0.00-1.00)
READ : bytes read from memory controller (in GBytes)
WRITE : bytes written to memory controller (in GBytes)
Core (SKT) | EXEC | IPC | FREQ | AFREQ | L3MISS | L2MISS | L3HIT | L2HIT | L3CLK | L2CLK | READ | WRITE
------------------------------------------------------------------------------------------------------------
TOTAL * 0.00 0.05 0.02 0.55 27 K 1311 K 0.98 0.00 0.00 0.05 0.01 0.00
Instructions retired: 56 M ; Active cycles: 1062 M ; Time (TSC): 336 Tticks ; C0 (active,non-halted) core residency: 2.76 %
PHYSICAL CORE IPC : 0.11 => corresponds to 2.66 % utilization for cores in active state
Instructions per nominal CPU cycle: 0.00 => corresponds to 0.04 % core utilization over time interval
----------------------------------------------------------------------------------------------
Total QPI incoming data traffic: 6944 K QPI data traffic/Memory controller traffic: 0.58
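The debug lines above report IA32_CR_PERF_GLOBAL_CTRL as 700000000 on every core. Read as hexadecimal (which matches the bytes in the earlier strace), that value can be decoded with the sketch below. It assumes the architectural layout from the Intel SDM, where the low bits enable the programmable counters and bits 32 and up enable the fixed counters. The EnabledCounters struct and decodeGlobalCtrl function are hypothetical names.

```cpp
#include <cstdint>
#include <vector>

// Indices of the counters whose enable bit is set in IA32_PERF_GLOBAL_CTRL.
struct EnabledCounters {
    std::vector<int> programmable; // general-purpose counters (bits 0..n-1)
    std::vector<int> fixed;        // fixed-function counters (bits 32..32+m-1)
};

// Decode the enable bits, given the counter counts that pcm.x prints at
// startup (4 programmable and 3 fixed counters on this machine).
EnabledCounters decodeGlobalCtrl(std::uint64_t value, int numProgrammable, int numFixed) {
    EnabledCounters out;
    for (int i = 0; i < numProgrammable; ++i) {
        if (value & (std::uint64_t{1} << i)) {
            out.programmable.push_back(i);
        }
    }
    for (int i = 0; i < numFixed; ++i) {
        if (value & (std::uint64_t{1} << (32 + i))) {
            out.fixed.push_back(i);
        }
    }
    return out;
}
```

For 0x700000000 with 4 programmable and 3 fixed counters, this yields no programmable counters and fixed counters 0, 1 and 2, which means only the three fixed counters are enabled.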
If you increase the interval to x seconds, the pcm.x utility will output the delta of the performance counters over x seconds. To compute the average value per second, just divide all values by x.
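The per-second averaging described above is a single division. A minimal sketch follows, with a hypothetical helper name (perSecondAverage) and made-up numbers in the usage note; pcm.x itself does not provide this function.

```cpp
#include <stdexcept>

// Convert a counter delta reported over an interval of x seconds into an
// average per-second value.
double perSecondAverage(double counterDelta, double intervalSeconds) {
    if (intervalSeconds <= 0.0) {
        throw std::invalid_argument("interval must be positive");
    }
    return counterDelta / intervalSeconds;
}
```

For example, a delta of 13110 over a 10-second interval corresponds to an average of 1311 per second. This only applies to the raw counts; ratio columns such as IPC or L3HIT are already normalized and should not be divided.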
Another mode of collection is to let pcm.x start a command; it then waits for the command to complete and outputs the delta of the performance counters between the start of the command and its end. For example:
[bash]./pcm.x "sh ./my_benchmark_script.sh parameter1 parameter2"[/bash]
pcm.x will start my_benchmark_script using the shell and output the counter deltas when it is done.
Best regards,
Roman
Hi,
the reason for this behavior might also be a BIOS-related issue mentioned in https://lkml.org/lkml/2011/3/24/552
Could you please check the Linux boot log using "dmesg | grep PMU" and see if you observe a similar message:
"[Firmware Bug]: the BIOS has corrupted hw-PMU resources (MSR 38d is 330)"
Disabling the PMU busy check in PCM (making the PCM::PMUinUse() function always return false) should be a quick workaround for your system. Please let us know if an HP BIOS update is available and can fix the issue.
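If the boot log is no longer available, MSR 0x38D can also be read directly through the same /dev/cpu/N/msr interface that the strace earlier in the thread shows pcm.x using (there, the read at offset 909, i.e. 0x38D, returned 0x30). The sketch below assumes a Linux msr device and root privileges. The readMsr helper is a made-up name and not part of PCM.

```cpp
#include <cstdint>
#include <string>
#include <fcntl.h>
#include <unistd.h>

// Read one 64-bit MSR through the Linux msr device: the file offset is the
// MSR address and each read returns 8 bytes. Returns false if the device
// cannot be opened or the read is short.
bool readMsr(const std::string& devicePath, std::uint32_t address, std::uint64_t& value) {
    int fd = ::open(devicePath.c_str(), O_RDONLY);
    if (fd < 0) {
        return false;
    }
    std::uint64_t raw = 0;
    ssize_t n = ::pread(fd, &raw, sizeof(raw), static_cast<off_t>(address));
    ::close(fd);
    if (n != static_cast<ssize_t>(sizeof(raw))) {
        return false;
    }
    value = raw;
    return true;
}
```

Calling readMsr("/dev/cpu/0/msr", 0x38D, value) right after boot, before any monitoring tool has run, shows whether IA32_FIXED_CTR_CTRL is already non-zero. A non-zero value at that point would be consistent with the firmware leaving the register programmed.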
Thanks,
Roman