When running VTune Amplifier XE 2017 Beta Update 1 in a VMware Workstation x86 VM with virtual performance counters enabled, we sometimes see the following types of warnings in the log:
2016-10-10T15:22:46.434-04:00| vcpu-1| I125: VPMC: The guest wrote to the event selector of in-use virtual performance counter 0, which is disallowed.
2016-10-10T15:22:46.435-04:00| vcpu-1| I125: VPMC: The guest wrote to the event selector of in-use virtual performance counter 0, which is disallowed.
2016-10-10T15:22:46.436-04:00| vcpu-1| I125: VPMC: The guest wrote to the event selector of in-use virtual performance counter 0, which is disallowed.
VMware's virtual x86 performance counter implementation flags virtual counters that aren't available to the guest by marking them "in-use", following Intel's whitepaper on cooperative PMU sharing guidelines: https://software.intel.com/en-us/articles/performance-monitoring-unit-guidelines/
To avoid corrupting the state of the real PMU hardware, the implementation drops guest operating system writes to virtual counters that are marked in-use, and logs these warnings. From these logs, it appears that VTune attempts to use virtual performance counters that are marked in-use. Does anyone know what VTune's policy is for counters that it finds in-use (enabled) by other software?
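To make the behavior in question concrete, here is a minimal sketch in Python. It is a pure simulation and touches no real MSRs. It assumes that "in-use" means the EN bit (bit 22) of a counter's IA32_PERFEVTSELx register is set, which is how this thread uses the term; the whitepaper's exact criteria may be broader. The `VirtualPmu` class and `available_counters` helper are hypothetical names for illustration, not VMware's or VTune's actual code.

```python
# Illustrative model only: no real MSR access happens here.
EVTSEL_EN = 1 << 22  # EN (enable) bit of an IA32_PERFEVTSELx register


def available_counters(evtsel_values):
    """Return indices of counters whose event selector has EN clear.

    Assumption: a counter with EN set is treated as in-use by another agent.
    """
    return [i for i, value in enumerate(evtsel_values) if not value & EVTSEL_EN]


class VirtualPmu:
    """Hypothetical hypervisor-side model that drops writes to in-use counters."""

    def __init__(self, evtsel_values, in_use):
        self.evtsel = list(evtsel_values)
        self.in_use = set(in_use)
        self.warnings = []

    def guest_write_evtsel(self, index, value):
        """Apply a guest write, or drop it and record a warning if in-use."""
        if index in self.in_use:
            self.warnings.append(
                "VPMC: The guest wrote to the event selector of in-use "
                f"virtual performance counter {index}, which is disallowed."
            )
            return False
        self.evtsel[index] = value
        return True


# Counter 0 is exposed as in-use (EN already set); counters 1-3 are free.
pmu = VirtualPmu([EVTSEL_EN | 0x3C, 0, 0, 0], in_use={0})
free = available_counters(pmu.evtsel)
```

A cooperative guest agent would program only the counters in `free`. An agent that writes to counter 0 anyway has its write dropped, which would produce exactly the kind of warning shown in the log above.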
VTune employs a different means of arbitration for PMU resources: it tries to acquire exclusive access to all counters by installing its PMI handler via an official OS interface (e.g., HalSetSystemInformation(HalProfileSourceInterruptHandler...) on Windows). By acquiring the PMI vector successfully, the tool assumes it is the only profiler in the system and can utilize all PMU resources.
And after the profiling session is over, all resources are relinquished for other tools to use.
Judging by your logs, VMware locks out counter 0 for its internal use. That may significantly limit the functionality of in-guest profilers, because some performance events on certain generations of processors can be counted on counter 0 only. This is especially true for precise events (those that use the PEBS mechanism).
With that said, I'd recommend using the higher-order counters, e.g., counter 7, for the VMM's needs, and reporting fewer counters to in-guest SW by virtualizing the appropriate leaf of the CPUID instruction.
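As a rough sketch of that suggestion: the number of general-purpose counters is reported in CPUID leaf 0AH, in bits 15:8 of EAX, so a VMM that keeps the highest counter for itself could lower that field before returning the leaf to the guest. The Python below only does the bit arithmetic on a register value. The function names and the sample EAX value are made up for illustration, and a real VMM would also have to keep the rest of the leaf and the MSR behavior consistent.

```python
def decode_cpuid_0a_eax(eax):
    """Split CPUID.0AH:EAX into its four 8-bit fields."""
    return {
        "version": eax & 0xFF,                    # bits 7:0
        "num_gp_counters": (eax >> 8) & 0xFF,     # bits 15:8
        "gp_counter_width": (eax >> 16) & 0xFF,   # bits 23:16
        "ebx_vector_length": (eax >> 24) & 0xFF,  # bits 31:24
    }


def with_gp_counter_count(eax, count):
    """Return EAX with the general-purpose counter count replaced."""
    if not 0 <= count <= 0xFF:
        raise ValueError("counter count must fit in 8 bits")
    return ((eax & ~0xFF00) | (count << 8)) & 0xFFFFFFFF


# Hypothetical host value: version 4, 8 counters, 48-bit width.
host_eax = 0x07300804
host_count = decode_cpuid_0a_eax(host_eax)["num_gp_counters"]

# The VMM keeps the highest counter (counter 7) and reports one fewer.
guest_eax = with_gp_counter_count(host_eax, host_count - 1)
```

With this approach the guest sees counters 0 through 6 as an ordinary, fully usable set, and counter 0 stays available for events that can only be counted there.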
Thanks for explaining.
VMware's hypervisor almost never uses general-purpose counters internally. However, if other VMs or software entities on the system (such as the BIOS or host OS) are using hardware counters, the VMware virtual CPU may expose some virtual counters as unavailable / "in-use", as defined by the PMU sharing guide, in the hope that the guest OS and guest applications will notice and respect this by using only the "available" virtual counters.
From your explanation, it appears VTune is intentionally designed to forcibly use all counters, and it disregards Intel's PMU sharing guide. Is this correct?
Sorry, the first link to the Intel PMU sharing guide whitepaper does not appear to work. I have located a copy of it here (I had to add the PDF extension): https://software.intel.com/sites/default/files/ea/95/30388