This question is addressed to hackers/developers who specialize in hardware-assisted virtualization.
I am working on a simple (proprietary) hypervisor that uses the VMX virtualization extensions. It has an emulator of the local APIC interrupt controller, implemented through MMIO interception, and it works well. I started to improve the performance of a 32-bit Windows XP guest by applying the FlexPriority extensions and noticed some strange behavior.
When the APIC-access page is mapped into the guest through EPT for read access only, everything works fine. TPR shadowing improves overall performance by up to 3 times. Reads of the TPR (at offset 0x80) are performed by hardware without a VM exit; all other accesses are virtualized by instruction emulation during APIC-access exits.
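To make the read-only case concrete, here is a minimal sketch of how an APIC-access exit could be classified before instruction emulation. It assumes the exit qualification layout I understand from the Intel SDM (bits 11:0 hold the page offset, bits 15:12 the access type, with 0 for a linear read and 1 for a linear write). The struct and function names are hypothetical and are not taken from my hypervisor.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define APIC_TPR_OFFSET 0x80u   /* TPR offset within the APIC page */

/* Access types from the exit qualification (assumed SDM encoding). */
enum apic_access_type {
    APIC_ACCESS_LINEAR_READ  = 0,
    APIC_ACCESS_LINEAR_WRITE = 1
};

/* Hypothetical container for the decoded exit qualification. */
struct apic_access_info {
    uint16_t offset;  /* bits 11:0: offset into the APIC-access page */
    uint8_t  type;    /* bits 15:12: access type */
};

/* Split the exit qualification of an APIC-access exit into its fields. */
static void decode_apic_access(uint64_t exit_qualification,
                               struct apic_access_info *out)
{
    assert(out != NULL);
    out->offset = (uint16_t)(exit_qualification & 0xFFFu);
    out->type   = (uint8_t)((exit_qualification >> 12) & 0xFu);
}

/* True when the exit was caused by a data write to the TPR. */
static int is_tpr_write(const struct apic_access_info *info)
{
    assert(info != NULL);
    return info->type == APIC_ACCESS_LINEAR_WRITE &&
           info->offset == APIC_TPR_OFFSET;
}
```

With a read-only EPT mapping, every TPR write reaches a handler like this one, so the emulator sees each new TPR value at the moment it is written.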
But when the APIC-access page is mapped for both reads and writes, TPR shadowing results in a BSOD in the guest: IRQL_NOT_LESS_OR_EQUAL on an access to 0x00000016. In this case all accesses to the TPR are virtualized by hardware.
It looks like the TPR value that the CPU automatically stores to the virtual-APIC page at offset 0x80 is in some inconsistent state, so it cannot be trusted when computing the PPR value and deciding which interrupt vector to service (inject) next. I printed the hardware-set TPR values while the Windows guest was booting and compared them with those of the software-virtualized APIC (without FlexPriority), and they were similar.
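For reference, this is a minimal sketch of the PPR computation and injection decision I have in mind. It follows the rule from the Intel SDM as I understand it: PPR equals TPR when the TPR priority class is at least that of the highest in-service vector, otherwise PPR is that vector's class with the low nibble cleared. The helper names are hypothetical, and reading a single byte at offset 0x80 assumes a little-endian view of the virtual-APIC page.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define APIC_TPR_OFFSET 0x80u   /* TPR offset within the virtual-APIC page */

/* The priority class is the upper nibble of a vector or of TPR/PPR. */
static uint8_t prio_class(uint8_t value)
{
    return (uint8_t)(value >> 4);
}

/* Read the hardware-maintained TPR (bits 7:0) from the virtual-APIC page. */
static uint8_t read_vtpr(const uint8_t *vapic_page)
{
    assert(vapic_page != NULL);
    return vapic_page[APIC_TPR_OFFSET];
}

/* isrv is the highest in-service vector, or 0 if none is in service. */
static uint8_t compute_ppr(uint8_t tpr, uint8_t isrv)
{
    if (prio_class(tpr) >= prio_class(isrv))
        return tpr;
    return (uint8_t)(isrv & 0xF0u);
}

/* A pending vector is deliverable only if its class exceeds that of PPR. */
static int can_inject(uint8_t ppr, uint8_t irrv)
{
    return prio_class(irrv) > prio_class(ppr);
}
```

In the hardware-virtualized case the `tpr` argument would come from `read_vtpr()` at the time of the injection decision, rather than from a value captured in a write handler.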
My configuration is the following:
I have a number of hypotheses about the reasons for the observed behavior:
But it looks like I am missing something, or I misunderstand the mechanics behind the use of the virtual-APIC page and the APIC-access page. Could anybody suggest why TPR shadowing can result in BSODs when the same code works well with plain emulation through MMIO interception (exiting on reads and writes to the APIC-access page)?