How can VTune Amplifier be used to profile multiple Xen guest domains?



I'm looking for a way to profile multiple Xen guest domains (domUs) on a symmetric multiprocessor server that I access over SSH. The server is running Ubuntu Server. I need to collect hardware event information from Xeon (Skylake) processors, using Cache Allocation Technology (CAT) and Cache Monitoring Technology (CMT), to calculate the throughput of each domU while it is executing applications.

Originally, I tried using OProfile version 0.9.5 patched with Xenoprof version 0.9.5. This produced errors indicating that this version of OProfile is not supported by the kernel, architecture, or processors. As an alternative, I tried installing VTune Amplifier both on the host operating system (dom0), running on real hardware, and on a domU, running on the Xen hypervisor, to collect the data needed to calculate the throughput of each domU. However, the VTune Amplifier installer identifies both dom0 and the domU as virtual machines, so the sampling drivers for hardware event-based sampling are not installed.

Is it possible to use VTune Amplifier to profile multiple domUs rather than just one? Can VTune Amplifier be used to calculate the throughput of each domU? Why does the VTune Amplifier installer identify dom0 as a virtualized system when dom0 runs on real hardware?

Would it be better to build a new machine with Xeon processors and run a desktop version of Ubuntu on it in order to use VTune Amplifier? If so, which Xeon processor is recommended?

Your advice is very much appreciated.

Thank you.

