I am new to VTune and would like to know how VTune collects performance counter data. Does VTune capture counters while OS code is executing? Are the counters updated in user space or in kernel space?
VTune uses the hardware PMU (performance monitoring unit) to take its measurements. The general formula is:
MFLOPS = (count of a specific event) / 1,000,000 / Elapsed time
Elapsed time = CPU_CLK_UNHALTED.THREAD / Processor frequency / Number of cores
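As a rough illustration of the two formulas above, here is a minimal Python sketch. The function names and all input numbers are made up for the example; they are not VTune output or part of any VTune API. It also assumes the event counts are already summed over all cores and that elapsed time is in seconds. Which event actually counts floating-point operations depends on the processor.

```python
def elapsed_seconds(unhalted_cycles, frequency_hz, num_cores):
    """Elapsed time = CPU_CLK_UNHALTED.THREAD / frequency / number of cores."""
    return unhalted_cycles / frequency_hz / num_cores


def mflops(event_count, elapsed_s):
    """MFLOPS = event count / 1,000,000 / elapsed time in seconds."""
    return event_count / 1_000_000 / elapsed_s


# Hypothetical values, chosen only to make the arithmetic easy to follow.
cycles = 24_000_000_000      # CPU_CLK_UNHALTED.THREAD summed over all cores
frequency = 3_000_000_000    # 3 GHz processor
cores = 4
fp_events = 5_000_000_000    # count of the chosen floating-point event

t = elapsed_seconds(cycles, frequency, cores)
rate = mflops(fp_events, t)
print(f"elapsed = {t} s, rate = {rate} MFLOPS")
```

With these inputs the elapsed time works out to 2 seconds and the rate to 2500 MFLOPS.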
Please see this blog for detailed examples.
It is also worth looking at https://software.intel.com/en-us/node/544067 to understand the sampling collection method that VTune uses for hardware (driver-based) collections.
BTW, which particular HPC data are you interested in?
Thanks & Regards, Dmitry
And, actually, there are two different collection mechanisms: one is software-based and one is hardware-based. It depends on the analysis type you select.
So, again, what specifically are you asking about?