Analyzers
Support for Analyzers (Intel VTune™ Profiler, Intel Advisor, Intel Inspector)

VTune: Accuracy of measurements using Intel sampling drivers on a machine running other tasks

HarshVardhanKumar
New Contributor I

I have a recent Coffee Lake machine that is primarily used as a storage server. The average load on each of its 4 cores is around 5-10% when running the storage server alone.

I want to run VTune measurements of a workload on this machine using the Intel sampling drivers. However, I doubt whether the measurements will be accurate, given that the storage server application is running concurrently.

But Intel's documentation says the sampling drivers are installed into the Linux kernel, so is it really the case that the measurements will be inaccurate if run concurrently with other applications? In other words, how exactly do the Intel sampling drivers work? Are they able to distinguish between the workload process and other processes running on the system?

7 Replies
AlekhyaV_Intel
Moderator

Hi,

 

Thank you for posting in Intel Forums. We're checking this further. We'll get back to you soon with an update.

 

Regards,

Alekhya

 

AlekhyaV_Intel
Moderator

Hi

 

VTune uses sampling drivers as part of hardware event-based analysis to find bottlenecks at the CPU microarchitecture level. The sampling drivers use the on-chip Performance Monitoring Units (PMUs) to collect samples. A sample is taken through a hardware interrupt that is raised when a certain number of occurrences of a given hardware event has been counted. Each sample is associated with the task that the CPU is executing at that moment and carries its context: thread and process IDs, instruction pointer, and so on. VTune collects all samples regardless of how many processes are running, then either shows them distributed among all processes (Profile System mode) or associates them with a single process (Launch/Attach modes). That's how the sampling drivers work.
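
To illustrate the attribution step described above, here is a minimal Python sketch. It is not VTune code: the `Sample` record, the two helper functions, and the PIDs are hypothetical and only model the idea that each sample carries its process context, so samples can be either distributed among all processes or restricted to one.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    """Hypothetical sample record: context captured when the counter overflowed."""
    pid: int  # process that was running
    tid: int  # thread that was running
    ip: int   # instruction pointer


def system_wide_view(samples):
    """Distribute every sample among the processes it was taken in."""
    return Counter(s.pid for s in samples)


def single_process_view(samples, target_pid):
    """Keep only the samples whose context matches the profiled process."""
    return [s for s in samples if s.pid == target_pid]


# Hypothetical PIDs: 100 = storage server, 200 = profiled workload.
samples = [
    Sample(200, 201, 0x401000),
    Sample(100, 101, 0x7F0010),
    Sample(200, 201, 0x401004),
    Sample(200, 202, 0x401020),
    Sample(100, 101, 0x7F0018),
]

by_process = system_wide_view(samples)
workload_only = single_process_view(samples, target_pid=200)

print(dict(by_process))
print(len(workload_only))
```

In this toy model the storage server's samples are still collected, but they do not end up in the workload's view because their process ID differs.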

 

On the other hand, hardware event-based sampling is not perfectly precise because of instruction pointer skid. By the time the hardware interrupt is issued and caught, the instruction pointer has likely advanced and thus gives a slightly inaccurate location for the code that triggered the event. It can be mitigated by using PEBS on Intel processors.
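
The skid effect can be pictured with a toy Python model. This is only a sketch under simplifying assumptions: instructions are numbered with plain integers, the function `sample_ips` is hypothetical, and the skid is a fixed number of instructions, whereas on real hardware it varies.

```python
def sample_ips(event_ips, sample_after, skid=0):
    """Return the instruction index recorded for each sample.

    event_ips:    instruction indices at which the monitored event occurred, in order
    sample_after: number of events counted before a sample is taken
    skid:         instructions retired between the counter overflow and the
                  moment the interrupt captures the instruction pointer
                  (assumed constant here for simplicity)
    """
    recorded = []
    for count, ip in enumerate(event_ips, start=1):
        if count % sample_after == 0:
            # The counter overflowed at `ip`, but the interrupt sees `ip + skid`.
            recorded.append(ip + skid)
    return recorded


# Events occur at instructions 10, 25 and 40 of a hypothetical program.
events = [10, 25, 10, 40, 10, 25]

precise = sample_ips(events, sample_after=2)
skidded = sample_ips(events, sample_after=2, skid=3)

print(precise)
print(skidded)
```

With a skid of zero the samples point at the instructions that caused the events; with a nonzero skid every sample is attributed to a later instruction.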

 

So, the accuracy of the data collected by VTune does not vary with the number of applications running on the processor. However, when the processor is already loaded with other applications, they may interfere with each other and affect the performance of the application being profiled. That is not caused by VTune. Here is a link that may be helpful:

https://software.intel.com/content/www/us/en/develop/documentation/vtune-help/top/analyze-performanc...

 

Regards,

Alekhya

 

AlekhyaV_Intel
Moderator

Hi,


Could you please give us an update regarding this issue?


Regards,

Alekhya


HarshVardhanKumar
New Contributor I

It can be mitigated by using PEBS on Intel processors.


I did not fully understand this part. Can you provide some links or Intel tutorials on it? The rest of the answer was helpful. Thanks.

Harsh

AlekhyaV_Intel
Moderator

Hi,


Skid can be mitigated by having the processor itself store the instruction pointer (along with other information) in a designated buffer in memory. An interrupt is not issued for each sample, and the instruction pointer is off by a single instruction at most. This needs to be supported by the hardware and is typically available only for a subset of the supported events. On Intel processors this capability is called Processor Event-Based Sampling (PEBS).
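
As a rough illustration of why buffering reduces interrupts, here is a small Python sketch. It does not model real PEBS hardware: the function `count_interrupts` and the threshold of 64 records are hypothetical, chosen only to contrast one interrupt per sample with one interrupt per filled buffer.

```python
def count_interrupts(num_samples, buffer_threshold):
    """Count interrupts when records are buffered in memory.

    An interrupt is raised only when the buffer holds `buffer_threshold`
    records. Returns (interrupts raised, records still waiting in the buffer).
    """
    buffered = 0
    interrupts = 0
    for _ in range(num_samples):
        buffered += 1  # the processor writes one record into the buffer
        if buffered == buffer_threshold:
            interrupts += 1  # software drains the buffer in one go
            buffered = 0
    return interrupts, buffered


# One interrupt per sample versus a hypothetical 64-record buffer.
per_sample = count_interrupts(1000, buffer_threshold=1)
batched = count_interrupts(1000, buffer_threshold=64)

print(per_sample)
print(batched)
```

Because each record is written by the processor at the time of the event rather than captured later by an interrupt handler, the recorded instruction pointer stays close to the instruction that triggered the event.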

You can refer to the following link for more information on PEBS on Intel processors:

https://easyperf.net/blog/2018/06/08/Advanced-profiling-topics-PEBS-and-LBR#processor-event-based-sa...


Regards,

Alekhya


AlekhyaV_Intel
Moderator

Hi,


Is your issue resolved? Could you please give us an update?


Regards,

Alekhya


AlekhyaV_Intel
Moderator

Hi,


Thank you for accepting our solution. If you need any additional information, please submit a new question as this thread will no longer be monitored.


Regards,

Alekhya

