Profiling With Intel VTune Amplifier

Divino_C_
New Contributor I

Why is the "CPU Time by Utilization" sometimes annotated on the function name instead of on the function's instructions?
Also, if I sum up the CPU Time of all instructions in the function, the total does not match the value attributed to the function name.

What is the effect of "Collect Stacks" in the Lightweight Hotspots analysis? When I collect with this option enabled, the results
tell me that the caller is the hotspot, but when I profile with it disabled, they tell me that the callee is the hotspot. Which one is correct?

When should I use Lightweight Hotspots analysis instead of Hotspots analysis?

I use Ubuntu 12.04 on an Intel i7-3630QM.

PS: Is there any analysis that shows a timeline of an application's execution (separated by thread)? I would like to see how much parallel work is being done.

David_A_Intel1
Employee

Hi Divino:

Why is the "CPU Time by Utilization" sometimes annotated on the function name instead of on the function's instructions?
Also, if I sum up the CPU Time of all instructions in the function, the total does not match the value attributed to the function name.

VTune Amplifier XE does not record the time spent in every single instruction. That would be more like instruction simulation, and the overhead would be enormous! Instead, VTune Amplifier XE does "periodic sampling" to show a statistical representation of what your application code is doing.
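
To make the idea of periodic sampling concrete, here is a minimal sketch in Python. It is not how VTune Amplifier XE is implemented; the instruction labels, durations, and the `sample_timeline` helper are all hypothetical. It only shows that charging one sampling interval to whatever is executing at each tick gives estimates, not exact per-instruction times.

```python
from bisect import bisect_right
from collections import Counter
from itertools import accumulate


def sample_timeline(timeline, interval):
    """Estimate time per instruction by sampling every `interval` time units.

    timeline: list of (label, duration) pairs executed back to back.
    Returns a Counter mapping label -> estimated time (samples * interval).
    """
    ends = list(accumulate(duration for _, duration in timeline))
    total = ends[-1]
    estimate = Counter()
    t = 0
    while t < total:
        # Index of the instruction that is executing at time t.
        index = bisect_right(ends, t)
        estimate[timeline[index][0]] += interval
        t += interval
    return estimate


# Hypothetical instruction stream: (label, true duration in arbitrary units).
timeline = [("load", 7), ("add", 1), ("mul", 2), ("store", 10)]
estimate = sample_timeline(timeline, interval=5)
for label, duration in timeline:
    print(f"{label:5s} true={duration:2d} sampled={estimate[label]:2d}")
```

In this toy run the short `add` and `mul` instructions receive no samples at all, while `load` is charged more than its true duration. With many more samples the estimates become more representative, which is what "statistical representation" means here.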

What is the effect of "Collect Stacks" in the Lightweight Hotspots analysis? When I collect with this option enabled, the results
tell me that the caller is the hotspot, but when I profile with it disabled, they tell me that the callee is the hotspot. Which one is correct?

That's it!  If you don't "collect stacks", you don't get the calling sequence to the hotspots.  Thus the term "lightweight hotspots".  If you collect only event-based sampling of CPU_CLK_UNHALTED and instructions retired, the overhead is very low, but you only get the location of the sample and not the calling sequence.
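
As a rough illustration of what the calling sequence adds, here is a small Python sketch. It is only an assumption about how samples could be aggregated, not a description of VTune's actual views, and the function names `main`, `solve`, and `dot` are made up. Without stacks, each sample can only be charged to the function where it landed (self time). With stacks, the same samples can also be charged to every function on the call path (total time), so a caller can rank above its callee.

```python
from collections import Counter


def self_counts(samples):
    """Charge each sample only to the function where it landed (stack leaf)."""
    return Counter(stack[-1] for stack in samples)


def total_counts(samples):
    """Charge each sample to every function on its call path."""
    counts = Counter()
    for stack in samples:
        # set() so a recursive function is counted once per sample.
        counts.update(set(stack))
    return counts


# Hypothetical samples; each tuple is a call stack, outermost caller first.
samples = (
    [("main", "solve", "dot")] * 6   # sample landed in dot, called from solve
    + [("main", "solve")] * 3        # sample landed in solve itself
    + [("main",)] * 1                # sample landed in main itself
)

print("self :", self_counts(samples).most_common())
print("total:", total_counts(samples).most_common())
```

Ranked by self counts, the callee `dot` is on top. Ranked by total counts, the caller `solve` outranks `dot` because it also inherits the samples taken inside `dot`. Both rankings come from the same samples, so neither is wrong; they answer different questions.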

When should I use Lightweight Hotspots analysis instead of Hotspots analysis?

You should use Lightweight Hotspots *without* stacks when you want to minimize the overhead of sampling. Use Hotspots when you need to know the calling sequences. Hotspots is more for algorithm tuning, while Lightweight Hotspots might be used to start your micro-architectural tuning (although General Exploration is better suited to micro-architectural tuning).

PS: Is there any analysis that shows a timeline of an application's execution (separated by thread)? I would like to see how much parallel work is being done.

All analysis types show a timeline of thread execution.  However, (Basic) Hotspots or Locks and Waits analysis types are better suited to analyzing parallel activity.

Note: starting with Update 9, the Hotspots analysis type is renamed "Basic Hotspots", while Lightweight Hotspots is renamed "Advanced Hotspots."

Divino_C_
New Contributor I

Hi Anderson,

Thank you for taking the time to clarify these points for me.
