- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Hi all,
I am trying to analyze an application via vtune amplifier (parallel studio 2016). I had used following command line:
time mpirun -np 1 amplxe-cl -collect advanced-hotspots -result-dir result_$HOSTNAME ./myApp
and i got:-
real 96m46.763s
user 365m52.038s
sys 7m6.127s
And vtune's GUI reports:-
Elapsed Time: 345.404s CPU Time: 4287.747s Effective Time: 1095.314s Idle: 0.176s Poor: 407.717s Ok: 47.757s Ideal: 639.664s Over: 0s Spin Time: 3191.179s Imbalance or Serial Spinning (OpenMP): 229.153s Lock Contention (OpenMP): 2955.375s Communication (MPI): 0s Other: 6.651s Overhead Time: 1.254s Creation (OpenMP): 0s Scheduling (OpenMP): 0s Reduction (OpenMP): 0s Other: 1.254s Instructions Retired: 5,937,272,500,000 CPI Rate: 1.803 CPU Frequency Ratio: 1.000 Paused Time: 0s
Earlier i had an idea that that Vtune's reported "elapsed time" is the actual elapsed time of application.
Now i am bit confused .
Could anyone please throw a light on this issue.
Eagerly awaiting your replies.
Link Copied
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
The Linux "time" command will include the execution time of your job, plus execution time of the VTune post-processing step(s). These post-processing steps can be very time-consuming....
It is also possible that the "Elapsed Time" that VTune reports is an estimate based on the actual elapsed time minus an estimate of the overhead introduced by VTune.
If you re-run your code without VTune, the elapsed time should help make it clear how to interpret the VTune output.
Varying the number of OpenMP threads and seeing how this changes the VTune time breakdown should also help....
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Hi,
After recompilation (without -g flag) and re-run , instance #1 took:-
real 92m47.692s
user 365m15.774s
sys 6m44.852s
instance #2:-
real 89m16.851s
user 368m14.433s
sys 5m53.945s
It seems: profiled - non profiled =~ 5 minutes.
Does the elapsed time(Elapsed Time: 345.404s or ~5 Minutes
) in the vtune log means - the time taken by vtune for collecting data ??
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Hello,
John is right - Linux time command in your case will measure application under vtune collection time (that is Elapsed time that VTune shows) + Vtune result finalization time that can be significant.
So if you want to compare what Linux time command shows towards VTune elapsed time you can do the following:
mpirun -np 1 amplxe-cl -collect advanced-hotspots -result-dir result_$HOSTNAME time ./myApp
To avoid confusions - application elapsed time that VTune shows is elapsed time of the application under profiling and it will include the overhead that the tool introduces if any.
Also I would recommend to upgrade VTune to 2017 Update 1 and try the new HPC Performance Characterization analysis "hpc-performance" command line name - it will show you also OpenMP efficiency metrics along with memory/FPU metrics in a unified report.
Thanks & Regards, Dmitry

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page