I am trying to analyze an application via vtune amplifier (parallel studio 2016). I had used following command line:
time mpirun -np 1 amplxe-cl -collect advanced-hotspots -result-dir result_$HOSTNAME ./myApp
and i got:-
And vtune's GUI reports:-
Elapsed Time: 345.404s CPU Time: 4287.747s Effective Time: 1095.314s Idle: 0.176s Poor: 407.717s Ok: 47.757s Ideal: 639.664s Over: 0s Spin Time: 3191.179s Imbalance or Serial Spinning (OpenMP): 229.153s Lock Contention (OpenMP): 2955.375s Communication (MPI): 0s Other: 6.651s Overhead Time: 1.254s Creation (OpenMP): 0s Scheduling (OpenMP): 0s Reduction (OpenMP): 0s Other: 1.254s Instructions Retired: 5,937,272,500,000 CPI Rate: 1.803 CPU Frequency Ratio: 1.000 Paused Time: 0s
Earlier i had an idea that that Vtune's reported "elapsed time" is the actual elapsed time of application.
Now i am bit confused .
Could anyone please throw a light on this issue.
Eagerly awaiting your replies.
The Linux "time" command will include the execution time of your job, plus execution time of the VTune post-processing step(s). These post-processing steps can be very time-consuming....
It is also possible that the "Elapsed Time" that VTune reports is an estimate based on the actual elapsed time minus an estimate of the overhead introduced by VTune.
If you re-run your code without VTune, the elapsed time should help make it clear how to interpret the VTune output.
Varying the number of OpenMP threads and seeing how this changes the VTune time breakdown should also help....
After recompilation (without -g flag) and re-run , instance #1 took:-
It seems: profiled - non profiled =~ 5 minutes.
Does the elapsed time(
Elapsed Time: 345.404s or ~5 Minutes) in the vtune log means - the time taken by vtune for collecting data ??
John is right - Linux time command in your case will measure application under vtune collection time (that is Elapsed time that VTune shows) + Vtune result finalization time that can be significant.
So if you want to compare what Linux time command shows towards VTune elapsed time you can do the following:
mpirun -np 1 amplxe-cl -collect advanced-hotspots -result-dir result_$HOSTNAME time ./myApp
To avoid confusions - application elapsed time that VTune shows is elapsed time of the application under profiling and it will include the overhead that the tool introduces if any.
Also I would recommend to upgrade VTune to 2017 Update 1 and try the new HPC Performance Characterization analysis "hpc-performance" command line name - it will show you also OpenMP efficiency metrics along with memory/FPU metrics in a unified report.
Thanks & Regards, Dmitry