Improve multi-threading of VTune itself

jimdempseyatthecove
Black Belt

When performing VTune (16.0.3) runs of a few minutes (e.g. 3), the time to read in and finalize the sample data is excruciatingly long. Looking at the resource display of the System Monitor (Linux on KNL), it appears that very few threads, perhaps only one, are involved in preparing the data for the analysis display.

Can this be made more multi-threaded?

Jim Dempsey

Dmitry_P_Intel1
Employee

Hello Jim,

The point is valid, and the VTune team has been working on parallelizing the finalization step. Stay tuned on this.

In the meantime, there is a workaround: collect on KNL with the -no-auto-finalize option, then finalize the result on a Xeon host machine. Single-thread performance is better there, so finalization completes faster. One thing to note: on the host you must explicitly set the search directories for binaries, with something like this:

>amplxe-cl -finalize -r <my_result_dir> -search-dir=<my_bin_dir> -search-dir=<openmp_runtime_dir>
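To make the two steps concrete, here is a rough sketch of the workflow as a small shell helper. It only prints the finalize command rather than running it. The function name build_finalize_cmd, the result directory ./r000, and the paths are hypothetical placeholders, and it assumes the result directory is visible on the host (shared filesystem or copied over).

```shell
#!/bin/sh
# Sketch only: names and paths below are illustrative assumptions.

# Step 1 (on the KNL node): collect without finalizing. Shown as a comment
# because the analysis type and application are site-specific:
#   amplxe-cl -collect <analysis_type> -no-auto-finalize -r ./r000 -- ./my_app

# Step 2 (on the Xeon host): build the finalize command line.
# Usage: build_finalize_cmd <result_dir> <search_dir>...
# Prints the command instead of executing it, so it can be inspected first.
# Note: directories containing spaces are not handled by this simple sketch.
build_finalize_cmd() {
    result_dir=$1
    shift
    cmd="amplxe-cl -finalize -r $result_dir"
    # Append one -search-dir option per remaining argument.
    for dir in "$@"; do
        cmd="$cmd -search-dir=$dir"
    done
    printf '%s\n' "$cmd"
}

# Hypothetical example: application binaries plus the OpenMP runtime directory.
build_finalize_cmd ./r000 /path/to/my_bin_dir /path/to/openmp_runtime_dir
```

Once the printed command looks right, it can be pasted into a shell on the host, or the printf line can be swapped for direct execution.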

Thanks & Regards, Dmitry
