jimdempseyatthecove
Black Belt

Improve multi-threading of VTune itself

When performing VTune (16.0.3) runs of a few minutes (e.g. 3), the time to read in and finalize the sample data is excruciatingly long. Looking at the resource display of the System Monitor (Linux on KNL), it appears that very few threads, perhaps only one, are involved in preparing the data for analysis display.

Can this be made more multi-threaded?

Jim Dempsey

Dmitry_P_Intel1
Employee

Hello Jim,

The point is valid, and the VTune team has been working on parallelizing the finalization step. Stay tuned.

In the meantime, here is one tip: collect on KNL with the -no-auto-finalize option, and then finalize the result on a Xeon host machine. Since single-thread performance is better there, the result will be finalized faster. One thing to note: you must explicitly set the search directories for the binaries on the host, with something like this:

>amplxe-cl -finalize -r <my_result_dir> -search-dir=<my_bin_dir> -search-dir=<openmp_runtime_dir>
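The two steps can be put together roughly as in the sketch below. It only prints the command lines so they can be checked before running. The helper names (collect_cmd, finalize_cmd), the analysis type "hotspots", the result name r000hs, and all paths are placeholders rather than anything prescribed above. It also assumes the result directory is visible on the host, for example through a shared filesystem or a copy.

```shell
#!/bin/sh
# Sketch only: print the two amplxe-cl command lines for the workflow above.
# collect_cmd and finalize_cmd are hypothetical helpers, not part of VTune.

# Step 1 (on KNL): collect without finalizing.
# Arguments: analysis type, result directory, then the application command line.
collect_cmd() {
    analysis=$1
    result_dir=$2
    shift 2
    printf 'amplxe-cl -collect %s -no-auto-finalize -r %s -- %s\n' \
        "$analysis" "$result_dir" "$*"
}

# Step 2 (on the Xeon host): finalize, with one -search-dir per directory.
# Arguments: result directory, then any number of search directories.
finalize_cmd() {
    result_dir=$1
    shift
    cmd="amplxe-cl -finalize -r $result_dir"
    for dir in "$@"; do
        cmd="$cmd -search-dir=$dir"
    done
    printf '%s\n' "$cmd"
}

# Placeholder values; substitute your own analysis type, result name and paths.
collect_cmd hotspots r000hs ./my_app
finalize_cmd r000hs /path/to/my_bin_dir /path/to/openmp_runtime_dir
```

Printing rather than executing keeps the sketch safe to try on a machine without VTune installed; once the lines look right, run each one on the matching machine.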

Thanks & Regards, Dmitry
