In the past I used OProfile to investigate load imbalance because it has a nice table output with one column per thread (see below). That makes it easy to localize load imbalance. Is there a way to get similar output formatting with VTune?
VTune provides a report in a similar format.
First run the analysis type you need, then get the report command line from the GUI; running that command generates the report.
For a Hotspots result, for example, you can press the >_ button to get the command line that generates the report.
Set Grouping to 'Thread/Function/Call Stack', then press the '>_' button on the right side to get the command line that generates the report.
vtune -report exec-query -rep-knob row-by="/GenericThread/Function/ParentCallStack" -sort-desc "CPU Time:Self" -rep-knob column-by="ViewpointGUIandCLIColumns" -r <result_dir>
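The command above groups rows by thread, so threads appear as rows rather than as columns. If a one-column-per-thread layout like OProfile's is wanted, the rows could be pivoted afterwards. Below is a minimal sketch, assuming the report has already been reduced to (thread, function, CPU time) triples. The helper names `pivot_by_thread` and `format_table` and the sample data are hypothetical and not part of VTune; how the triples are extracted from the report output is left open.

```python
from collections import defaultdict


def pivot_by_thread(rows):
    """Turn (thread, function, cpu_time) triples into {function: {thread: cpu_time}}."""
    table = defaultdict(lambda: defaultdict(float))
    for thread, function, cpu_time in rows:
        # Sum repeated entries, e.g. one per call stack of the same function.
        table[function][thread] += cpu_time
    return {fn: dict(per_thread) for fn, per_thread in table.items()}


def format_table(table):
    """Render one row per function and one column per thread."""
    threads = sorted({t for per_thread in table.values() for t in per_thread})
    # Put the most expensive functions (summed over all threads) first.
    functions = sorted(table, key=lambda fn: -sum(table[fn].values()))
    width = max([len("Function")] + [len(fn) for fn in functions])
    lines = ["Function".ljust(width) + "".join(f"  {t:>10}" for t in threads)]
    for fn in functions:
        cells = "".join(f"  {table[fn].get(t, 0.0):>10.3f}" for t in threads)
        lines.append(fn.ljust(width) + cells)
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical sample data, not real VTune output.
    sample = [
        ("Thread 1", "compute", 4.0),
        ("Thread 2", "compute", 1.0),
        ("Thread 1", "reduce", 0.5),
        ("Thread 2", "reduce", 0.5),
    ]
    print(format_table(pivot_by_thread(sample)))
```

With a layout like this, a function whose time differs strongly between thread columns stands out as a candidate for load imbalance.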
