I've got some pretty basic (I think) questions that I don't see covered in the documentation - any pointers would be greatly appreciated.
- After we've run an activity, which file(s) do we have to copy if we want to analyse the results on a different machine? And can runs made with the command-line version be opened with the Eclipse version of VTune?
- We would also like to know whether data gathered with VTune can show if our application is memory-bandwidth limited, as we need to compare the performance of dual dual-core machines versus dual quad-core ones.
To perform the analysis on a different machine from the one used for collection, it should be sufficient to copy the executable, the source files, and the .tb5 data collection files. It does not matter whether those files were collected under the GUI or the command line.

A frequently quoted answer on memory bandwidth limitation is to use the BUS_TRANS_BURST_SELF event counter to compare the bus traffic of the actual application with the maximum capability of the platform. My preference, for your application, would be to check whether the hot spots that don't scale well to quad core are the ones with a high rate of memory access events.

In either case, it is important to get good core affinity (lock threads to appropriate cores as much as possible); both performance problems and poor repeatability are associated with threads frequently migrating between cores.
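The BUS_TRANS_BURST_SELF comparison can be turned into a rough utilization figure. The sketch below is only an illustration under stated assumptions: it assumes each burst transaction moves one full 64-byte cache line, and it assumes you have measured the same event once for your application and once for a memory-saturating benchmark on the same platform. The function names and the sample counts are hypothetical, not VTune output.

```python
# Rough estimate of memory-bus utilization from burst-transaction counts.
# Assumption: one burst transaction transfers one full cache line.
CACHE_LINE_BYTES = 64  # assumed cache-line size; check your processor


def bus_bandwidth(burst_count, elapsed_s, line_bytes=CACHE_LINE_BYTES):
    """Return the estimated bus traffic in bytes per second."""
    if elapsed_s <= 0:
        raise ValueError("elapsed_s must be positive")
    return burst_count * line_bytes / elapsed_s


def bus_utilization(app_count, app_s, peak_count, peak_s):
    """Return application bandwidth as a fraction of the measured peak."""
    peak = bus_bandwidth(peak_count, peak_s)
    if peak == 0:
        raise ValueError("peak measurement shows no bus traffic")
    return bus_bandwidth(app_count, app_s) / peak


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    fraction = bus_utilization(1_500_000_000, 20.0, 1_000_000_000, 10.0)
    print(f"application uses about {fraction:.0%} of measured peak bandwidth")
```

A fraction close to 1 would suggest the application is already near what the platform can deliver, in which case adding cores is unlikely to help much. A low fraction points elsewhere, and the per-hot-spot comparison is then the more useful check.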