Profiling MPI applications using VTune, collecting data only on select nodes

Paulius_V_ · ‎12-21-2016

Hello all.

I am trying to profile an MPI application running on 6 nodes with 120 MPI ranks in total.

I am interested in either a) metrics on a single node or b) average metrics across all nodes.

Is there an easy way to do a run for option b)?

Can anyone think of any ways that my data can be skewed if I'm only collecting metrics on a single rank?

Thank you.

TimP · ‎12-22-2016

This subject seems more likely to get the attention of experts if posted on the HPC clusters companion forum.

I don't consider it easy, but you may need to run the VTune command line, collecting a separate .tb5 for each rank. This would enable you to analyze performance issues that aren't replicated in all ranks. Using Intel Trace Analyzer to determine whether there are such issues would presumably be easier.
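
Once you have one result per rank, a rough way to get the "average across ranks" figure, and to see whether a single rank would be representative, is to compare each rank against the mean. The following is only a minimal sketch: it assumes you have already extracted one number per rank (for example CPU time) from the per-rank results by some means of your own. The function name `summarize_ranks`, the `threshold` parameter and the sample numbers are all hypothetical and are not part of VTune.

```python
from statistics import mean, pstdev


def summarize_ranks(per_rank, threshold=2.0):
    """Return (average, outlier_ranks) for a {rank: metric_value} mapping.

    A rank counts as an outlier when its value lies more than
    `threshold` population standard deviations from the average.
    """
    values = list(per_rank.values())
    avg = mean(values)
    spread = pstdev(values)
    if spread == 0:
        # All ranks report the same value, so any rank is representative.
        return avg, []
    outliers = sorted(
        rank for rank, value in per_rank.items()
        if abs(value - avg) > threshold * spread
    )
    return avg, outliers


# Hypothetical CPU-time figures (seconds) for 6 ranks; rank 3 is a straggler.
cpu_time = {0: 10.0, 1: 10.2, 2: 9.9, 3: 15.0, 4: 10.1, 5: 9.8}
avg, outliers = summarize_ranks(cpu_time)
print(f"average = {avg:.2f} s, outlier ranks = {outliers}")
```

If the outlier list is empty, profiling a single rank is probably a fair proxy for the whole job. If it is not, a single-rank profile can be skewed in either direction, depending on whether you happened to pick the straggler or one of the ranks waiting on it.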

Useful results can usually be obtained by running a single copy of VTune when the job runs on a single node.

Trace Analyzer is a descendant of the original Vampir, which is available as open source; another open source alternative is Jumpshot.