Can VTune 9.1 Update 2 be used to instrument code running concurrently on multiple nodes? If so, which documentation should I read to figure out how to install and configure things?
Or, is this a question best left to the Intel support team?
Thank you.
2 Replies
Quoting - slojuggler
Can VTune 9.1 Update 2 be used to instrument code running concurrently on multiple nodes? If so, which documentation should I read to figure out how to install and configure things?
Or, is this a question best left to the Intel support team?
Thank you.
Intel VTune Performance Analyzer is not designed for cluster systems. However, you can simulate distributed computing on one node; see http://software.intel.com/en-us/articles/performance-tools-for-software-developers-does-vtune-analyzer-support-profiling-of-mpi-applications/
Thanks, Peter
As Peter said, it's often suitable to profile an entire MPI job running on a single node. The MPI functions are easier to see when the MPI library is statically linked, even though that may not be your normal running mode. Also, with MPI, OpenMP, or a hybrid of the two, it helps to raise the spin lock transition numbers so that all wait times are counted in the application.
A possible way of profiling on multiple nodes is by using the PTU relative of VTune (see WhatIf forum) to generate an SEP batch command which may be run across a cluster under MPI, saving a tb5 file for each node.
As you're probably aware, specialized MPI profilers, such as jumpshot or Intel Trace Collector/Analyzer, are best for seeing the messaging paths and latencies. If you have Intel Trace Collector installed (basically, a profiling MPI library) and Intel MPI dynamically linked, you can activate profiling simply by adding -trace to the mpiexec command.
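To make the -trace step concrete, here is a minimal Python sketch that assembles such an mpiexec command line. The helper name build_mpiexec_command and the application path ./my_app are hypothetical, and the -n process-count option is an assumption about your launcher; only the -trace option comes from the description above.

```python
import shlex


def build_mpiexec_command(num_procs, program, program_args=(), trace=False):
    """Return an mpiexec argument list (hypothetical helper, for illustration).

    When trace is True, -trace is added so that a dynamically linked
    Intel MPI job picks up the Intel Trace Collector profiling library.
    """
    if num_procs < 1:
        raise ValueError("num_procs must be at least 1")
    cmd = ["mpiexec"]
    if trace:
        cmd.append("-trace")
    # -n as the process-count option is an assumption about the launcher.
    cmd += ["-n", str(num_procs), program, *program_args]
    return cmd


if __name__ == "__main__":
    # ./my_app is a placeholder for your own MPI executable.
    print(shlex.join(build_mpiexec_command(4, "./my_app", trace=True)))
```

The resulting list can be passed to subprocess.run once you have confirmed that the options match your installed MPI version.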
For MPI/OpenMP hybrid, the Intel profiling OpenMP library is useful for profiling the OpenMP process.
