Analysing MPI applications using VTune on the Xeon Phi

Rupert_F_
Beginner

Hi,

Could someone tell me whether it is possible to use VTune with MPI applications on the Xeon Phi and, if so, explain how to do it?

The suggested way to use VTune with MPI is the command-line tool:

http://software.intel.com/sites/products/Whitepaper/Clustertools/amplxe_inspxe_interop_with_mpi.pdf

but the command-line tool does not run directly on our Xeon Phi, and the Xeon Phi-specific information about VTune that I've found, e.g.:

http://software.intel.com/en-us/articles/optimization-and-performance-tuning-for-intel-xeon-phi-coprocessors-part-2-understanding

does not mention MPI.

Apologies if there is already documentation on how to do this.

Many thanks

Rupert

Peter_W_Intel
Employee

I tried an MPI program on a Xeon Phi(TM) coprocessor.

Here are the steps:

1. Set up the environment for the tools

# source /opt/intel/composer_xe_2013.1.117/bin/compilervars.sh intel64

# source /opt/intel/impi/4.1.0/bin64/mpivars.sh

# source /opt/intel/vtune_amplifier_xe_2013/amplxe-vars.sh 
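Before moving on, it can help to confirm that sourcing the three scripts actually put the tools on PATH. A minimal sketch, assuming a POSIX shell; check_tools is just an illustrative helper name, not part of any Intel package:

```shell
#!/bin/sh
# Report which of the given tools cannot be found on PATH.
# Prints one "missing: NAME" line per absent tool and returns
# non-zero if at least one tool is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Tool names are the ones used in the steps of this post.
check_tools mpiicc mpiexec amplxe-cl || echo "source the scripts from step 1 first"
```

If nothing is printed, all three commands were found on the host.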

2. Build the binary and copy it to the MIC device

# mpiicc -g -O3 -mmic program.c -o program

# scp program mic0:/root

3. Copy the Intel MPI launcher, proxy, and libraries onto the MIC device, then try a run

# scp /opt/intel/impi/4.1.0.024/mic/bin/mpiexec mic0:/bin

# scp /opt/intel/impi/4.1.0.024/mic/bin/pmi_proxy mic0:/bin

# scp /opt/intel/impi/4.1.0.024/mic/lib/libmpi.so.4 mic0:/lib64

or, for the debug version of the library:

# scp /opt/intel/impi/4.1.0.024/mic/lib/libmpi_dbg.so.4 mic0:/lib64

# scp /opt/intel/impi/4.1.0.024/mic/lib/libmpigf.so.4 mic0:/lib64

# scp /opt/intel/impi/4.1.0.024/mic/lib/libmpigc4.so.4 mic0:/lib64

# time ssh mic0 /bin/mpiexec -n 240 /root/program

(My Phi coprocessor exposes 244 logical cores, so 240 ranks fit.)
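The copy commands in step 3 can also be generated in a loop. The sketch below only prints them so they can be reviewed first. IMPI_MIC and MIC_HOST are illustrative variable names; their default values are the path and host used in this post, and the debug library is left out on the assumption that the regular libmpi.so.4 is wanted:

```shell
#!/bin/sh
# Print (do not run) the scp commands from step 3.
# IMPI_MIC and MIC_HOST are illustrative names; override them if your
# install path or coprocessor host name differs.
IMPI_MIC=${IMPI_MIC:-/opt/intel/impi/4.1.0.024/mic}
MIC_HOST=${MIC_HOST:-mic0}

print_copy_commands() {
    # Launcher and proxy go to /bin on the card.
    for f in mpiexec pmi_proxy; do
        echo "scp $IMPI_MIC/bin/$f $MIC_HOST:/bin"
    done
    # Runtime libraries go to /lib64 on the card.
    for f in libmpi.so.4 libmpigf.so.4 libmpigc4.so.4; do
        echo "scp $IMPI_MIC/lib/$f $MIC_HOST:/lib64"
    done
}

print_copy_commands
```

Once the printed lines look right, piping the output into sh runs the copies.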

4. Use VTune(TM) Amplifier XE to collect performance data

# amplxe-cl -collect knc_lightweight_hotspots -r mpi_res_target -search-dir all:rp=. -- ssh mic0 /bin/mpiexec -n 240 /root/program
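If you want to vary the rank count or the result name between runs, the step 4 command can be assembled by a small function. This is only a sketch: vtune_mpi_command is a made-up helper name, and it prints the command rather than running it. The options themselves are exactly the ones shown in step 4:

```shell
#!/bin/sh
# Print the step 4 collection command for a given rank count,
# result directory name, and program path on the card.
vtune_mpi_command() {
    ranks=$1
    result_dir=$2
    program=$3
    echo "amplxe-cl -collect knc_lightweight_hotspots -r $result_dir -search-dir all:rp=. -- ssh mic0 /bin/mpiexec -n $ranks $program"
}

# Reproduces the command from step 4.
vtune_mpi_command 240 mpi_res_target /root/program
```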

5. Results from different cores are stored in separate result directories. You can pick any one of them to analyze (they are similar, each covering one process on one core).

Use amplxe-gui to open a result directory and analyze it.
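To see which result directories a run produced, something like the following can be used. It is a sketch under one assumption that I have not verified for every version: that the per-process result directories share the name passed to -r (mpi_res_target here) as a prefix. list_results is an illustrative helper name:

```shell
#!/bin/sh
# List directories in the current directory whose names start with
# the given prefix. Plain files with the same prefix are skipped.
# Assumption: result directories are named after the -r argument.
list_results() {
    prefix=$1
    for d in "$prefix"*; do
        [ -d "$d" ] && echo "$d"
    done
    return 0
}

# Print one amplxe-gui command per result found (names without spaces assumed).
for r in $(list_results mpi_res_target); do
    echo "amplxe-gui $r"
done
```

Run it from the directory where amplxe-cl was started, then pick one of the printed commands.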
