Intel VTune is a great performance analysis tool. I am currently experiencing some performance issues and would like to use this tool to analyze them.
Environment in which the program runs:
An x86 cluster. I can only log in to the management node, from which I submit jobs to the compute nodes.
This is the script I use to submit the job. Its main steps are:
- Request computing resources
- Generate the hostfile
- Execute the program via mpirun
#!/bin/bash
#DSUB -n template-26c
#DSUB -A huakemeiranshaoshiyanshi
#DSUB -T 1000h0m0s
#DSUB -N 1
#DSUB -R cpu=26
#DSUB -o out.%J
#DSUB -e err.%J
#DSUB --job_type cosched
cores='26'
app='FPVFoam_transNO_hybrid'
source /home/huakemeiranshaoshiyanshi/zliu/weizy/evn.sh
echo ----- print env vars -----
if [ "${CCSCHEDULER_ALLOC_FILE}" != "" ]; then
echo " "
ls -la ${CCSCHEDULER_ALLOC_FILE}
echo ------ cat ${CCSCHEDULER_ALLOC_FILE}
cat ${CCSCHEDULER_ALLOC_FILE}
fi
export HOSTFILE=/tmp/hostfile.$$
rm -rf $HOSTFILE
touch $HOSTFILE
# Write one "host:count" line per allocation entry and sum the task counts.
ntask=$(awk -v fff="$HOSTFILE" '
{
    split($0, a, " ")
    if (length(a[1]) > 0 && length(a[3]) > 0) {
        print a[1]":"a[2] >> fff
        total_task += a[3]
    }
}
END { print total_task }' "${CCSCHEDULER_ALLOC_FILE}")
echo "hostfile $HOSTFILE generated:"
echo "-----------------------"
cat $HOSTFILE
echo "-----------------------"
echo "Total tasks is $ntask"
echo "mpirun -hostfile $HOSTFILE -n $ntask <your application>"
{ time -p `which mpirun` --hostfile $HOSTFILE -np $cores -env UCX_NET_DEVICES=mlx5_0:1 -env UCX_IB_GID_INDEX=3 -launcher ssh -launcher-exec /opt/batch/agent/tools/dstart $app -parallel > template-26c-1.log; }
ret=$?
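The hostfile step can be tried in isolation with the minimal sketch below. The allocation-file layout used here (`<host> <slots> <tasks>` per line) is an assumption inferred from the fields the script reads, and the node names, the temporary files and the `make_hostfile` helper are made up for illustration.

```shell
#!/bin/bash
# Hypothetical allocation file: "<host> <slots> <tasks>" per line (assumed layout).
alloc_file=$(mktemp)
hostfile=$(mktemp)
printf 'node01 26 26\nnode02 26 26\n' > "$alloc_file"

make_hostfile() {
    # $1 = allocation file, $2 = hostfile to write; prints the task total.
    : > "$2"                      # start from an empty hostfile
    awk -v out="$2" '
    NF >= 3 {                     # skip blank or incomplete lines
        print $1 ":" $2 >> out    # hostfile entry in "host:count" form
        total += $3               # accumulate the third column
    }
    END { print total + 0 }' "$1"
}

ntask=$(make_hostfile "$alloc_file" "$hostfile")
echo "Total tasks: $ntask"
cat "$hostfile"
rm -f "$alloc_file" "$hostfile"
```

Running it on a login node with a hand-written allocation file is a quick way to check the hostfile format before submitting a real job.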
Compilers and MPI:
Because the software I use is very old, I build it with the older GCC 4.8.5 compiler. The MPI implementation is MPICH 3.4b1.
(base) [zliu@cli01 ~]$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/libexec/gcc/x86_64-redhat-linux/4.8.5/lto-wrapper
Target: x86_64-redhat-linux
Configured with: ../configure --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --with-bugurl=http://bugzilla.redhat.com/bugzilla --enable-bootstrap --enable-shared --enable-threads=posix --enable-checking=release --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-gnu-unique-object --enable-linker-build-id --with-linker-hash-style=gnu --enable-languages=c,c++,objc,obj-c++,java,fortran,ada,go,lto --enable-plugin --enable-initfini-array --disable-libgcj --with-isl=/builddir/build/BUILD/gcc-4.8.5-20150702/obj-x86_64-redhat-linux/isl-install --with-cloog=/builddir/build/BUILD/gcc-4.8.5-20150702/obj-x86_64-redhat-linux/cloog-install --enable-gnu-indirect-function --with-tune=generic --with-arch_32=x86-64 --build=x86_64-redhat-linux
Thread model: posix
gcc version 4.8.5 20150623 (Red Hat 4.8.5-44) (GCC)
(base) [zliu@cli01 ~]$ mpirun --version
HYDRA build details:
Version: 3.4b1
Release Date: Mon Oct 5 21:47:25 CDT 2020
CC: gcc -std=gnu99 -std=gnu99
Configure options: '--disable-option-checking' '--prefix=/share/app/mpich/mpichapp' '--with-device=ch4:ucx' '--with-ucx=/share/app/mpich/ucx' '--cache-file=/dev/null' '--srcdir=.' 'CC=gcc -std=gnu99 -std=gnu99' 'CFLAGS= -O2' 'LDFLAGS= -L/share/app/mpich/ucx/lib' 'LIBS=-lucp -lucp ' 'CPPFLAGS= -I/share/app/mpich/ucx/include -DNETMOD_INLINE=__netmod_inline_ucx__ -I/share/app/mpich/mpich-3.4b1/src/mpl/include -I/share/app/mpich/mpich-3.4b1/src/mpl/include -I/share/app/mpich/mpich-3.4b1/modules/yaksa/src/frontend/include -I/share/app/mpich/mpich-3.4b1/modules/yaksa/src/frontend/include -I/share/app/mpich/mpich-3.4b1/modules/json-c -I/share/app/mpich/mpich-3.4b1/modules/json-c -D_REENTRANT -I/share/app/mpich/mpich-3.4b1/src/mpi/romio/include' 'MPLLIBNAME=mpl'
Process Manager: pmi
Launchers available: ssh rsh fork slurm ll lsf sge manual persist
Topology libraries available: hwloc
Resource management kernels available: user slurm ll lsf sge pbs cobalt
Demux engines available: poll select
Now that I have installed Intel VTune, I would like to use it to analyze these performance problems.
My questions are:
- To collect performance data with Intel VTune, do I need to recompile the program with ICC and Intel MPI?
- How should I modify the job submission script so that VTune can collect data? (I cannot run mpirun directly; I can only submit jobs.)
- To view the results graphically, do I just need to copy the collected result files to my local machine? I have also installed Intel VTune on my local Windows machine.
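For the second question, the kind of change I have in mind is sketched below. It is untested on this cluster and rests on several assumptions: that the `vtune` command-line tool is on the PATH of the compute nodes, that placing it between the mpirun options and the application profiles each rank, and that `-trace-mpi` is the appropriate option for a non-Intel MPI such as MPICH. The `build_vtune_cmd` helper, the `vtune_result` directory name and the placeholder hostfile path are made up for illustration. The UCX and launcher options from the real script are left out for brevity, and the sketch only prints the command rather than running it.

```shell
#!/bin/bash
# Sketch only: builds and prints the command line; nothing is executed.
cores=26
app='FPVFoam_transNO_hybrid'
HOSTFILE=${HOSTFILE:-/tmp/hostfile.example}   # placeholder path (assumption)
result_dir=vtune_result                       # hypothetical result directory name

build_vtune_cmd() {
    # Assumption: vtune sits between mpirun's options and the application,
    # so every MPI rank is started under the collector.
    echo "mpirun --hostfile $HOSTFILE -np $cores" \
         "vtune -collect hotspots -trace-mpi -result-dir $result_dir" \
         "-- $app -parallel"
}

build_vtune_cmd
```

If this is roughly right, the printed command would replace the mpirun invocation inside the `time -p` block of the submission script.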
In the end, I hope to get a summary that characterizes the run in terms of MPI, I/O, compute, and so on.
Thanks!