Intel® MPI Library

Profiling a complex MPI Application : CESM (Community Earth System Model)

Nitin_Kundapur_B_

Hello. 

CESM is a complex, highly parallel climate model built on MPI.

I am looking for ways to profile CESM runs. The default profiler provides data for only a few routines. I have tried external profilers such as TAU, HPCToolkit, Allinea MAP, the Intel Trace Analyzer and Collector (ITAC), and VTune.

Since I am running CESM across a cluster (8 nodes with 16 processors each), HPCToolkit and Allinea MAP proved the most useful for profiling. However, I want two metrics for each CESM routine executed:

1) Total execution time of the function

2) Number of function calls made

Neither HPCToolkit nor Allinea MAP reports the number of calls made to a routine.

The call count matters because, together with the total execution time, it gives me the average time taken by each call of a function. Has anyone managed to obtain this? Is there a way to do it with any of these tools?
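For reference, the per-call figure described above is just the total time divided by the call count. A minimal sketch of that arithmetic follows; `per_call_time` is a hypothetical helper, not part of any of the profilers mentioned, and the numbers in the example are made up.

```shell
# Average time per call = total time in the routine / number of calls.
# per_call_time is a hypothetical helper, not a profiler command.
per_call_time() {
    # $1 = total time in seconds, $2 = number of calls
    awk -v t="$1" -v n="$2" 'BEGIN {
        if (n <= 0) {
            print "call count must be positive" > "/dev/stderr"
            exit 1
        }
        printf "%.6f\n", t / n
    }'
}

# Made-up example: 12.5 s spent in a routine across 50 calls.
per_call_time 12.5 50
```

Note that if the total is an inclusive time (it also counts time spent in callees, including MPI calls), the per-call average is inclusive as well.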

 

Thanks,

Nitin K Bhat

SERC,

Indian Institute of Science

Gergana_S_Intel
Employee

Hi Nitin,

I can only speak for the Intel tools, since those are the ones I've used. The Intel Trace Analyzer and Collector can provide the following information:

[Screenshot: itac_flat_profile.jpg]

As you can see, this lists all routines, how much time was spent in each one (across all MPI ranks), the number of times each was called, and the ratio. Is this what you're looking for?

If you're running the Intel MPI Library, you can also gather statistics that'll give you similar info.  For that, you just have to set I_MPI_STATS=<level> where <level> is a value between 1 and 10 (depending on the amount of info you want to collect).  More info is available in the Statistics Gathering Mode section of our Reference Manual.
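A minimal sketch of how the variable might be set for a run, assuming the Intel MPI environment has already been sourced. The level `4`, the rank count, and the binary name `./ft.B.8` are illustrative assumptions; check the Reference Manual for what each level collects and where the output is written.

```shell
# Hypothetical application binary; substitute your own executable.
app=./ft.B.8

# Statistics level between 1 and 10; higher levels collect more information.
export I_MPI_STATS=4

# Launch only if an MPI launcher and the binary are actually available.
if command -v mpirun >/dev/null 2>&1 && [ -x "$app" ]; then
    mpirun -n 8 "$app"
else
    echo "skipping run: mpirun or $app not available" >&2
fi
```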

Let me know how this helps.  Perhaps others can chime in with their experiences.

Regards,
~Gergana

Nitin_Kundapur_B_

 

Thank you for the detailed and quick response. I have used the Intel Trace Analyzer previously, but I am only able to get data for the MPI communication. I want the timings for the user code, i.e. each F90 routine used in the application (I am less interested in the overall execution times of the MPI communication functions).

The time for each F90 routine can include the time spent in MPI communication within that routine. When I expand the user code in the Trace Analyzer, I don't see the times of the individual routines.

Is there a way to profile the user code, without the MPI statistics, using the Intel Trace Analyzer? So far I can only use it as an MPI tracer rather than as an actual function profiler.

Thanks, 

Nitin K Bhat, 

SERC,

Indian Institute of Science

Nitin_Kundapur_B_

Is it possible to profile user-defined Fortran routines with the ITAC collector, using the -tcollect option? (The documentation suggests that -tcollect performs source code instrumentation.)

I wanted to try the -tcollect flag to check user-defined routine profiling, so I tried it on the NAS Parallel Benchmarks. I added -tcollect to both the C flags and the Fortran flags, and I have sourced itacvars.sh. I have the following environment variables:

[nitin@master NPB3.3-MPI]$ env | grep -i VT
VT_MPI=impi4
LD_PRELOAD=libVT.so
VT_ADD_LIBS=-ldwarf -lelf -lvtunwind -lnsl -lm -ldl -lpthread
VT_LIB_DIR=/opt/intel//itac/9.0.3.049/intel64/lib
VT_ROOT=/opt/intel//itac/9.0.3.049
VT_SLIB_DIR=/opt/intel//itac/9.0.3.049/intel64/slib
VT_ARCH=intel64


[nitin@master NPB3.3-MPI]$ env | grep -i I_MPI
I_MPI_F77=ifortre
I_MPI_STATS=1-10
I_MPI_F90=ifort
I_MPI_CC=icc
I_MPI_CXX=icpc
I_MPI_ROOT=/opt/intel//impi/5.0.3.048

 

When I try compiling the NAS parallel benchmarks with the -tcollect option, I get the following error. 

[nitin@master NPB3.3-MPI]$ make ft CLASS=B NPROCS=8
   =========================================
   =      NAS Parallel Benchmarks 3.3      =
   =      MPI/F77/C                        =
   =========================================

cd FT; make NPROCS=8 CLASS=B
make[1]: Entering directory `/storage/home/nitin/NAS/NPB3.3.1/NPB3.3-MPI/FT'
make[2]: Entering directory `/storage/home/nitin/NAS/NPB3.3.1/NPB3.3-MPI/sys'
cc -g  -o setparams setparams.c
make[2]: Leaving directory `/storage/home/nitin/NAS/NPB3.3.1/NPB3.3-MPI/sys'
../sys/setparams ft 8 B
mpiifort -c  -g -tcollect=VT ft.f
cd ../common; mpiifort -c  -g -tcollect=VT randi8.f
cd ../common; mpiifort -c  -g -tcollect=VT print_results.f
cd ../common; mpiifort -c  -g -tcollect=VT timers.f
mpiifort  -o ../bin/ft.B.8 ft.o ../common/randi8.o ../common/print_results.o ../common/timers.o -L/opt/intel/itac/9.0.3.049/lib -ldwarf -lelf -lvtunwind -lnsl -lm -ldl -lpthread
ft.o: In function `ft':
/storage/home/nitin/NAS/NPB3.3.1/NPB3.3-MPI/FT/ft.f:79: undefined reference to `__VT_IntelEntry'
/storage/home/nitin/NAS/NPB3.3.1/NPB3.3-MPI/FT/ft.f:79: undefined reference to `__VT_IntelExit'
ft.o: In function `evolve':
/storage/home/nitin/NAS/NPB3.3.1/NPB3.3-MPI/FT/ft.f:204: undefined reference to `__VT_IntelEntry'
/storage/home/nitin/NAS/NPB3.3.1/NPB3.3-MPI/FT/ft.f:204: undefined reference to `__VT_IntelExit'
ft.o: In function `compute_initial_conditions':
/storage/home/nitin/NAS/NPB3.3.1/NPB3.3-MPI/FT/ft.f:238: undefined reference to `__VT_IntelEntry'
/storage/home/nitin/NAS/NPB3.3.1/NPB3.3-MPI/FT/ft.f:238: undefined reference to `__VT_IntelExit'
ft.o: In function `ipow46':
/storage/home/nitin/NAS/NPB3.3.1/NPB3.3-MPI/FT/ft.f:291: undefined reference to `__VT_IntelEntry'
/storage/home/nitin/NAS/NPB3.3.1/NPB3.3-MPI/FT/ft.f:291: undefined reference to `__VT_IntelExit'
ft.o: In function `setup':
/storage/home/nitin/NAS/NPB3.3.1/NPB3.3-MPI/FT/ft.f:348: undefined reference to `__VT_IntelEntry'
/storage/home/nitin/NAS/NPB3.3.1/NPB3.3-MPI/FT/ft.f:348: undefined reference to `__VT_IntelExit'
ft.o: In function `compute_indexmap':
/storage/home/nitin/NAS/NPB3.3.1/NPB3.3-MPI/FT/ft.f:664: undefined reference to `__VT_IntelEntry'
/storage/home/nitin/NAS/NPB3.3.1/NPB3.3-MPI/FT/ft.f:664: undefined reference to `__VT_IntelExit'
ft.o: In function `print_timers':
/storage/home/nitin/NAS/NPB3.3.1/NPB3.3-MPI/FT/ft.f:746: undefined reference to `__VT_IntelEntry'
/storage/home/nitin/NAS/NPB3.3.1/NPB3.3-MPI/FT/ft.f:746: undefined reference to `__VT_IntelExit'
ft.o: In function `fft':
/storage/home/nitin/NAS/NPB3.3.1/NPB3.3-MPI/FT/ft.f:808: undefined reference to `__VT_IntelEntry'
/storage/home/nitin/NAS/NPB3.3.1/NPB3.3-MPI/FT/ft.f:808: undefined reference to `__VT_IntelExit'
ft.o: In function `cffts1':
/storage/home/nitin/NAS/NPB3.3.1/NPB3.3-MPI/FT/ft.f:902: undefined reference to `__VT_IntelEntry'
/storage/home/nitin/NAS/NPB3.3.1/NPB3.3-MPI/FT/ft.f:902: undefined reference to `__VT_IntelExit'
ft.o: In function `cffts2':
/storage/home/nitin/NAS/NPB3.3.1/NPB3.3-MPI/FT/ft.f:949: undefined reference to `__VT_IntelEntry'
/storage/home/nitin/NAS/NPB3.3.1/NPB3.3-MPI/FT/ft.f:949: undefined reference to `__VT_IntelExit'
ft.o: In function `cffts3':
/storage/home/nitin/NAS/NPB3.3.1/NPB3.3-MPI/FT/ft.f:996: undefined reference to `__VT_IntelEntry'
/storage/home/nitin/NAS/NPB3.3.1/NPB3.3-MPI/FT/ft.f:996: undefined reference to `__VT_IntelExit'
ft.o: In function `fft_init':
/storage/home/nitin/NAS/NPB3.3.1/NPB3.3-MPI/FT/ft.f:1043: undefined reference to `__VT_IntelEntry'
/storage/home/nitin/NAS/NPB3.3.1/NPB3.3-MPI/FT/ft.f:1043: undefined reference to `__VT_IntelExit'
ft.o: In function `cfftz':
/storage/home/nitin/NAS/NPB3.3.1/NPB3.3-MPI/FT/ft.f:1087: undefined reference to `__VT_IntelEntry'
/storage/home/nitin/NAS/NPB3.3.1/NPB3.3-MPI/FT/ft.f:1087: undefined reference to `__VT_IntelExit'
ft.o: In function `fftz2':
/storage/home/nitin/NAS/NPB3.3.1/NPB3.3-MPI/FT/ft.f:1149: undefined reference to `__VT_IntelEntry'
/storage/home/nitin/NAS/NPB3.3.1/NPB3.3-MPI/FT/ft.f:1149: undefined reference to `__VT_IntelExit'
ft.o: In function `ilog2':
/storage/home/nitin/NAS/NPB3.3.1/NPB3.3-MPI/FT/ft.f:1208: undefined reference to `__VT_IntelEntry'
/storage/home/nitin/NAS/NPB3.3.1/NPB3.3-MPI/FT/ft.f:1208: undefined reference to `__VT_IntelExit'
ft.o: In function `transpose_x_yz':
/storage/home/nitin/NAS/NPB3.3.1/NPB3.3-MPI/FT/ft.f:1233: undefined reference to `__VT_IntelEntry'
/storage/home/nitin/NAS/NPB3.3.1/NPB3.3-MPI/FT/ft.f:1233: undefined reference to `__VT_IntelExit'
ft.o: In function `transpose_xy_z':
/storage/home/nitin/NAS/NPB3.3.1/NPB3.3-MPI/FT/ft.f:1258: undefined reference to `__VT_IntelEntry'
/storage/home/nitin/NAS/NPB3.3.1/NPB3.3-MPI/FT/ft.f:1258: undefined reference to `__VT_IntelExit'
ft.o: In function `transpose2_local':
/storage/home/nitin/NAS/NPB3.3.1/NPB3.3-MPI/FT/ft.f:1282: undefined reference to `__VT_IntelEntry'
/storage/home/nitin/NAS/NPB3.3.1/NPB3.3-MPI/FT/ft.f:1282: undefined reference to `__VT_IntelExit'
ft.o: In function `transpose2_global':
/storage/home/nitin/NAS/NPB3.3.1/NPB3.3-MPI/FT/ft.f:1351: undefined reference to `__VT_IntelEntry'
/storage/home/nitin/NAS/NPB3.3.1/NPB3.3-MPI/FT/ft.f:1351: undefined reference to `__VT_IntelExit'
ft.o: In function `transpose2_finish':
/storage/home/nitin/NAS/NPB3.3.1/NPB3.3-MPI/FT/ft.f:1379: undefined reference to `__VT_IntelEntry'
/storage/home/nitin/NAS/NPB3.3.1/NPB3.3-MPI/FT/ft.f:1379: undefined reference to `__VT_IntelExit'
ft.o: In function `transpose_x_z':
/storage/home/nitin/NAS/NPB3.3.1/NPB3.3-MPI/FT/ft.f:1409: undefined reference to `__VT_IntelEntry'
/storage/home/nitin/NAS/NPB3.3.1/NPB3.3-MPI/FT/ft.f:1409: undefined reference to `__VT_IntelExit'
ft.o: In function `transpose_x_z_local':
/storage/home/nitin/NAS/NPB3.3.1/NPB3.3-MPI/FT/ft.f:1432: undefined reference to `__VT_IntelEntry'
/storage/home/nitin/NAS/NPB3.3.1/NPB3.3-MPI/FT/ft.f:1432: undefined reference to `__VT_IntelExit'
ft.o: In function `transpose_x_z_global':
/storage/home/nitin/NAS/NPB3.3.1/NPB3.3-MPI/FT/ft.f:1507: undefined reference to `__VT_IntelEntry'
/storage/home/nitin/NAS/NPB3.3.1/NPB3.3-MPI/FT/ft.f:1507: undefined reference to `__VT_IntelExit'
ft.o: In function `transpose_x_z_finish':
/storage/home/nitin/NAS/NPB3.3.1/NPB3.3-MPI/FT/ft.f:1537: undefined reference to `__VT_IntelEntry'
/storage/home/nitin/NAS/NPB3.3.1/NPB3.3-MPI/FT/ft.f:1537: undefined reference to `__VT_IntelExit'
ft.o: In function `transpose_x_y':
/storage/home/nitin/NAS/NPB3.3.1/NPB3.3-MPI/FT/ft.f:1582: undefined reference to `__VT_IntelEntry'
/storage/home/nitin/NAS/NPB3.3.1/NPB3.3-MPI/FT/ft.f:1582: undefined reference to `__VT_IntelExit'
ft.o: In function `transpose_x_y_local':
/storage/home/nitin/NAS/NPB3.3.1/NPB3.3-MPI/FT/ft.f:1616: undefined reference to `__VT_IntelEntry'
/storage/home/nitin/NAS/NPB3.3.1/NPB3.3-MPI/FT/ft.f:1616: undefined reference to `__VT_IntelExit'
ft.o: In function `transpose_x_y_global':
/storage/home/nitin/NAS/NPB3.3.1/NPB3.3-MPI/FT/ft.f:1644: undefined reference to `__VT_IntelEntry'
/storage/home/nitin/NAS/NPB3.3.1/NPB3.3-MPI/FT/ft.f:1644: undefined reference to `__VT_IntelExit'
ft.o: In function `transpose_x_y_finish':
/storage/home/nitin/NAS/NPB3.3.1/NPB3.3-MPI/FT/ft.f:1678: undefined reference to `__VT_IntelEntry'
/storage/home/nitin/NAS/NPB3.3.1/NPB3.3-MPI/FT/ft.f:1678: undefined reference to `__VT_IntelExit'
ft.o: In function `checksum':
/storage/home/nitin/NAS/NPB3.3.1/NPB3.3-MPI/FT/ft.f:1724: undefined reference to `__VT_IntelEntry'
/storage/home/nitin/NAS/NPB3.3.1/NPB3.3-MPI/FT/ft.f:1724: undefined reference to `__VT_IntelExit'
ft.o: In function `synchup':
/storage/home/nitin/NAS/NPB3.3.1/NPB3.3-MPI/FT/ft.f:1772: undefined reference to `__VT_IntelEntry'
/storage/home/nitin/NAS/NPB3.3.1/NPB3.3-MPI/FT/ft.f:1772: undefined reference to `__VT_IntelExit'
ft.o: In function `verify':
/storage/home/nitin/NAS/NPB3.3.1/NPB3.3-MPI/FT/ft.f:1791: undefined reference to `__VT_IntelEntry'
/storage/home/nitin/NAS/NPB3.3.1/NPB3.3-MPI/FT/ft.f:1791: undefined reference to `__VT_IntelExit'
../common/randi8.o: In function `randlc':
/storage/home/nitin/NAS/NPB3.3.1/NPB3.3-MPI/common/randi8.f:1: undefined reference to `__VT_IntelEntry'
/storage/home/nitin/NAS/NPB3.3.1/NPB3.3-MPI/common/randi8.f:1: undefined reference to `__VT_IntelExit'
../common/randi8.o: In function `vranlc':
/storage/home/nitin/NAS/NPB3.3.1/NPB3.3-MPI/common/randi8.f:42: undefined reference to `__VT_IntelEntry'
/storage/home/nitin/NAS/NPB3.3.1/NPB3.3-MPI/common/randi8.f:42: undefined reference to `__VT_IntelExit'
../common/print_results.o: In function `print_results':
/storage/home/nitin/NAS/NPB3.3.1/NPB3.3-MPI/common/print_results.f:2: undefined reference to `__VT_IntelEntry'
/storage/home/nitin/NAS/NPB3.3.1/NPB3.3-MPI/common/print_results.f:2: undefined reference to `__VT_IntelExit'
../common/timers.o: In function `timer_clear':
/storage/home/nitin/NAS/NPB3.3.1/NPB3.3-MPI/common/timers.f:4: undefined reference to `__VT_IntelEntry'
/storage/home/nitin/NAS/NPB3.3.1/NPB3.3-MPI/common/timers.f:4: undefined reference to `__VT_IntelExit'
../common/timers.o: In function `timer_start':
/storage/home/nitin/NAS/NPB3.3.1/NPB3.3-MPI/common/timers.f:23: undefined reference to `__VT_IntelEntry'
/storage/home/nitin/NAS/NPB3.3.1/NPB3.3-MPI/common/timers.f:23: undefined reference to `__VT_IntelExit'
../common/timers.o: In function `timer_stop':
/storage/home/nitin/NAS/NPB3.3.1/NPB3.3-MPI/common/timers.f:43: undefined reference to `__VT_IntelEntry'
/storage/home/nitin/NAS/NPB3.3.1/NPB3.3-MPI/common/timers.f:43: undefined reference to `__VT_IntelExit'
../common/timers.o: In function `timer_read':
/storage/home/nitin/NAS/NPB3.3.1/NPB3.3-MPI/common/timers.f:65: undefined reference to `__VT_IntelEntry'
/storage/home/nitin/NAS/NPB3.3.1/NPB3.3-MPI/common/timers.f:65: undefined reference to `__VT_IntelExit'
make[1]: *** [../bin/ft.B.8] Error 1
make[1]: Leaving directory `/storage/home/nitin/NAS/NPB3.3.1/NPB3.3-MPI/FT'
make: *** [ft] Error 2

Why can't the linker find the VT function references? How can I resolve this error?

 

Thanks,

Nitin K Bhat,

SERC,

IISc

 

 

Nitin_Kundapur_B_

Hello,

I managed to solve this error by explicitly specifying the library flags in the environment.

In the list of libraries used for linking, I specified the following (note the added -lVT, which was missing from the failing link line):

-L/opt/intel/itac/9.0.3.049/lib -lVT -ldwarf -lelf -lvtunwind -lnsl -lm -ldl -lpthread

After I added this, the error went away and I was able to build my code with the instrumentation.

Thanks,

Nitin 

 

 

Ron_Green
Moderator

I found another way to build with -tcollect without having to use the -L and -l options above.

Add -tcollect to the compiler flags. Change your linker from ld to mpiicpc or mpiicc, and add -tcollect to the linker flags as well. The mpiicpc/mpiicc/mpiifort drivers recognize -tcollect when they are used as your linker, and the driver will bring in all the required libraries without you having to specify them with -L and -l options.

Also, don't use -lm, as this will bring in the system glibc libm. Intel has replacement functions for libm which are optimized for Intel processors. So just remove -lm and the drivers will bring in the Intel replacement libs.
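A minimal sketch of that approach for a single Fortran source file, assuming the Intel compiler, Intel MPI, and ITAC environments have been sourced. The file names `solver.f90`, `solver.o`, and `solver` are hypothetical placeholders, and the variable names simply mirror common makefile conventions.

```shell
# Hypothetical source file; substitute your own.
src=solver.f90

# -tcollect goes in both the compile flags and the link flags.
FFLAGS="-g -tcollect"
FLINKFLAGS="-g -tcollect"   # no explicit -L/-lVT and no -lm

if command -v mpiifort >/dev/null 2>&1 && [ -f "$src" ]; then
    # Compile: the compiler inserts the instrumentation probes
    # (the __VT_IntelEntry/__VT_IntelExit calls seen in the link errors).
    mpiifort $FFLAGS -c "$src" -o solver.o
    # Link through the MPI compiler driver rather than ld, so that
    # -tcollect lets the driver add the Trace Collector libraries.
    mpiifort $FLINKFLAGS -o solver solver.o
else
    echo "skipping build: mpiifort or $src not available" >&2
fi
```

In a makefile-based build such as the NAS Parallel Benchmarks, the same idea would mean putting -tcollect in both the compile-flag and link-flag variables and making sure the link step goes through the MPI compiler driver.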

 

 

 

 
