Intel® oneAPI HPC Toolkit
Get help with building, analyzing, optimizing, and scaling high-performance computing (HPC) applications.

Failed to generate trace file with mpirun

Zhoulong_J_Intel
Employee
174 Views

Hi, 

My application is a Python program that calls MPI through mpi4py (built with Intel MPI). It runs for a long time and we only want to profile a small part of it, so it has to be killed while it is running. I ran LD_PRELOAD=libVTfs.so mpirun -trace -n 2 python <my application>, but it did not generate an STF file as the documentation says. It only generated a folder containing stat-0.bin and stat-1.bin (file size 0). Is something wrong with my configuration? I have already sourced mpivars.sh and itacvars. Thanks very much!

James_S
Employee
174 Views

Hi Zhoulong,

Applications that crash, and long-running applications that are stopped by the user (e.g. check-pointed), do not normally produce an Intel® Trace Analyzer and Collector trace file. Here are some ways to generate the trace file anyway; could you please try them:

1. By preloading the failsafe library:   mpirun -genv LD_PRELOAD libVTfs.so ...

    alternative:  export LD_PRELOAD=libVTfs.so
                  mpirun ...

2. By linking the application statically against the failsafe library (libVTfs)

3. Setting "DEADLOCK-TIMEOUT 10s" in the configuration file that the VT_CONFIG variable points to also works: if a deadlock is detected, the run is interrupted and a trace file is written
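
Since the application here is a Python program, options 1 and 3 can be combined in a small launcher. The following is only a sketch under assumptions: the helper name build_failsafe_trace_run, the file name vt.conf and the script name my_app.py are hypothetical, and it assumes mpivars.sh and itacvars have already been sourced so that mpirun and libVTfs.so can be found.

```python
import os
import tempfile


def build_failsafe_trace_run(script, nprocs=2, deadlock_timeout="10s", workdir=None):
    """Return (command, environment) for an mpirun launch with libVTfs.so preloaded.

    Hypothetical helper for illustration; it only prepares the launch,
    it does not start mpirun itself.
    """
    workdir = workdir or tempfile.mkdtemp(prefix="itac_")

    # Option 3: write DEADLOCK-TIMEOUT into a config file and point VT_CONFIG at it.
    config_path = os.path.join(workdir, "vt.conf")  # hypothetical file name
    with open(config_path, "w") as fh:
        fh.write(f"DEADLOCK-TIMEOUT {deadlock_timeout}\n")

    env = dict(os.environ)
    env["VT_CONFIG"] = config_path

    # Option 1: preload the failsafe library for every rank via -genv.
    cmd = [
        "mpirun",
        "-genv", "LD_PRELOAD", "libVTfs.so",
        "-n", str(nprocs),
        "python", script,
    ]
    return cmd, env


if __name__ == "__main__":
    cmd, env = build_failsafe_trace_run("my_app.py")  # hypothetical script name
    print(" ".join(cmd))
    print("VT_CONFIG=" + env["VT_CONFIG"])
```

The returned pair can be handed to subprocess.run(cmd, env=env). This assumes that mpirun forwards VT_CONFIG from the launching environment to the ranks; if it does not, the variable can be passed explicitly with another -genv argument.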
 

Thanks,

Zhuowei


Zhoulong_J_Intel
Employee
174 Views

thanks, it worked
