Intel® MPI Library
Get help with building, analyzing, optimizing, and scaling high-performance computing (HPC) applications.

Failed to generate a trace file with mpirun

Zhoulong_J_Intel
Employee
1,162 views

Hi, 

My application is a Python program that calls MPI through mpi4py (built with Intel MPI), and it has to be killed while it is running (a full run takes a long time, so we only profile a small part of it). I run LD_PRELOAD=libVTfs.so mpirun -trace -n 2 python My application, but it does not generate an STF file as the documentation says. It only generates a folder containing stat-0.bin and stat-1.bin (file size 0). Is something wrong with my configuration? I have already sourced mpivars.sh and itacvars. Thanks very much!

1 Solution
James_S
Employee
1,162 views

Hi Zhoulong,

Applications that crash, or long-running applications that are stopped by the user (e.g. check-pointed), do not produce an Intel® Trace Analyzer and Collector trace file. Here are some ways to generate the trace file; could you please try one of them:

1. By preloading the failsafe library:

       mpirun -genv LD_PRELOAD libVTfs.so ...

   Alternatively:

       export LD_PRELOAD=libVTfs.so
       mpirun ...

2. By static linkage with the failsafe library (libVTfs)

3. Setting "DEADLOCK-TIMEOUT 10s" in the configuration file that the VT_CONFIG variable points to also works: if a deadlock is detected, the run is interrupted and a trace file is written
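
As a rough sketch of how options 1 and 3 could be combined from a Python launcher script: the helper below writes a small configuration file containing the DEADLOCK-TIMEOUT setting and builds an mpirun command line that preloads libVTfs.so. The function name build_failsafe_command, the file name vt.conf, and the script name my_app.py are made up for illustration, and the sketch assumes VT_CONFIG is read as the path to a configuration file. It only builds and prints the command; it has not been validated against a real Intel MPI installation.

```python
import os
import tempfile


def build_failsafe_command(script, nprocs=2, deadlock_timeout="10s", config_dir=None):
    """Build an mpirun command line that preloads the failsafe library.

    The function name and the vt.conf file name are illustrative only;
    they are not part of Intel MPI or Intel Trace Analyzer and Collector.
    """
    if config_dir is None:
        config_dir = tempfile.mkdtemp(prefix="itac_conf_")
    config_path = os.path.join(config_dir, "vt.conf")

    # Option 3: write the DEADLOCK-TIMEOUT setting into a configuration file.
    with open(config_path, "w") as handle:
        handle.write(f"DEADLOCK-TIMEOUT {deadlock_timeout}\n")

    # Option 1: preload libVTfs.so through -genv, and pass VT_CONFIG
    # (assumed here to hold the configuration file path) the same way.
    return [
        "mpirun",
        "-genv", "LD_PRELOAD", "libVTfs.so",
        "-genv", "VT_CONFIG", config_path,
        "-n", str(nprocs),
        "python", script,
    ]


command = build_failsafe_command("my_app.py")
print(" ".join(command))
# To launch for real (needs Intel MPI and the ITAC environment sourced):
#   import subprocess; subprocess.run(command, check=False)
```

Passing both variables with -genv keeps the settings on the mpirun command line, so they do not depend on what happens to be exported in the calling shell.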
 

Thanks,

Zhuowei


2 Replies
Zhoulong_J_Intel
Employee
1,162 views

Thanks, it worked.
