Hi,
I am working with an MPI application that hangs after about 1 hour of running.
Here is an example of how the application is launched:
mpirun -np 32 -ppn 4 ./wrf.exe
I do not yet know which line of code the application hangs on, so I cannot set breakpoints in advance. The compute nodes also don't have X11. Because the simulation gets stuck only after ~1 hour and the source codebase is very large, it is very difficult to debug this issue with
mpirun -gdb -np 32 -ppn 4 ./wrf.exe
I have a debug build (-g) of the executables. Is there a way to attach to the hung MPI ranks/processes and check the current source code location of each rank?
Could you please share the sequence of commands needed to find the problematic source code location, if possible?
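For reference, here is a minimal sketch of one common way to do what is being asked: attach gdb in batch mode to every running copy of the executable on a node and save a backtrace of all threads. This is not a procedure confirmed anywhere in this thread. It assumes gdb is installed on the compute nodes, that you own the processes, and that the system allows attaching with ptrace. The function name `dump_backtraces` and the output file naming are hypothetical.

```shell
#!/bin/sh
# Sketch: dump a backtrace from every process with the given name on this node.
# Assumptions: gdb is available, you own the processes, ptrace attach is allowed.

dump_backtraces() {
    exe_name="$1"        # process name to look for, e.g. wrf.exe
    out_dir="${2:-.}"    # directory for the per-process logs (hypothetical layout)

    # pgrep -x matches the process name exactly and prints one PID per line.
    pids=$(pgrep -x "$exe_name")
    if [ -z "$pids" ]; then
        echo "no running process named $exe_name on $(uname -n)" >&2
        return 1
    fi

    for pid in $pids; do
        # -batch runs the given commands and then exits, which detaches gdb
        # and leaves the process running.
        gdb -p "$pid" -batch -ex "thread apply all bt" \
            > "$out_dir/bt.$(uname -n).$pid.txt" 2>&1
    done
}

# Example (run on each node of the job once the application appears stuck):
# dump_backtraces wrf.exe /tmp
```

Because the executables were built with -g, each backtrace should include source file names and line numbers. Comparing the files across processes may show which ranks are waiting inside an MPI call and which are not.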
Hi,
Thanks for reaching out to us.
Could you please let us know the version of Intel MPI you are using and your OS details?
Could you also provide the output of the command below?
I_MPI_DEBUG=30 mpirun -check-mpi -np 32 -ppn 4 ./wrf.exe
Thanks & Regards
Shivani
Hi,
As we haven't heard back from you, could you please provide the details requested in my previous post so that we can investigate your issue further?
Thanks & Regards
Shivani
Hi,
I have not heard back from you. This thread will no longer be monitored by Intel. If you need further assistance, please post a new question.
Thanks & Regards
Shivani