Dear IntelMPI experts,
There is an interesting option, I_MPI_HYDRA_CLEANUP (together with I_MPI_TMPDIR), documented, again for mpiexec.hydra, that is supposed to create a file with a list of PIDs; then, in case of problems, these can be fed to mpicleanup.
Somehow I've failed to get it working (using IMPI ver. 4.0.3.008). Does this option work? Could somebody give me an example of its usage? Thank you very much!
--
Grigory Shamov
HPC Analyst
University of Manitoba
1 Reply
Hi Grigory,
Keep in mind that the file created by enabling I_MPI_HYDRA_CLEANUP is deleted when the job ends, even if the job ends incorrectly. Its purpose is to provide a means of finding the job's processes if Hydra itself ends incorrectly while the processes are still running. The file is named
mpiexec_${username}_$PPID.log
PPID is the parent process PID. I_MPI_TMPDIR sets the directory in which this file is stored. An example of the contents of the file:
compute-0-0 13352 13355 13356 13357 13358 13359 13360 13361 13362 13363 13364 13365 13366 compute-0-1 11339 11342 11343 11344 11345
Each node name (compute-0-0 or compute-0-1 in this case) is followed by the PIDs of the processes running on that node.
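To inspect such a file by hand, the layout described above can be read with a few lines of Python. This is only a sketch under stated assumptions: the helper names `cleanup_log_path` and `parse_cleanup_log` are made up for illustration and are not part of Intel MPI, and the parser assumes that any non-numeric token is a node name and that the numeric tokens after it are PIDs on that node, which is what the example contents suggest.

```python
import os


def cleanup_log_path(tmpdir, username, ppid):
    """Build the file name mpiexec_${username}_$PPID.log inside tmpdir.

    tmpdir stands for the value of I_MPI_TMPDIR; the function name is
    hypothetical and only mirrors the naming pattern from the post.
    """
    return os.path.join(tmpdir, f"mpiexec_{username}_{ppid}.log")


def parse_cleanup_log(text):
    """Map each node name to the list of PIDs that follow it.

    Assumption: node names are never purely numeric, so a token made only
    of digits is a PID and anything else starts a new node entry.
    """
    nodes = {}
    current = None
    for token in text.split():
        if token.isdigit():
            if current is None:
                raise ValueError("PID found before any node name")
            nodes[current].append(int(token))
        else:
            current = token
            nodes.setdefault(current, [])
    return nodes


if __name__ == "__main__":
    # Sample contents copied from the example in this thread.
    sample = (
        "compute-0-0 13352 13355 13356 13357 13358 13359 13360 13361 "
        "13362 13363 13364 13365 13366 compute-0-1 11339 11342 11343 "
        "11344 11345"
    )
    for node, pids in parse_cleanup_log(sample).items():
        print(node, pids)
```

Because the parser splits on any whitespace, it gives the same result whether the node entries are on one line or on separate lines.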
Sincerely,
James Tullos
Technical Consulting Engineer
Intel Cluster Tools