Dear all,
I am having a problem destroying an Intel MPI program; the original problem is described in this thread.
I am using impi/5.0.2.044/intel64, and my program is launched with "mpirun -machinefile mymachinefile ./myprogram".
I followed the suggestion to have the runtime execute "kill -<signal> <pid>", but it doesn't work for signals 1, 2, 9, or 15.
I have also tried setting I_MPI_DEBUG=5, but still no PIDs get printed.
Is there any environment variable I can use to get ALL PIDs related to the current launch, so I can send a kill signal to each process?
Or is there any setting that will ensure signals like a keyboard interrupt propagate to the related processes?
Hello Kin Fai,
You can try 'mpirun -cleanup' or the I_MPI_HYDRA_CLEANUP environment variable. With this option, the list of MPI processes is saved to a file, and the processes can then be cleaned up with the 'mpicleanup' utility. I suppose there may be some limitations for spawned processes. See the Intel® MPI Library for Linux* OS Reference Manual for details.
Regarding:
"Or is there any setting that will ensure signals like a keyboard interrupt propagate to the related processes?"
As far as I know, signals should be propagated by default. Are you seeing problems with the propagation?
Regarding the incorrect signal propagation: I've reproduced this with mpirun and will submit an internal ticket to get it fixed.
In the meantime, you can use the mpiexec.hydra launcher instead of mpirun; signals should propagate correctly there.
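For anyone launching from Java, the suggestion above could be sketched roughly as follows. The idea is to start the launcher binary directly through ProcessBuilder, without an intermediate shell, so that the signal sent by destroy() reaches the launcher itself. The class name DirectLaunch and its helper methods are made up for illustration, and whether mpiexec.hydra then forwards the signal to every rank is only what the reply above suggests; it has not been verified here.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of Intel MPI or of the original posts.
class DirectLaunch {
    // Start the command without a shell in between, so the returned
    // Process handle refers to the launcher itself.
    static Process start(List<String> command) throws IOException {
        return new ProcessBuilder(command).inheritIO().start();
    }

    // Ask the process to exit, wait up to graceMillis, and force-kill it
    // if it is still running. Returns true once the process has exited.
    static boolean stop(Process p, long graceMillis) throws InterruptedException {
        p.destroy(); // typically SIGTERM on Linux JVMs (implementation dependent)
        if (p.waitFor(graceMillis, TimeUnit.MILLISECONDS)) {
            return true;
        }
        p.destroyForcibly(); // typically SIGKILL on Linux JVMs
        return p.waitFor(graceMillis, TimeUnit.MILLISECONDS);
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.out.println("usage: java DirectLaunch <launcher> [args...]");
            return;
        }
        Process p = start(Arrays.asList(args));
        Thread.sleep(10000); // let the job run for a while
        System.out.println("stopped: " + stop(p, 5000));
    }
}
```

It could be run as, for example, "java DirectLaunch mpiexec.hydra -np 6 ./a.out". Note that a force-kill of the launcher gives it no chance to forward anything, so the grace period before destroyForcibly() is there on purpose.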
For your reference, here is some toy Java code you can check. I've noticed the same issue when using Python.
The Process returned from the runtime is actually a handle to something like "sh /$PATH_TO_MPIRUN/mpirun -np 6 ./a.out", so p.destroy() actually destroys the shell, and mpirun may not be notified. However, when pressing CTRL+C in a terminal running mpirun, the interrupt does get propagated.
Anyway, thanks for your help. I've solved my problem with the workaround shown below.
package test;

import java.io.IOException;
import java.lang.reflect.Field;

public class TestMain {
    public static void main(String[] args) throws IOException, InterruptedException {
        Process p = Runtime.getRuntime().exec("mpirun -cleanup -tmpdir ./ -np 6 ./a.out");
        int pid = 0;
        // Read the private "pid" field of the launched process via reflection
        if (p.getClass().getName().equals("java.lang.UNIXProcess")) {
            try {
                Field f = p.getClass().getDeclaredField("pid");
                f.setAccessible(true);
                System.out.println(pid = f.getInt(p));
            } catch (Throwable e) {
            }
        }
        Thread.sleep(10000);
        // p.destroy(); // Doesn't work
        Runtime.getRuntime().exec("mpicleanup -i mpiexec_kftse_" + pid + ".log"); // worked
    }
}
#include <mpi.h>

int main(int argc, char* argv[]) {
    MPI_Init(&argc, &argv);
    while( true ){ }
    MPI_Finalize();
}
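A side note for readers on Java 9 or later: Process.pid() returns the PID without the reflection trick above, and Process.descendants() lists the child processes that the JVM can see. The sketch below uses them to kill the shell wrapper together with everything it started. The class name ProcessTreeKill and its methods are made up for illustration. It assumes the ranks are still descendants of the launched process on the local machine; ranks on other nodes of a machinefile, or processes that have detached, would not be covered, so mpicleanup probably remains the right tool for those.

```java
import java.util.List;
import java.util.concurrent.TimeUnit;
import java.util.stream.Collectors;

// Hypothetical helper, not part of Intel MPI or of the original posts. Requires Java 9+.
class ProcessTreeKill {
    // Snapshot of the PIDs of all live local descendants of the process.
    static List<Long> descendantPids(Process p) {
        return p.descendants().map(ProcessHandle::pid).collect(Collectors.toList());
    }

    // Force-kill every local descendant found in a snapshot, then the
    // process itself. Returns the number of descendants that were signalled.
    static int killTree(Process p) throws InterruptedException {
        List<ProcessHandle> children = p.descendants().collect(Collectors.toList());
        children.forEach(ProcessHandle::destroyForcibly);
        p.destroyForcibly();
        p.waitFor(5, TimeUnit.SECONDS); // give the JVM time to reap the process
        return children.size();
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.out.println("usage: java ProcessTreeKill <command> [args...]");
            return;
        }
        Process p = new ProcessBuilder(args).inheritIO().start();
        Thread.sleep(10000); // let the job run for a while
        System.out.println("launcher pid: " + p.pid());
        System.out.println("descendant pids: " + descendantPids(p));
        System.out.println("descendants killed: " + killTree(p));
    }
}
```

The descendants are collected before anything is killed because children that lose their parent may be re-parented and would then no longer show up under the launched process.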
Best,
Kin Fai