I am relatively new to MPI programming.
I am wondering how I can start each process manually so that they all end up in the same MPI_COMM_WORLD communicator.
In addition, when a single process fails, how can this be detected and the process relaunched automatically, without crashing the other processes and the host?
I hope somebody can advise on these two issues. Thanks.
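If you want to start MPI processes from within your program, use the MPI_Comm_spawn() call. Note that the spawned processes get their own MPI_COMM_WORLD and are connected to the parent through an intercommunicator, not added to the parent's MPI_COMM_WORLD. Refer to the MPI standard for more details.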
Fault tolerance is not required by the current MPI standard, so check the documentation of the particular MPI implementation you use to see whether it provides any such functionality.