We are running the NASA Overflow code on a large Linux cluster and have found that if the code calls MPI_ABORT, it does not terminate as
expected. We are running version 4.1.027 of Intel MPI under the Torque resource manager.
We see the same issue with PBS as the job scheduler and MPI version 5.0.3.048.
In short, the code calls MPI_ABORT, but the processes are not killed correctly, so the job hangs in the queue.
Is there a solution to this problem?