Intel® MPI Library
Get help with building, analyzing, optimizing, and scaling high-performance computing (HPC) applications.

hydra hangs on defunct processes

Scarmozzino__Robert

Hi,

 

If I use mpiexec to run program A, which in turn starts program B in the background, mpiexec hangs after program A returns and only exits once program B completes. For example, if program A is the following shell script:

programA

        programB &

 

and program B is the following shell script:

programB

        sleep 10

 

Then, even though programA returns, mpiexec does not return until sleep completes or is killed. In the meantime, programA shows up as defunct.

By contrast, when programA is run directly from the shell, it does not show up as defunct.

What is mpiexec doing or waiting on that causes this, and is there any way around it?

PrasanthD_intel
Moderator

Hi Robert,


Could you please specify how you were launching programB from programA? Were you using MPI_COMM_SPAWN, or launching it through a shell script (where the shell script starts a program that calls MPI_INIT)?


If possible, please provide a sample reproducer along with your command line; that would help us answer better.


Regards

Prasanth


Scarmozzino__Robert

The launch command is simply:

        mpiexec -np 1 programA

In this test case, there is in fact no MPI program. I am just using the launcher to spawn programA, which is a plain shell script. In the real application, programA is an actual MPI program that starts programB in the background and is meant to leave it running when programA itself completes. There, all of the MPI work finishes, but mpiexec keeps waiting and programA shows as defunct because of programB. I think the simple example I sent demonstrates the essence of the issue.

PrasanthD_intel
Moderator

Hi Robert,


Sorry for the delay.

We are not sure whether this is expected behaviour or not.

We are discussing it with the internal team and will get back to you after cross-checking against the MPI 3.1 standard.

Thanks for being patient.


Regards

Prasanth


PrasanthD_intel
Moderator

Hi Robert,


The behaviour when launching inside a shell script is undefined. However, since you are asking for a way around it, we are transferring this query to our subject matter experts for better support.


Regards

Prasanth


Kevin_O_Intel1
Employee

We are investigating whether there is a workaround.


Kevin_O_Intel1
Employee

I have discussed this with the team here.

This seems like standard Linux process behavior, and it is unclear whether there is a workaround.

Is there any way you could launch the executable directly instead of the script?
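
For readers trying to narrow this down: one general Linux pattern that produces this kind of hang, which may or may not be what Hydra is doing here, is that a backgrounded child inherits the stdout/stderr pipes of its parent, so whatever is reading those pipes does not see end-of-file until the child exits. The sketch below reproduces that pattern without MPI, as an assumption to test rather than a confirmed diagnosis. It uses `cat` as a stand-in for a launcher that reads its child's output; the helper name `run_timed` is made up for illustration.

```shell
#!/bin/sh
# Sketch only: `cat` stands in for a launcher that reads its child's stdout.
# run_timed is a hypothetical helper, not part of any MPI tool.

# Run a one-line "programA" under sh, pipe its stdout to a reader,
# and print how many whole seconds the reader had to wait for EOF.
run_timed() {
    start=$(date +%s)
    sh -c "$1" | cat >/dev/null
    end=$(date +%s)
    echo $((end - start))
}

# Case 1: the background sleep inherits the pipe, so the reader
# waits until sleep exits even though the script itself returned.
inherited=$(run_timed 'sleep 2 &')

# Case 2: the background sleep has all standard streams redirected,
# so the reader sees EOF as soon as the script returns.
detached=$(run_timed 'sleep 2 </dev/null >/dev/null 2>&1 &')

echo "inherited stdio: waited ${inherited}s"
echo "redirected stdio: waited ${detached}s"
```

If the same mechanism applies under mpiexec, changing the line in programA to `programB </dev/null >/dev/null 2>&1 &` would be the corresponding experiment. Whether that is sufficient for Hydra is not confirmed in this thread.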

