Intel® MPI Library

hydra hangs on defunct processes

Scarmozzino__Robert

Hi,

 

If I use mpiexec to run program A, which in turn runs program B in the background, when program A returns, mpiexec hangs until program B completes.  For example, if program A is the following shell script:

programA

        programB &

 

and program B is the following shell script:

programB

        sleep 10

 

Then even though programA returns, mpiexec does not, until sleep completes or is killed.  programA shows up as defunct.

On the other hand, when I run programA directly from the shell, it does not show up as defunct.

What is mpiexec doing or waiting on that causes this, and is there any way around it?
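
One possible explanation, offered as an assumption rather than anything confirmed in this thread: a launcher that captures a child's output through a pipe usually waits for end-of-file on that pipe, and a background process that inherited the pipe keeps it open. The sketch below shows that effect with plain shell only. `cat` stands in for the launcher and `sleep 2` for programB; it is not Hydra's actual code.

```shell
#!/bin/sh
# Illustration only: `cat` stands in for any launcher that reads a child's
# output through a pipe. No MPI is involved.

t0=$(date +%s)
# Direct run: the inner shell exits at once; the background sleep keeps
# running, but nothing waits for it.
sh -c 'sleep 2 &'
t1=$(date +%s)
# Piped run: the inner shell still exits at once, but the background sleep
# has inherited the write end of the pipe, so cat sees no end-of-file
# until sleep exits.
sh -c 'sleep 2 &' | cat
t2=$(date +%s)

direct=$((t1 - t0))
piped=$((t2 - t1))
echo "direct run returned after ${direct}s, piped run after ${piped}s"
```

On a typical Linux system the direct run should return almost immediately, while the piped run should take about as long as the sleep.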

 

 

PrasanthD_intel
Moderator

Hi Robert,


Could you please specify how you are launching programB from programA? Are you using MPI_COMM_SPAWN, or are you launching it through a shell script (that is, a shell script that starts a program which calls MPI_INIT)?


If possible, please provide a sample reproducer along with your command line. That would help us give a better answer.


Regards

Prasanth


Scarmozzino__Robert

The launch command is simply:

        mpiexec -np 1 programA

In this test case, there is in fact no MPI program; I am just using the launcher to start programA, which is just a shell script.  In the real application, programA is an actual MPI program that starts programB in the background and intends to leave it running after programA completes.  There, all of the MPI work completes, but mpiexec keeps waiting and programA shows as defunct because of programB.  I think the simple example I sent demonstrates the essence of the issue.

PrasanthD_intel
Moderator

Hi Robert,


Sorry for the delay.

We are not sure whether this is expected behaviour.

We are discussing it with the internal team and will get back to you after cross-checking against the MPI 3.1 standard.

Thanks for being patient.


Regards

Prasanth


PrasanthD_intel
Moderator

Hi Robert,


The behaviour when launching through a shell script is undefined. Since you are asking for a way around it, we are transferring this query to our subject matter experts for better support.


Regards

Prasanth


Kevin_O_Intel1
Employee

We are investigating whether there is a workaround.


Kevin_O_Intel1
Employee

I have discussed this with the team here.

This seems like standard Linux process behavior, and it is unclear whether there is a workaround.

Is there any way you could launch the executable directly instead of the script?
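
If the wait really is caused by the background process holding the launcher's stdin, stdout, or stderr (an assumption, not something confirmed in this thread), one thing worth trying is to start programB with all three redirected away from the inherited descriptors. The following is a minimal sketch; `launch_detached` is a hypothetical helper name, not part of Intel MPI or Hydra, and `sleep 2` stands in for programB.

```shell
#!/bin/sh
# launch_detached is a hypothetical helper, not part of Intel MPI or Hydra.
# Usage: launch_detached LOGFILE COMMAND [ARGS...]
launch_detached() {
    log=$1
    shift
    # Redirect stdin, stdout, and stderr so the background command holds
    # none of the caller's descriptors; nohup makes it ignore SIGHUP.
    nohup "$@" >"$log" 2>&1 </dev/null &
}

t0=$(date +%s)
# $(...) reads the subshell's output through a pipe, much like a launcher
# that captures output. `sleep 2` stands in for programB.
result=$(launch_detached /dev/null sleep 2; echo "programA finished")
t1=$(date +%s)

elapsed=$((t1 - t0))
echo "$result after ${elapsed}s"
```

In a real programA the call would look something like `launch_detached programB.log ./programB`, where the log file name is only an example. Whether this is enough for mpiexec to return would have to be checked against the actual Intel MPI version in use.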

