Intel® oneAPI HPC Toolkit

hydra hangs on defunct processes

Scarmozzino__Robert

Hi,

 

If I use mpiexec to run program A, which in turn starts program B in the background, mpiexec hangs after program A returns, until program B completes. For example, if program A is the following shell script:

programA

        programB &

 

and program B is the following shell script:

programB

        sleep 10

 

Then, even though programA returns, mpiexec does not return until sleep completes or is killed. In the meantime, programA shows up as defunct.

On the other hand, when I run programA directly from the shell, it does not show up as defunct.

What is mpiexec doing or waiting on that causes this, and is there any way around it?

 

 

PrasanthD_intel
Moderator

Hi Robert,


Could you please specify how you are launching programB from programA? Are you using MPI_COMM_SPAWN, or are you launching it through a shell script (that is, a shell script that starts a program which calls MPI_INIT)?


If possible, please provide a sample reproducer along with your command line; that would help us answer better.


Regards

Prasanth


Scarmozzino__Robert

The launch command is simply:

        mpiexec -np 1 programA

In this test case there is in fact no MPI program; I am just using the launcher to start programA, which is only a shell script. In the real application, programA is an actual MPI program that starts programB in the background, intending to leave it running after programA completes. There, all of the MPI work completes, but mpiexec keeps waiting and programA shows as defunct because of programB. I think the simple example I sent demonstrates the essence of the issue.

PrasanthD_intel
Moderator

Hi Robert,


Sorry for the delay.

We are not sure whether this is expected behaviour or not.

We are discussing this with the internal team and will get back to you after cross-checking with the MPI 3.1 standard.

Thanks for being patient.


Regards

Prasanth


PrasanthD_intel
Moderator

Hi Robert,


The behaviour when launching from inside a shell script is undefined, but since you are asking for a way around it, we are transferring this query to subject matter experts for better support.


Regards

Prasanth


Kevin_O_Intel1
Employee

We are investigating whether there is a workaround.


Kevin_O_Intel1
Employee

I have discussed this with the team here.

This seems like standard Linux process behavior; it is unclear whether there is a workaround.

Is there any way you could launch the executable instead of the script?
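
One piece of standard Linux behavior that can produce this symptom can be shown without MPI at all: a process reading from a pipe sees end-of-file only after every process holding the write end has exited, including background children that inherited it. The sketch below recreates programA and programB in a temporary directory and uses a plain `| cat` as a stand-in for a launcher that reads its child's output. Whether Hydra is actually waiting on inherited output pipes is an assumption that nothing in this thread confirms; the helper `elapsed_seconds` and the script `programA_detached` are hypothetical names introduced only for this illustration, and the sleep is shortened to 2 seconds.

```shell
#!/bin/sh
# Sketch: a pipe reader gets EOF only after every writer has exited.
workdir=$(mktemp -d)

# programB: the long-running background job (2 s here instead of 10 s).
cat > "$workdir/programB" <<'EOF'
#!/bin/sh
sleep 2
EOF

# programA: starts programB in the background and returns immediately.
# programB inherits programA's stdout and stderr.
cat > "$workdir/programA" <<'EOF'
#!/bin/sh
"$(dirname "$0")/programB" &
EOF

# programA_detached (hypothetical variant): same, but programB's
# stdin, stdout, and stderr are redirected away from the inherited ones.
cat > "$workdir/programA_detached" <<'EOF'
#!/bin/sh
"$(dirname "$0")/programB" </dev/null >/dev/null 2>&1 &
EOF

chmod +x "$workdir/programA" "$workdir/programB" "$workdir/programA_detached"

# elapsed_seconds CMD: run CMD with its stdout piped into cat (the
# stand-in for a launcher) and print the whole seconds that took.
elapsed_seconds() {
    start=$(date +%s)
    "$1" | cat
    end=$(date +%s)
    echo $((end - start))
}

detached=$(elapsed_seconds "$workdir/programA_detached")
inherited=$(elapsed_seconds "$workdir/programA")
echo "inherited stdout: ${inherited}s, redirected stdio: ${detached}s"

rm -rf "$workdir"
```

With the inherited pipe, `cat` returns only once programB's sleep has finished, even though programA exited right away; with the redirected variant it returns almost immediately. If the launcher behaves like the pipe reader here, applying the same redirection where programB is started might be worth trying with `mpiexec -np 1 programA`, but that is untested in this thread.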

