Intel® MPI Library
Get help with building, analyzing, optimizing, and scaling high-performance computing (HPC) applications.

Erroneous [pmi_proxy] <defunct> left behind

Nils_M_
Beginner
552 Views

My application makes heavy use of MPI_Comm_spawn calls to dynamically create and abandon processes.

I am using Intel(R) MPI Library for Linux* OS, Version 4.1 Update 1 Build 20130522 on a Linux Cluster environment.

Each call to MPI_Comm_spawn unfortunately leaves a

 [pmi_proxy] <defunct>

process behind, even when the subprocess has finished normally. These processes are only killed when the whole application finishes, and they do not take up any resources while they linger. However, since I make about 2000 MPI_Comm_spawn calls, they can become a serious and hard-to-detect problem once the OS reaches its file handle limit.
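For what it's worth, the <defunct> entries are ordinary Unix zombie processes: a child has exited, but its parent (here, the hydra launcher) has not yet called wait() to reap it, so only the process-table entry remains. A minimal illustration of that mechanism, not specific to Intel MPI (assumes Linux and its /proc filesystem):

```python
import subprocess
import time

# Spawn a short-lived child and deliberately do not reap it.
child = subprocess.Popen(["true"])
time.sleep(0.5)  # give the child time to exit

# The child is now <defunct>: /proc/<pid>/stat reports state 'Z'
# until the parent calls wait() to collect its exit status.
with open(f"/proc/{child.pid}/stat") as f:
    state = f.read().split(")")[-1].split()[0]
print(state)  # 'Z' while unreaped

# Reaping the child removes the <defunct> entry from the process table.
child.wait()
```

The same logic applies to the pmi_proxy children: they cost no CPU or memory, but each one holds a process-table slot until the parent reaps it.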

Searching the web turns up related reports on the MPICH bug tracker, namely tickets 670 and 1504 (the spam filter prevents me from posting convenient links), and on the MPICH discussion board:

http://lists.mpich.org/pipermail/discuss/2013-March/000515.html

Could this still be an issue in the Hydra process manager used by Intel MPI?

Thank you very much for your help!

2 Replies
Dmitry_S_Intel
Moderator

Hi,

Thank you for the message.

Please submit a ticket for this issue on Intel(R) Premier Support.

--

Dmitry

okkebas
Beginner

It seems this issue still persists in Intel MPI 5.0.3.048.

Are there any workarounds for this issue? I am also spawning many MPI processes dynamically, and the leftover processes will eventually hit the ulimit -u limit.
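Until there is a fix, one stopgap is to monitor how many defunct proxies have accumulated relative to the per-user process limit (ulimit -u), so the job can be restarted before the limit is hit. A hypothetical Linux-only sketch that scans /proc for zombies by command name (the function name and default are my own, not an Intel MPI API):

```python
import os

def count_defunct(name="pmi_proxy"):
    """Count zombie (<defunct>) processes whose command name matches."""
    n = 0
    for pid in os.listdir("/proc"):
        if not pid.isdigit():
            continue
        try:
            with open(f"/proc/{pid}/stat") as f:
                stat = f.read()
        except OSError:
            continue  # process exited between listdir and open
        # /proc/<pid>/stat is "pid (comm) state ..."; comm may contain spaces.
        comm = stat.split("(", 1)[1].rsplit(")", 1)[0]
        state = stat.rsplit(")", 1)[1].split()[0]
        if comm == name and state == "Z":
            n += 1
    return n

print("defunct pmi_proxy processes:", count_defunct())
```

Comparing this count against the value of `ulimit -u` (e.g. via `os.sysconf` or the `resource` module) gives an early warning before spawning starts to fail.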

Thanks.

 
