Intel® MPI Library

Intermittent Failure to Launch "mpiexec" with Intel MPI 2021.3.0

Hugo_Choe
Beginner

Dear Admin or Users,

 

As stated in the subject, mpiexec intermittently fails to launch with Intel MPI 2021.3.0. The error message is as follows:

May 23 07:33:00 2025 1005574 3 10.1 lsb_launch(): Failed while waiting for tasks to finish.
[mpiexec@myhost] wait_proxies_to_terminate (../../../../../src/pm/i_hydra/mpiexec/intel/i_mpiexec.c:537): downstream from host myhost exited with status 255
[mpiexec@myhost] main (../../../../../src/pm/i_hydra/mpiexec/mpiexec.c:2125): assert (pg->intel.exitcodes != NULL) failed
[mpiexec@myhost] HYD_sock_write (../../../../../src/pm/i_hydra/libhydra/sock/hydra_sock_intel.c:360): write error (Bad file descriptor)

This error occurs approximately once every three runs. Initially, I suspected the LSF deployment step, so I changed I_MPI_HYDRA_BOOTSTRAP to ssh. However, the same error still appears, minus the first line of the message above (the lsb_launch() line).
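
To put a firmer number on "once every three runs", the launch could be repeated in a loop and the failures counted. The following is only a minimal sketch, assuming a Python wrapper around the launch command; the function name count_launch_failures is hypothetical and not part of Intel MPI. The only Intel MPI setting it touches is I_MPI_HYDRA_BOOTSTRAP, passed through the environment.

```python
import os
import subprocess
import sys


def count_launch_failures(cmd, runs, extra_env=None):
    """Run `cmd` `runs` times and collect details of every failed run.

    Returns a list of (run_index, exit_code, stderr_text) tuples, one per
    run whose exit code was non-zero. `extra_env` is merged on top of the
    current environment (e.g. {"I_MPI_HYDRA_BOOTSTRAP": "ssh"}).
    """
    env = dict(os.environ)
    env.update(extra_env or {})

    failures = []
    for i in range(runs):
        # capture_output keeps stderr so the Hydra messages can be inspected later
        result = subprocess.run(cmd, env=env, capture_output=True, text=True)
        if result.returncode != 0:
            failures.append((i, result.returncode, result.stderr))
    return failures


def failure_rate(failures, runs):
    """Fraction of runs that failed, as a float between 0 and 1."""
    return len(failures) / runs if runs else 0.0
```

For example, calling count_launch_failures(["mpiexec", "-n", "4", "./my_app"], runs=30, extra_env={"I_MPI_HYDRA_BOOTSTRAP": "ssh"}) would repeat the launch 30 times with the ssh bootstrap; the rank count and ./my_app are placeholders for the real job. Comparing the rate with and without the bootstrap override would show whether the launcher choice matters at all.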

Is this a known issue? Any advice on how to resolve it would also be very helpful.

 

Best Regards,

Hugo

3 Replies
TobiasK
Moderator

@Hugo_Choe 
Does this error also occur with the latest version of IMPI, 2021.15?

Hugo_Choe
Beginner

@TobiasK 

I have not tested the latest version yet, but the issue does not occur in the older version 2021.2.0. I will test the latest version.
However, since many users are already using the installed version (2021.3.0), a patch for that version would be preferable if this is a known issue.

TobiasK
Moderator

@Hugo_Choe 
Sorry, but 2021.3 is way too old. If the issue still exists in 2021.16, we will investigate it.
