Intel® MPI Library

Intermittent Failure to Launch "mpiexec" with Intel MPI 2021.3.0

Hugo_Choe
Beginner

Dear Admin or Users,

As stated in the subject, mpiexec intermittently fails to launch with Intel MPI 2021.3.0. The error message is as follows:

May 23 07:33:00 2025 1005574 3 10.1 lsb_launch(): Failed while waiting for tasks to finish.
[mpiexec@myhost] wait_proxies_to_terminate (../../../../../src/pm/i_hydra/mpiexec/intel/i_mpiexec.c:537): downstream from host myhost exited with status 255
[mpiexec@myhost] main (../../../../../src/pm/i_hydra/mpiexec/mpiexec.c:2125): assert (pg->intel.exitcodes != NULL) failed
[mpiexec@myhost] HYD_sock_write (../../../../../src/pm/i_hydra/libhydra/sock/hydra_sock_intel.c:360): write error (Bad file descriptor)

This error occurs approximately once every three runs. Initially, I suspected the LSF deployment step, so I changed I_MPI_HYDRA_BOOTSTRAP to ssh. However, the same error still appears, except for the first line of the message above (the lsb_launch() line).

Is this a known issue? Any advice on how to resolve it would also be very helpful.

Best Regards,

Hugo
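
One way to put a number on the "once every three runs" rate, and to compare bootstrap settings side by side, is to repeat the same launch many times and count non-zero exit codes. The following is a minimal sketch, not an Intel-provided tool: the function name `measure_failure_rate`, the run count, and the sample `mpiexec` command line are assumptions for illustration. Only `I_MPI_HYDRA_BOOTSTRAP` comes from the post above.

```python
import os
import subprocess
import sys


def measure_failure_rate(cmd, runs=30, extra_env=None):
    """Run `cmd` `runs` times and collect every run that exits non-zero.

    Returns (failure_count, details), where details is a list of
    (run_index, return_code, stderr_tail) tuples.
    """
    env = dict(os.environ)
    env.update(extra_env or {})  # e.g. {"I_MPI_HYDRA_BOOTSTRAP": "ssh"}
    failed = []
    for i in range(runs):
        proc = subprocess.run(cmd, env=env, capture_output=True, text=True)
        if proc.returncode != 0:
            # Keep only the tail of stderr so long logs stay manageable.
            failed.append((i, proc.returncode, proc.stderr[-2000:]))
    return len(failed), failed


# Example use (hypothetical launch line; replace with the real job):
#   count, details = measure_failure_rate(
#       ["mpiexec", "-n", "4", "hostname"],
#       runs=30,
#       extra_env={"I_MPI_HYDRA_BOOTSTRAP": "ssh"},
#   )
#   print(f"{count} of 30 launches failed")
```

Running the same loop against 2021.2.0, 2021.3.0, and a current release would give a like-for-like failure count to report back in the thread.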

TobiasK
Moderator

@Hugo_Choe 
Does this error also occur with the latest version of Intel MPI, 2021.15?

Hugo_Choe
Beginner

@TobiasK 

I have not tested the latest version yet, but I will. For reference, the issue does not occur in version 2021.2.0.
However, since many users already rely on the installed version (2021.3.0), a patch for that version would be preferable if this is a known issue.

TobiasK
Moderator

@Hugo_Choe 
Sorry, but 2021.3 is way too old. If the issue still exists in 2021.16, we will investigate it.
