Intel® oneAPI HPC Toolkit

mpiexec.hydra from Intel MPI 2018/2021 does not start on RHEL 8.6

B_J
Beginner

After upgrading the OS to RHEL 8.6, we found that mpiexec.hydra from older Intel MPI versions no longer works.
It does not start at all: `mpiexec.hydra -n 2 hostname` just hangs. Note that no MPI application is involved here.
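
To tell a real hang apart from a slow start, the command can be run under a timeout. The following is only a minimal sketch: the helper name `finishes_within` and the 30-second default are assumptions made for illustration, not part of Intel MPI.

```python
import subprocess

def finishes_within(cmd, timeout_s=30.0):
    """Return True if cmd exits within timeout_s seconds, False if it timed out.

    subprocess.run() kills the launched process when the timeout expires.
    Helper processes that the launcher started itself may survive and
    might need to be cleaned up by hand.
    """
    try:
        subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        return False
    return True
```

For example, `finishes_within(["mpiexec.hydra", "-n", "2", "hostname"])` would be expected to return `False` on a node that shows the hang described here. If the command is not on `PATH`, a `FileNotFoundError` is raised instead.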

We see this hang with Intel MPI 2018 and 2021, while Intel MPI 2022 works fine.

We need to use the older Intel compilers for legacy support. Is there any way to make the older versions of mpiexec.hydra work?

 

Any comments are appreciated. 

 

B.

5 Replies
SantoshY_Intel
Moderator

Hi,


Thank you for posting in Intel Communities.


Thanks for reporting this issue. We were able to reproduce it and we have informed the development team about it.


Thanks & Regards,

Santosh


B_J
Beginner

Hi Santosh,

 

Have you received any update from the development team?

We are still trying to get mpiexec.hydra to work, so far without success.

Any comments are appreciated.

 

Best regards,

 

B.

CyfronetHPCTeam
Beginner

Hello,

We hit the same problem after updating the EL8 kernels on our cluster.

The problem is related to a change in the file size reported for:
/sys/devices/system/node/node*/cpulist
It used to be 4096; since the update applied in kernel 4.18.0-359.el8, it is reported as 0.

This causes mpiexec (hydra_proxy) to go into an infinite loop.
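
A quick way to see what a given node reports is to stat the files and read them. This is a minimal sketch; the function name `cpulist_report` and its `node_root` parameter are assumptions made for illustration (the parameter only exists so the function can be pointed at a different directory).

```python
import glob
import os

def cpulist_report(node_root="/sys/devices/system/node"):
    """Return {path: (reported_size, content)} for each node*/cpulist file.

    reported_size is st_size from stat(), i.e. the value that changed
    from 4096 to 0. content is what the file actually holds when read.
    """
    report = {}
    pattern = os.path.join(node_root, "node*", "cpulist")
    for path in sorted(glob.glob(pattern)):
        reported_size = os.stat(path).st_size
        with open(path) as fh:
            content = fh.read().strip()
        report[path] = (reported_size, content)
    return report

def print_report(node_root="/sys/devices/system/node"):
    """Print one line per cpulist file: path, reported size, content."""
    for path, (size, content) in cpulist_report(node_root).items():
        print(f"{path}: st_size={size} content={content!r}")
```

Calling `print_report()` on a node with the newer kernel should show `st_size=0` next to a non-empty CPU list, while an older kernel should show `st_size=4096`.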

The problematic commit is:
* Mon Jan 10 2022 Augusto Caringi <acaringi@redhat.com> [4.18.0-359.el8] - drivers/base/node.c: use bin_attribute to break the size limitation of cpumap ABI (Phil Auld) [1920645]
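
Going by the build number in this changelog entry, a node's kernel release string can be compared against 359 to guess whether it carries the change. This is only a sketch: the function names are made up for illustration, the regular expression assumes the usual EL8 naming (`4.18.0-<build>...`), and a custom kernel with the change reverted would be misreported.

```python
import platform
import re

# Build number taken from the changelog entry: 4.18.0-359.el8
FIRST_BUILD_WITH_CHANGE = 359

def el8_build_number(release):
    """Extract <build> from a '4.18.0-<build>...' release string, else None."""
    match = re.match(r"4\.18\.0-(\d+)", release)
    return int(match.group(1)) if match else None

def has_cpulist_size_change(release=None):
    """Return True/False for EL8-style kernels, None if the string is not recognized.

    With release=None, the running kernel's release string is used.
    """
    if release is None:
        release = platform.release()
    build = el8_build_number(release)
    if build is None:
        return None
    return build >= FIRST_BUILD_WITH_CHANGE
```

For instance, `has_cpulist_size_change("4.18.0-372.9.1.el8.x86_64")` returns `True` and `has_cpulist_size_change("4.18.0-305.el8.x86_64")` returns `False`; both strings are illustrative inputs. The result is a hint only, and checking the reported size of the cpulist files directly remains the more reliable test.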

We applied a kernel patch that reverts to the old behavior:

https://lore.kernel.org/lkml/CAGsJ_4yb5Z3msMgXRZpSXLFiysQdJq-n_p9B6d-p2t_-_UHhVQ@mail.gmail.com/T/#u

and rebuilt the kernel RPMs. Reverting to an older kernel is also an option if security updates are not a major concern for you (e.g. on a well-isolated cluster).

I hope that helps.

Best regards


B_J
Beginner

Hi,

 

Thank you for sharing the detailed information.

Based on your information, we contacted RHEL support, but it looks like there is no clear solution at the moment, and using a customized kernel might not be allowed.

For now, we plan to reinstall RHEL 8.4 from scratch.
We will update the community if a patch becomes available from the RHEL side.

 

Best regards,

 

B.

SantoshY_Intel
Moderator

Hi,


The Intel developers are still working on this issue. I will post in the community forum as soon as there is an update.


Thanks & Regards,

Santosh

