Intel® MPI Library
Get help with building, analyzing, optimizing, and scaling high-performance computing (HPC) applications.
2272 Discussions

Intel oneAPI MPI runtime 2021.6 -- mpiexec.hydra crashes when I_MPI_HYDRA_TOPOLIB=ipl

AlefRome
Beginner
3,029 Views

Hi All,

 

We would like to bring to your attention that the Intel MPI Library runtime (we tried both versions 2021.5 and 2021.6) crashes when mpiexec.hydra is run. The crash happens under the following conditions:

 

  • Operating System: CentOS 7, RHEL 7, RHEL 8, Oracle Linux 7 and Oracle Linux 8
  • the environment variable I_MPI_HYDRA_TOPOLIB is set to "ipl"
  • the crash occurs regardless of the number of processes (-np)
  • the crash occurs with any executable
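The conditions above can be turned into a small reproduction helper. This is only a sketch: the function names `build_repro` and `run_repro`, the default of two processes, and the use of `hostname` as the test executable are assumptions chosen for illustration. Only `mpiexec.hydra`, `-np`, and `I_MPI_HYDRA_TOPOLIB=ipl` come from the report.

```python
import os
import shutil
import subprocess


def build_repro(np=2, executable="hostname", topolib="ipl"):
    """Return (command, environment) for the reported crash scenario."""
    env = dict(os.environ)
    env["I_MPI_HYDRA_TOPOLIB"] = topolib  # the setting reported to trigger the crash
    cmd = ["mpiexec.hydra", "-np", str(np), executable]
    return cmd, env


def run_repro(np=2, executable="hostname"):
    """Run the scenario if mpiexec.hydra is on PATH; return its exit code, else None."""
    if shutil.which("mpiexec.hydra") is None:
        return None  # Intel MPI environment not loaded, nothing to run
    cmd, env = build_repro(np, executable)
    result = subprocess.run(cmd, env=env, capture_output=True, text=True)
    # On POSIX, a negative return code means the child was killed by a signal.
    return result.returncode


if __name__ == "__main__":
    print("exit code:", run_repro())
```

Running the script once with the 2021.x environment loaded and once with the 2018.4 environment loaded should make the difference in exit codes visible.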

 

Note: on the same virtual machines, the Intel MPI runtime 2018.4 works fine.

 

In the two archives attached to this thread, you can find the evidence for Oracle Linux 7 and Oracle Linux 8. In more detail, each .tar.gz file contains the following:

  • hwloc-ls.txt: output from the hwloc-ls command
  • lstopo.xml: the hwloc topology exported in XML format (so you can import it to reproduce in your lab)
  • lscpu.txt: output from the lscpu command
  • mpiexec-hydra-valgrind.txt: output of the mpiexec.hydra with the environment variable HYDRA_BSTRAP_VALGRIND=1
  • vgcore.XXXXXX: core dump produced by Valgrind after the crash
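For anyone who wants to gather similar evidence on their own machine, the topology files listed above could be collected with a script along these lines. It is a sketch under assumptions: the `collect` function and the `COMMANDS` table are hypothetical names, and the XML export relies on hwloc choosing the output format from the `.xml` file extension. Tools that are not installed are skipped rather than treated as errors.

```python
import shutil
import subprocess
from pathlib import Path

# (output file, tool, arguments, capture stdout into the output file?)
COMMANDS = [
    ("hwloc-ls.txt", "hwloc-ls", [], True),
    ("lscpu.txt", "lscpu", [], True),
    # Assumption: hwloc-ls writes XML itself when given a .xml file name.
    ("lstopo.xml", "hwloc-ls", ["lstopo.xml"], False),
]


def collect(outdir="."):
    """Run each diagnostic tool and return a dict of output file -> status."""
    out = Path(outdir)
    out.mkdir(parents=True, exist_ok=True)
    status = {}
    for outfile, tool, args, capture in COMMANDS:
        if shutil.which(tool) is None:
            status[outfile] = "skipped"  # tool not installed
            continue
        result = subprocess.run([tool, *args], cwd=out,
                                capture_output=True, text=True)
        if capture:
            (out / outfile).write_text(result.stdout)
        status[outfile] = "ok" if result.returncode == 0 else "failed"
    return status


if __name__ == "__main__":
    for name, state in collect("topology-evidence").items():
        print(f"{name}: {state}")
```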

 

Note: the reason we would like to use IPL as the topology library is that the hwloc implementation seems to be buggy and does not report the correct CPU topology. This leads to incorrect process pinning, where all processes are assigned to CPU #0.
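The pinning symptom can be checked from inside each launched process. The sketch below prints the set of CPUs the current process is allowed to run on. It assumes a Linux system (`os.sched_getaffinity` is not available elsewhere) and assumes the launcher exports the rank in the `PMI_RANK` environment variable; if it does not, the rank is shown as `?`. The function name `pinning_report` is hypothetical.

```python
import os


def pinning_report():
    """Describe which CPUs the calling process may run on (Linux only)."""
    # Assumption: the Hydra launcher exports the process rank as PMI_RANK.
    rank = os.environ.get("PMI_RANK", "?")
    cpus = sorted(os.sched_getaffinity(0))  # 0 means "the calling process"
    return f"rank {rank}: allowed CPUs {cpus}"


if __name__ == "__main__":
    print(pinning_report(), flush=True)
```

Saved as, say, `check_pinning.py` (a hypothetical file name) and launched with `mpiexec.hydra -np 4 python3 check_pinning.py`, every rank reporting only CPU 0 would match the behaviour described above.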

 

Many thanks in advance.

Best Regards.

Pietro.

Labels (1)
  • MPI

0 Kudos
1 Solution
DrAmarpal_K_Intel
Employee
2,186 Views

Hello! This issue has been fixed in the latest release of Intel MPI. hwloc (default) should also work as expected.



4 Replies
SantoshY_Intel
Moderator
2,983 Views

Hi,

 

Thank you for posting in Intel Communities.

 

We were able to reproduce your issue on our end using the Intel MPI Library 2021.6 on a CentOS 7 machine, as shown in the screenshot below:

SantoshY_Intel_0-1653307930793.png

We are working on your issue and will get back to you soon.

 

Thanks & Regards,

Santosh

SantoshY_Intel
Moderator
2,940 Views

Hi,


We have reported this issue to the concerned development team. They are looking into your issue.


Thanks & Regards,

Santosh


DrAmarpal_K_Intel
Employee
2,184 Views

This issue has been resolved and we will no longer respond to this thread. If you require additional assistance from Intel, please start a new thread. Any further interaction in this thread will be considered community only.


