Intel® MPI Library

MPI segfaults with fabric other than shm

Bennet
New User

I am trying to get the Intel MPI Library installed as part of oneAPI 2024.2.1 to work. I have a Red Hat Enterprise Linux 9.4 installation, kernel version 5.14.0-427.42.1.el9_4.x86_64, with Mellanox OFED version MLNX_OFED_LINUX-24.10-2.1.8.0-rhel9.4-ext. The ibstat utility shows that mlx5_0 is active and using the InfiniBand link layer, and I can successfully run ibping and communicate with another node on the IB fabric.

These are my installation paths for oneAPI:

ONEAPI_ROOT=/gpfs1/sw/rh9/pkgs/oneapi/2024.2.1
I_MPI_ROOT=/gpfs1/sw/rh9/pkgs/oneapi/2024.2.1/mpi/2021.13
DPL_ROOT=/gpfs1/sw/rh9/pkgs/oneapi/2024.2.1/dpl/2022.6
CMPLR_ROOT=/gpfs1/sw/rh9/pkgs/oneapi/2024.2.1/compiler/2024.2

I set up the oneAPI environment with source /gpfs1/sw/rh9/pkgs/oneapi/2024.2.1/setvars.sh

I am using two very simple MPI programs to test: one is hello.c and the other does naive numerical integration. Both work. To stick to the simplest example, I compile hello.c with

$ mpiicx -o hello ../hello.c
$ mpirun -np 1 ./hello

===================================================================================
= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= RANK 0 PID 55105 RUNNING AT node304.cluster
= KILLED BY SIGNAL: 11 (Segmentation fault)
===================================================================================

If I do not set I_MPI_FABRICS, or set it to anything other than shm, I get that segfault. Running with shm works fine:

$ export I_MPI_FABRICS=shm
$ mpirun -np 1 ./hello
Hello world from processor node304.cluster, rank 0 out of 1 processors

$ mpirun -np 2 ./hello
Hello world from processor node304.cluster, rank 1 out of 2 processors
Hello world from processor node304.cluster, rank 0 out of 2 processors
etc.
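For reference, hello.c is just the standard MPI hello-world program; a minimal sketch of what I am running (my exact source may differ slightly, but it is nothing more than this):

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    /* Initialize the MPI environment */
    MPI_Init(&argc, &argv);

    /* Query the number of ranks and this process's rank */
    int world_size, world_rank;
    MPI_Comm_size(MPI_COMM_WORLD, &world_size);
    MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);

    /* Get the host name this rank is running on */
    char name[MPI_MAX_PROCESSOR_NAME];
    int name_len;
    MPI_Get_processor_name(name, &name_len);

    printf("Hello world from processor %s, rank %d out of %d processors\n",
           name, world_rank, world_size);

    MPI_Finalize();
    return 0;
}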

I have seen many different suggestions for how to get more information and to get things working. Setting I_MPI_DEBUG shows a bit more:

$ I_MPI_DEBUG=15 mpirun -np 1 ./hello 2>&1 | grep -v 'not supported'
[0] MPI startup(): Intel(R) MPI Library, Version 2021.13  Build 20240701 (id: 179630a)
[0] MPI startup(): Copyright (C) 2003-2024 Intel Corporation.  All rights reserved.
[0] MPI startup(): library kind: release
[0] MPI startup(): libfabric loaded: libfabric.so.1 
[0] MPI startup(): libfabric version: 1.20.1-impi
libfabric:57838:1758824622::core:core:ze_hmem_dl_init():524<warn> Failed to dlopen libze_loader.so
libfabric:57838:1758824622::core:core:ofi_hmem_init():612<warn> Failed to initialize hmem iface FI_HMEM_ZE: No data available
libfabric:57838:1758824622::core:core:ze_hmem_dl_init():524<warn> Failed to dlopen libze_loader.so
libfabric:57838:1758824622::core:core:ofi_hmem_init():612<warn> Failed to initialize hmem iface FI_HMEM_ZE: No data available
libfabric:57838:1758824622::core:core:ofi_register_provider():513<info> registering provider: verbs (120.10)
libfabric:57838:1758824622::core:core:ofi_register_provider():513<info> registering provider: verbs (120.10)
libfabric:57838:1758824622::core:core:ofi_register_provider():513<info> registering provider: tcp (120.10)
libfabric:57838:1758824622::core:core:ofi_register_provider():513<info> registering provider: shm (120.10)
libfabric:57838:1758824622::core:core:ze_hmem_dl_init():524<warn> Failed to dlopen libze_loader.so
libfabric:57838:1758824622::core:core:ofi_hmem_init():612<warn> Failed to initialize hmem iface FI_HMEM_ZE: No data available
libfabric:57838:1758824622::core:core:ofi_register_provider():513<info> registering provider: ofi_rxm (120.10)
libfabric:57838:1758824622::core:core:ofi_register_provider():513<info> registering provider: psm2 (120.10)
libfabric:57838:1758824622::psm3:core:fi_prov_ini():939<info> node304.cluster:rank0: build options: VERSION=706.0=7.6.0.0, HAVE_PSM3_src=1, PSM3_CUDA=0
libfabric:57838:1758824622::psm3:core:psmx3_param_get_bool():94<info> node304.cluster:rank0: variable FI_PSM3_NAME_SERVER=<not set>
libfabric:57838:1758824622::psm3:core:psmx3_param_get_bool():94<info> node304.cluster:rank0: variable FI_PSM3_TAGGED_RMA=<not set>
libfabric:57838:1758824622::psm3:core:psmx3_param_get_str():128<info> node304.cluster:rank0: read string var FI_PSM3_UUID=eae10000-0d4a-d644-a43f-0600f4993154
libfabric:57838:1758824622::psm3:core:psmx3_param_get_int():113<info> node304.cluster:rank0: read int var FI_PSM3_DELAY=0
libfabric:57838:1758824622::psm3:core:psmx3_param_get_int():109<info> node304.cluster:rank0: variable FI_PSM3_TIMEOUT=<not set>
libfabric:57838:1758824622::psm3:core:psmx3_param_get_int():109<info> node304.cluster:rank0: variable FI_PSM3_PROG_INTERVAL=<not set>
libfabric:57838:1758824622::psm3:core:psmx3_param_get_str():124<info> node304.cluster:rank0: variable FI_PSM3_PROG_AFFINITY=<not set>
libfabric:57838:1758824622::psm3:core:psmx3_param_get_int():113<info> node304.cluster:rank0: read int var FI_PSM3_INJECT_SIZE=32768
libfabric:57838:1758824622::psm3:core:psmx3_param_get_int():113<info> node304.cluster:rank0: read int var FI_PSM3_LOCK_LEVEL=0
libfabric:57838:1758824622::psm3:core:psmx3_param_get_bool():94<info> node304.cluster:rank0: variable FI_PSM3_LAZY_CONN=<not set>
libfabric:57838:1758824622::psm3:core:psmx3_param_get_int():109<info> node304.cluster:rank0: variable FI_PSM3_CONN_TIMEOUT=<not set>
libfabric:57838:1758824622::psm3:core:psmx3_param_get_bool():94<info> node304.cluster:rank0: variable FI_PSM3_DISCONNECT=<not set>
libfabric:57838:1758824622::psm3:core:psmx3_param_get_str():124<info> node304.cluster:rank0: variable FI_PSM3_TAG_LAYOUT=<not set>
libfabric:57838:1758824622::psm3:core:psmx3_param_get_bool():94<info> node304.cluster:rank0: variable FI_PSM3_YIELD_MODE=<not set>
libfabric:57838:1758824622::core:core:ofi_register_provider():513<info> registering provider: psm3 (706.0)
libfabric:57838:1758824622::core:core:ofi_register_provider():513<info> registering provider: mlx (1.4)
libfabric:57838:1758824622::core:core:ofi_register_provider():513<info> registering provider: ofi_hook_noop (120.10)
libfabric:57838:1758824622::core:core:ofi_register_provider():513<info> registering provider: off_coll (120.10)
libfabric:57838:1758824622::core:core:fi_getinfo_():1368<info> Found provider with the highest priority psm2, must_use_util_prov = 0
libfabric:57838:1758824622::core:core:fi_getinfo_():1437<info> Start regular provider search because provider with the highest priority psm2 can not be initialized

===================================================================================
=   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
=   RANK 0 PID 57838 RUNNING AT node304.cluster
=   KILLED BY SIGNAL: 11 (Segmentation fault)
===================================================================================

So, judging from the "registering provider" lines above, providers are being detected. I tried changing the provider by running

$ I_MPI_DEBUG=15 FI_PROVIDER=mlx mpirun -np 1 ./hello
    [ . . . . ]
libfabric:58832:1758825483::core:core:ofi_register_provider():513<info> registering provider: mlx (1.4)
libfabric:58832:1758825483::core:core:ofi_hmem_init():607<info> Hmem iface FI_HMEM_CUDA not supported
libfabric:58832:1758825483::core:core:ofi_hmem_init():607<info> Hmem iface FI_HMEM_ROCR not supported
libfabric:58832:1758825483::core:core:ofi_hmem_init():607<info> Hmem iface FI_HMEM_ZE not supported
libfabric:58832:1758825483::core:core:ofi_hmem_init():607<info> Hmem iface FI_HMEM_NEURON not supported
libfabric:58832:1758825483::core:core:ofi_hmem_init():607<info> Hmem iface FI_HMEM_SYNAPSEAI not supported
libfabric:58832:1758825483::core:core:ofi_register_provider():513<info> registering provider: ofi_hook_noop (120.10)
libfabric:58832:1758825483::core:core:ofi_register_provider():513<info> registering provider: off_coll (120.10)
libfabric:58832:1758825483::core:core:fi_getinfo_():1368<info> Found provider with the highest priority mlx, must_use_util_prov = 0
[0] MPI startup(): max_ch4_vnis: 1, max_reg_eps 64, enable_sep 0, enable_shared_ctxs 0, do_av_insert 0
[0] MPI startup(): max number of MPI_Request per vci: 67108864 (pools: 1)
libfabric:58832:1758825483::core:core:fi_getinfo_():1368<info> Found provider with the highest priority mlx, must_use_util_prov = 0
[0] MPI startup(): libfabric provider: mlx
libfabric:58832:1758825483::core:core:fi_fabric_():1665<info> Opened fabric: mlx

===================================================================================
=   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
=   RANK 0 PID 58832 RUNNING AT node304.cluster
=   KILLED BY SIGNAL: 11 (Segmentation fault)
===================================================================================

So it sees that I've asked for a different provider, reports "libfabric provider: mlx" and "Opened fabric: mlx", but it once again segfaults.

As mentioned above, this fails with any fabric other than shm. I will also note that the same binary produced by the mpiicx command above runs fine on nodes without an InfiniBand card.

$ mpirun ./hello 
Hello world from processor node253.cluster, rank 1 out of 4 processors
Hello world from processor node253.cluster, rank 0 out of 4 processors
Hello world from processor node254.cluster, rank 2 out of 4 processors
Hello world from processor node254.cluster, rank 3 out of 4 processors

This suggests to me that the source of the problem is something specific to the IB fabric providers and/or the installed software interacting (or not interacting) with Intel MPI.

Can someone help me both understand what the problem is, and figure out how to resolve it?

Thanks in advance,    -- bennet

 

Sergey_K_Intel3
Employee

Hi,

 

Thanks for the detailed issue report!

 

Can you please try reproducing the issue with a newer Intel MPI version, e.g. 2021.16 or 2021.16.1?

Also, other than I_MPI_DEBUG, do you have any other I_MPI_* environment variables defined?
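For example, something like this would list what is currently set in your environment:

$ env | grep -E '^(I_MPI_|FI_)'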

Finally, it would be useful to get a backtrace of the process that is segfaulting. For example, you could enable core file generation, run gdb </path/to/executable_that_crashes> </path/to/core_file>, and type the 'bt' command to generate the backtrace.
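Something along these lines (the core file name and location depend on your system's kernel.core_pattern setting, so the exact path is only illustrative):

$ ulimit -c unlimited                  # allow core files in the current shell
$ I_MPI_DEBUG=15 mpirun -np 1 ./hello  # reproduce the crash so a core file is written
$ gdb ./hello core.<pid>               # open the core file with the matching binary
(gdb) bt                               # print the backtrace of the crashed process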

 

Best regards,

Sergey
