Intel® MPI Library

Unable to use PSM2 for OmniPath in mpi 2021.9.0

lhcamilo

Hi there, 

I have been struggling to get PSM2 enabled in our Intel oneAPI installation. Our nodes are connected with OmniPath, which is why I need PSM2 support enabled, though OPX would also be nice.

In any case I have done:

 

 

$ export I_MPI_FABRIC=ofi # and export I_MPI_FABRICS=shm:ofi
$ source vars.sh -i_mpi_ofi_internal=1
$ export I_MPI_OFI_PROVIDER=psm2
$ mpiexec -n 1 IMB-MPI1 -help
[0] MPI startup(): Intel(R) MPI Library, Version 2021.9  Build 20230307 (id: d82b3071db)
[0] MPI startup(): Copyright (C) 2003-2023 Intel Corporation.  All rights reserved.
[0] MPI startup(): library kind: release
[0] MPI startup(): libfabric version: 1.13.2rc1-impi
Abort(2139535) on node 0 (rank 0 in comm 0): Fatal error in PMPI_Init: Other MPI error, error stack:
MPIR_Init_thread(176)........: 
MPID_Init(1546)..............: 
MPIDI_OFI_mpi_init_hook(1518): 
open_fabric(2572)............: 
find_provider(2694)..........: OFI fi_getinfo() failed (ofi_init.c:2694:find_provider:No data available)

 

Additionally, from fi_info:

 

fi_info -l
psm2:
    version: 113.20
mlx:
    version: 1.4
psm3:
    version: 1104.1000
ofi_rxm:
    version: 113.20
verbs:
    version: 113.20
tcp:
    version: 113.20
sockets:
    version: 113.20
shm:
    version: 116.10
ofi_hook_noop:
    version: 113.20
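
Since `fi_info -l` lists psm2 while `fi_getinfo()` reports "No data available", it may help to check whether the provider can actually reach the hardware. The following is only a minimal sketch, not an official diagnostic: it assumes the Omni-Path HFI device nodes appear as `hfi1_<N>` under `/dev`, and the function names `count_hfi_devices` and `psm2_diagnose` are made up for illustration. `FI_LOG_LEVEL` and `fi_info -p` are standard libfabric options.

```shell
#!/bin/sh
# Sketch: basic checks before expecting the psm2 provider to initialise.
# Assumption: Omni-Path HFI device nodes are named hfi1_<N> under /dev.

# count_hfi_devices [DIR]
# Print how many hfi1_* entries exist in DIR (default: /dev).
count_hfi_devices() {
    dir="${1:-/dev}"
    count=0
    for node in "$dir"/hfi1_*; do
        # An unmatched glob stays literal, so confirm the entry exists.
        if [ -e "$node" ]; then
            count=$((count + 1))
        fi
    done
    echo "$count"
}

# psm2_diagnose (hypothetical helper)
# Report the device count, then ask libfabric for psm2 details with
# debug logging so that the reason a provider is skipped becomes visible.
psm2_diagnose() {
    echo "hfi1 device nodes: $(count_hfi_devices)"
    if command -v fi_info >/dev/null 2>&1; then
        FI_LOG_LEVEL=debug fi_info -p psm2 2>&1 | tail -n 40
    else
        echo "fi_info not found on PATH"
    fi
}
```

Running `psm2_diagnose` on a compute node (not only on the login node) would show whether any device nodes are visible there and what libfabric logs while probing psm2.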

 

 I would really appreciate some guidance. 

TobiasK

@lhcamilo Please try again with the latest Intel MPI version, 2021.13, which is included in oneAPI 2024.2.1.

Is the Mellanox UCX stack installed?
Why are you using 

source vars.sh -i_mpi_ofi_internal=1

Do you have another libfabric version installed and is that one present in your path?

Can you please show the output of 

which fi_info

 

lhcamilo

Hello Tobias, 

Thank you for your response. 

UCX/1.14.1 is installed. It may be worth noting that we are using EasyBuild.

> Why are you using 

source vars.sh -i_mpi_ofi_internal=1


I wanted to force the use of the libfabric (and psm2) libraries packaged with Intel MPI, rather than the ones installed on the system.

$ which fi_info 

$APPS/impi/2021.9.0-intel-compilers-2023.1.0/mpi/2021.9.0/libfabric/bin/fi_info

 

As far as I can tell, it is using the correct libfabric.
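
Note that `which fi_info` only shows which binary comes first on `PATH`; the shared library is resolved separately. As a rough cross-check, the sketch below prints the first directory on `LD_LIBRARY_PATH` that contains a `libfabric.so*` file. The function name `first_libfabric_dir` is hypothetical, and this is only an approximation: the dynamic loader also honours RPATH/RUNPATH and the ld.so cache, which this does not inspect.

```shell
#!/bin/sh
# Sketch: find the first LD_LIBRARY_PATH entry that provides libfabric.
# Approximation only: RPATH/RUNPATH and the ld.so cache are not inspected.

# first_libfabric_dir [SEARCH_PATH]
# Print the first directory in the colon-separated SEARCH_PATH
# (default: $LD_LIBRARY_PATH) containing libfabric.so*; return 1 if none.
first_libfabric_dir() {
    search_path="${1:-${LD_LIBRARY_PATH:-}}"
    old_ifs="$IFS"
    IFS=:
    for d in $search_path; do
        # Skip empty entries such as a leading or doubled colon.
        [ -n "$d" ] || continue
        for lib in "$d"/libfabric.so*; do
            if [ -e "$lib" ]; then
                IFS="$old_ifs"
                echo "$d"
                return 0
            fi
        done
    done
    IFS="$old_ifs"
    return 1
}
```

Calling `first_libfabric_dir` after sourcing `vars.sh` and comparing the result with the libfabric directory of the Intel MPI installation shown above would indicate whether a system libfabric is being found first.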

Thanks for the help 
