Intel® MPI Library

Unable to use PSM2 for OmniPath in mpi 2021.9.0

lhcamilo
Beginner

Hi there, 

I have been struggling to get PSM2 enabled in our Intel oneAPI installation. Our nodes are connected with Omni-Path, which is why I need PSM2 support enabled, though OPX would also be nice.

In any case, I have done the following:

$ export I_MPI_FABRIC=ofi # and export I_MPI_FABRICS=shm:ofi
$ source vars.sh -i_mpi_ofi_internal=1
$ export I_MPI_OFI_PROVIDER=psm2
$ mpiexec -n 1 IMB-MPI1 -help
[0] MPI startup(): Intel(R) MPI Library, Version 2021.9  Build 20230307 (id: d82b3071db)
[0] MPI startup(): Copyright (C) 2003-2023 Intel Corporation.  All rights reserved.
[0] MPI startup(): library kind: release
[0] MPI startup(): libfabric version: 1.13.2rc1-impi
Abort(2139535) on node 0 (rank 0 in comm 0): Fatal error in PMPI_Init: Other MPI error, error stack:
MPIR_Init_thread(176)........: 
MPID_Init(1546)..............: 
MPIDI_OFI_mpi_init_hook(1518): 
open_fabric(2572)............: 
find_provider(2694)..........: OFI fi_getinfo() failed (ofi_init.c:2694:find_provider:No data available)

 

Additionally, from fi_info:

fi_info -l
psm2:
    version: 113.20
mlx:
    version: 1.4
psm3:
    version: 1104.1000
ofi_rxm:
    version: 113.20
verbs:
    version: 113.20
tcp:
    version: 113.20
sockets:
    version: 113.20
shm:
    version: 116.10
ofi_hook_noop:
    version: 113.20
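
As far as I understand, the `fi_info -l` listing only shows which providers this libfabric build has registered; it does not show whether psm2 can actually open an Omni-Path device on the node. A minimal sketch of a follow-up check, assuming the hfi1 driver exposes its device nodes under `/dev/hfi1*` and that the `fi_info` on `PATH` is the one of interest. The function name `check_psm2` is made up for illustration:

```shell
#!/bin/sh
# Hypothetical helper: report whether the pieces PSM2 needs are visible on this node.
check_psm2() {
    # Assumption: the Omni-Path hfi1 driver creates device nodes matching /dev/hfi1*.
    if ls /dev/hfi1* >/dev/null 2>&1; then
        echo "hfi1 device: found"
    else
        echo "hfi1 device: missing"
    fi

    # Ask libfabric whether the psm2 provider returns any usable endpoint,
    # rather than only whether the provider is listed.
    if command -v fi_info >/dev/null 2>&1; then
        if fi_info -p psm2 >/dev/null 2>&1; then
            echo "psm2 provider: usable"
        else
            echo "psm2 provider: not usable"
        fi
    else
        echo "psm2 provider: fi_info not on PATH"
    fi
}

check_psm2
```

If the device line reports "missing" on a compute node, the problem is probably below libfabric (driver or permissions) and not in the Intel MPI settings. Running the same check on the login node and on a compute node can help tell the two cases apart.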

 

 I would really appreciate some guidance. 

2 Replies
TobiasK
Moderator

@lhcamilo Please try again with the latest Intel MPI 2021.13 version included in oneAPI 2024.2.1

Is the Mellanox UCX stack installed?
Why are you using the following?

source vars.sh -i_mpi_ofi_internal=1

Do you have another libfabric version installed and is that one present in your path?

Can you please show the output of the following command?

which fi_info

 

lhcamilo
Beginner

Hello Tobias, 

Thank you for your response. 

UCX/1.14.1 is installed. It may be worth noting that we are using EasyBuild.

> Why are you using 

source vars.sh -i_mpi_ofi_internal=1


I wanted to force the use of the libfabric (and psm2) libraries shipped with the Intel package, rather than the ones installed on the system.

$ which fi_info 

$APPS/impi/2021.9.0-intel-compilers-2023.1.0/mpi/2021.9.0/libfabric/bin/fi_info

 

As far as I can tell, it is using the correct libfabric.

Thanks for the help 
