I_MPI_OFI_PROVIDER vs. I_MPI_OFA_ADAPTER_NAME

Hi,

The latter (I_MPI_OFA_ADAPTER_NAME) has been removed or deprecated in Intel MPI 2019.4, which we are currently testing.

We use environment modules. Up to and including Intel MPI 2018.3, our modules contained logic that set I_MPI_OFA_ADAPTER_NAME to the appropriate interface based on the hostname.
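
For illustration, that kind of hostname-based selection might look roughly like the shell sketch below. The hostname patterns and the adapter names assigned to them are hypothetical placeholders, not the actual module logic (environment modules themselves are normally written in Tcl).

```shell
#!/bin/sh
# Map a hostname to an adapter name.
# The patterns and adapter names are hypothetical placeholders.
adapter_for_host() {
    case "$1" in
        gpu*) echo "mlx5_1" ;;
        cpu*) echo "mlx5_0" ;;
        *)    echo "" ;;
    esac
}

# uname -n prints the node name of the current host.
adapter="$(adapter_for_host "$(uname -n)")"
if [ -n "$adapter" ]; then
    # Variable used up to and including Intel MPI 2018.3.
    export I_MPI_OFA_ADAPTER_NAME="$adapter"
fi
```

Hosts that match no pattern leave the variable unset, so the library falls back to its own default.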

I tried using I_MPI_OFI_PROVIDER the same way, and got an error message:

[0] MPI startup(): libfabric version: 1.7.2a-impi

Abort(1094799) on node 26 (rank 26 in comm 0): Fatal error in PMPI_Init: Other MPI error, error stack:
MPIR_Init_thread(666)......:
MPID_Init(922).............:
MPIDI_NM_mpi_init_hook(719): OFI addrinfo() failed (ofi_init.h:719:MPIDI_NM_mpi_init_hook:No data available)

I guess that this is either the wrong variable or I am not using it correctly.

I believe the syntax is: I_MPI_OFI_PROVIDER=<name>

Looking at the output of I_MPI_OFI_PROVIDER_DUMP I see stuff like:

name: mlx5_0
name: IB-0xfe80000000000000
prov_name: verbs;ofi_rxm
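
To see which provider names the dump actually reports, the `prov_name:` lines can be pulled out and de-duplicated. This is a rough sketch that assumes the dump has been saved to a file in the `key: value` layout shown above; the function name `list_providers` and the file name `dump.txt` are made up for the example.

```shell
#!/bin/sh
# List the unique provider names found in a saved provider dump.
# Assumes lines of the form "prov_name: verbs;ofi_rxm".
list_providers() {
    awk -F': *' '$1 ~ /^[ \t]*prov_name$/ { print $2 }' "$1" | sort -u
}

# Example (hypothetical file name):
#   list_providers dump.txt
```

If the variable expects a provider name rather than a device name such as mlx5_0 (an assumption on my part), then the value to try would be the first component of `prov_name`, for example `export I_MPI_OFI_PROVIDER=verbs`.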

I'd be grateful for any pointers.

Thanks!

2 Replies

I've also just noticed that although I have two interfaces (one configured for Ethernet, the other for InfiniBand), the output generated by setting export I_MPI_OFI_PROVIDER_DUMP=1 suggests that only the first interface (mlx5_0, Ethernet) is an available provider, whereas I actually want to use the second one (mlx5_1, InfiniBand).

 


Having investigated further, I believe that the only provider that supports my hardware is "verbs" (as this is the only one that supports InfiniBand):

https://software.intel.com/en-us/articles/intel-mpi-library-2019-over-libfabric

However, unless I am reading this incorrectly, it appears to be necessary to run TCP over the InfiniBand interface. Is anyone able to confirm?

Thanks!
