Greg_R_
Beginner
470 Views

MPI error on 2019 Beta version

 

I am the admin of a small cluster, trying to help a user. The user has been able to compile and run this code on a Cray with Intel XE 2017, update 7.

On our local cluster we only had 2017, update 1, so I installed the 2019 Beta version. The user was able to compile, but at run time she received this error:

 

mpirun -np 16 /users/kings/navgem-x/bin/sst_interp

Abort(1618063) on node 1 (rank 1 in comm 0): Fatal error in MPI_Init: Other MPI error, error stack:
MPIR_Init_thread(613)..........:
MPID_Init(378).................:
MPIDI_NM_mpi_init_hook(1047)...:
MPIDI_OFI_create_endpoint(1873): OFI EP enable failed (ofi_init.h:1873:MPIDI_OFI_create_endpoint:Cannot allocate memory)
Abort(1618063) on node 12 (rank 12 in comm 0): Fatal error in MPI_Init: Other MPI error, error stack:
MPIR_Init_thread(613)..........:
MPID_Init(378).................:
MPIDI_NM_mpi_init_hook(1047)...:
MPIDI_OFI_create_endpoint(1873): OFI EP enable failed (ofi_init.h:1873:MPIDI_OFI_create_endpoint:Cannot allocate memory)

Any clue as to what this means? Thank you.

2 Replies
abhinandan_n_
Beginner

Hi,

I am hitting a similar issue while using Intel MPI 2019 on the SLES 15 operating system. I would ask an Intel representative to help us as soon as possible, since our progress is blocked by this issue.

/usr/diags/mpi/impi/2019.0.117/bin64/mpiexec -verbose -genv LD_LIBRARY_PATH /usr/diags/mpi/impi/2019.0.117/lib64  -machinefile /tmp/mymachlist.154761.run -n 1 /usr/diags/mpi/intel/intel/bin/olconft.intel
[proxy:0:0@cfdhcp-91-13] pmi cmd from fd 0: cmd=init pmi_version=1 pmi_subversion=1
[proxy:0:0@cfdhcp-91-13] PMI response: cmd=response_to_init pmi_version=1 pmi_subversion=1 rc=0
[proxy:0:0@cfdhcp-91-13] pmi cmd from fd 0: cmd=get_maxes
[proxy:0:0@cfdhcp-91-13] PMI response: cmd=maxes kvsname_max=256 keylen_max=64 vallen_max=1024
[proxy:0:0@cfdhcp-91-13] pmi cmd from fd 0: cmd=get_appnum
[proxy:0:0@cfdhcp-91-13] PMI response: cmd=appnum appnum=0
[proxy:0:0@cfdhcp-91-13] pmi cmd from fd 0: cmd=get_my_kvsname
[proxy:0:0@cfdhcp-91-13] PMI response: cmd=my_kvsname kvsname=kvs_157792_0
Abort(1618319) on node 0 (rank 0 in comm 0): Fatal error in PMPI_Init: Other MPI error, error stack:
MPIR_Init_thread(607)......:
MPID_Init(793).............:
MPIDI_NM_mpi_init_hook(667): OFI addrinfo() failed (ofi_init.h:667:MPIDI_NM_mpi_init_hook:No data available)

 

Dmitry_G_Intel
Employee

Hi,

Could you please specify which hardware you are using (IB, Intel OPA, Ethernet, etc.)?

This error can happen in the following situations:

- The OFI library can't find the provider libraries (libpsm2-fi.so, libverbs-fi.so, ...) that are used to run on specific hardware. The path to them can be set with FI_PROVIDER_PATH='dir' (if you use IMPI 2019 Gold, it should be 'mpi_path'/intel64/libfabric/lib/prov).

- If you use IMPI 2019 Beta, OFI should be newer than 1.5. Make sure that the installed libfabric is newer than 1.5; otherwise, please use libfabric from the official GitHub repository: https://github.com/ofiwg/libfabric/

- If your MPI application is going to run on InfiniBand, IPoIB should be configured on each node. If the name of the IPoIB interface is not of the form "ibX" (where X is 0, 1, ...), please specify it with FI_VERBS_IFACE="IPoIB interface name".
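
The three checks in this list can be sketched as a small shell script. This is only a rough sketch under assumptions, not an official Intel procedure: MPI_PATH and its default value are hypothetical placeholders for the 'mpi_path' mentioned above, IPOIB_IFACE is a hypothetical variable with "ib0" as an assumed default, and the script assumes the fi_info utility that ships with libfabric (with its --version and -l options) and the Linux ip command may or may not be present, so each call is guarded.

```shell
#!/bin/sh
# Hypothetical install prefix standing in for 'mpi_path'; replace with your own.
MPI_PATH="${MPI_PATH:-/opt/intel/impi/2019}"

# 1. Point libfabric at the provider libraries (directory layout as quoted
#    in the reply for IMPI 2019 Gold) and list what is actually there.
export FI_PROVIDER_PATH="$MPI_PATH/intel64/libfabric/lib/prov"
if [ -d "$FI_PROVIDER_PATH" ]; then
    ls "$FI_PROVIDER_PATH"
else
    echo "provider directory not found: $FI_PROVIDER_PATH"
fi

# 2. Report the libfabric version and available providers, if fi_info is
#    installed. Compare the reported version against 1.5 by eye.
if command -v fi_info >/dev/null 2>&1; then
    fi_info --version || echo "fi_info --version failed"
    fi_info -l || echo "fi_info -l failed"
else
    echo "fi_info not found in PATH"
fi

# 3. InfiniBand only: check the IPoIB interface. FI_VERBS_IFACE is exported
#    only when the interface name does not follow the ibX pattern.
IPOIB_IFACE="${IPOIB_IFACE:-ib0}"   # assumed default name
case "$IPOIB_IFACE" in
    ib[0-9]*) echo "interface $IPOIB_IFACE uses default naming" ;;
    *)        export FI_VERBS_IFACE="$IPOIB_IFACE" ;;
esac
if command -v ip >/dev/null 2>&1 && ip addr show "$IPOIB_IFACE" >/dev/null 2>&1; then
    ip addr show "$IPOIB_IFACE"
else
    echo "interface $IPOIB_IFACE not found (or ip command unavailable)"
fi
```

Run it on each compute node before launching mpirun, for example with MPI_PATH and IPOIB_IFACE set on the command line, and check that the provider directory lists the library matching your hardware.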
