Intel® MPI Library

Intel MPI 3.2 issue - "open_hca: getaddr_netdev ERROR: Connection refused. Is ib0 configured?"

bamboo7413
Hi,

I ran into an issue when upgrading the Intel MPI Library from 3.1 to 3.2.
For the same source code, there's an issue when linking against the 3.1 library.
The error log is below:

hpc-p-19:18677: open_hca: getaddr_netdev ERROR: Connection refused. Is ib0 configured?
hpc-p-17:18094: open_hca: getaddr_netdev ERROR: Connection refused. Is ib0 configured?
hpc-p-5:18168: open_hca: getaddr_netdev ERROR: Connection refused. Is ib0 configured?
hpc-p-6:18157: open_hca: getaddr_netdev ERROR: Connection refused. Is ib0 configured?
hpc-p-15:18059: open_hca: getaddr_netdev ERROR: Connection refused. Is ib0 configured?
.............
[12] MPI startup(): DAPL provider OpenIB-mthca0-1 specified in DAPL configuration file /etc/dat.conf
[98] MPI startup(): DAPL provider OpenIB-mthca0-1 specified in DAPL configuration file /etc/dat.conf
[77] MPI startup(): DAPL provider OpenIB-mthca0-1 specified in DAPL configuration file /etc/dat.conf

Thanks!
bamboo7413
Hi,

Sorry for the typo. It should be:
For the same source code, there's an issue when linking against the 3.2 library.
But it's OK when linking against the 3.1 library.

Thanks!

Dmitry_K_Intel2
Hi bamboo7413,

Thanks for your interest in the Intel MPI Library.
It seems to me that something is wrong with your environment or settings. The message "open_hca: getaddr_netdev ERROR" comes from the DAPL library, not from MPI, and should not depend on the MPI version.

Could you provide your dat.conf and your command line? You can also try running your application with '-genv I_MPI_DEBUG 2' to get additional debug information from the MPI library.

Regards!
Dmitry
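
Since the error text asks whether ib0 is configured, one quick check that does not involve MPI at all is to see whether that network interface exists on each node. The following is a minimal Python sketch under stated assumptions: `missing_interfaces` is a hypothetical helper name, it relies on `socket.if_nameindex()` (available on Unix-like systems), and it only checks that the interface exists, not that it has an IP address assigned.

```python
import socket


def missing_interfaces(wanted, present=None):
    """Return the names in `wanted` that are not network interfaces on this host.

    `present` can be passed explicitly (useful for testing); by default the
    interface names are read from the operating system.
    """
    if present is None:
        # socket.if_nameindex() returns a list of (index, name) pairs.
        present = {name for _index, name in socket.if_nameindex()}
    return [name for name in wanted if name not in present]


# Example: run on a compute node to see whether ib0 and ib1 are visible.
# print(missing_interfaces(["ib0", "ib1"]))
```

An empty list means every named interface is present; a non-empty list names the interfaces the node does not have.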

bamboo7413
Thanks, here's the information you asked for. Could you please help me locate the issue?
1) dat.conf
OpenIB-cma u1.2 nonthreadsafe default libdaplcma.so.1 dapl.1.2 "ib0 0" ""
OpenIB-cma-1 u1.2 nonthreadsafe default libdaplcma.so.1 dapl.1.2 "ib1 0" ""
OpenIB-mthca0-1 u1.2 nonthreadsafe default libdaplscm.so.1 dapl.1.2 "mthca0 1" ""
OpenIB-mthca0-2 u1.2 nonthreadsafe default libdaplscm.so.1 dapl.1.2 "mthca0 2" ""
OpenIB-mlx4_0-1 u1.2 nonthreadsafe default libdaplscm.so.1 dapl.1.2 "mlx4_0 1" ""
OpenIB-mlx4_0-2 u1.2 nonthreadsafe default libdaplscm.so.1 dapl.1.2 "mlx4_0 2" ""
OpenIB-ipath0-1 u1.2 nonthreadsafe default libdaplscm.so.1 dapl.1.2 "ipath0 1" ""
OpenIB-ipath0-2 u1.2 nonthreadsafe default libdaplscm.so.1 dapl.1.2 "ipath0 2" ""
OpenIB-ehca0-1 u1.2 nonthreadsafe default libdaplscm.so.1 dapl.1.2 "ehca0 1" ""
OpenIB-iwarp u1.2 nonthreadsafe default libdaplcma.so.1 dapl.1.2 "eth2 0" ""

2) command line
time mpirun -f $LSB_DJOB_HOSTFILE -r ssh -env I_MPI_DEBUG 3 -np $LSB_DJOB_NUMPROC _our_mpi_program_
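
To make the listing above easier to read, here is a minimal Python sketch that splits each dat.conf line into its provider name, library, and device string. The field positions are assumed from the listing itself (provider first, library fifth, quoted device string seventh), and `parse_dat_conf` and `entries_using` are hypothetical helper names, not part of DAPL or Intel MPI.

```python
import shlex


def parse_dat_conf(text):
    """Parse dat.conf text into a list of dicts, skipping blanks and comments."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # shlex keeps quoted fields such as "ib0 0" together as one item.
        fields = shlex.split(line)
        if len(fields) < 7:
            continue  # not a complete entry; ignore it
        parts = fields[6].split()
        entries.append({
            "provider": fields[0],
            "library": fields[4],
            "device": parts[0] if parts else "",
            "port": parts[1] if len(parts) > 1 else "",
        })
    return entries


def entries_using(entries, library_prefix):
    """Return the entries whose library name starts with `library_prefix`."""
    return [e for e in entries if e["library"].startswith(library_prefix)]
```

In the listing above, `entries_using(entries, "libdaplcma")` would pick out the entries whose device strings name network interfaces (ib0, ib1, eth2), while the libdaplscm entries name an adapter and port such as "mthca0 1".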

Any comments?
Dmitry_K_Intel2
Hi bamboo7413,

Could you try commenting out the first two lines in your dat.conf file?
Like this:

#OpenIB-cma u1.2 nonthreadsafe default libdaplcma.so.1 dapl.1.2 "ib0 0" ""
#OpenIB-cma-1 u1.2 nonthreadsafe default libdaplcma.so.1 dapl.1.2 "ib1 0" ""
OpenIB-mthca0-1 u1.2 nonthreadsafe default libdaplscm.so.1 dapl.1.2 "mthca0 1" ""
OpenIB-mthca0-2 u1.2 nonthreadsafe default libdaplscm.so.1 dapl.1.2 "mthca0 2" ""


and let me know if it helps.

Regards!
Dmitry
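
If the same change has to be tried on many nodes, the edit suggested above can be scripted. This is a minimal Python sketch, not an official tool: `comment_out_providers` is a hypothetical helper that prefixes `#` to entries whose first field matches one of the given provider names. It works on text only, so reading and writing /etc/dat.conf (and keeping a backup) is left to the caller.

```python
def comment_out_providers(text, providers):
    """Return dat.conf text with the named provider entries commented out."""
    result = []
    for line in text.splitlines():
        fields = line.split()
        already_commented = line.lstrip().startswith("#")
        # Match on the exact provider name so that "OpenIB-cma" does not
        # also disable "OpenIB-cma-1" unless it is listed explicitly.
        if fields and not already_commented and fields[0] in providers:
            line = "#" + line
        result.append(line)
    return "\n".join(result)


# Example: disable the two entries discussed in this thread.
# new_text = comment_out_providers(old_text, {"OpenIB-cma", "OpenIB-cma-1"})
```

Running it twice on the same text gives the same result, because lines that are already commented are left alone.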