bamboo7413
Beginner
79 Views

Intel MPI 3.2 issue - "open_hca: getaddr_netdev ERROR: Connection refused. Is ib0 configured?"

Hi,

I ran into an issue when upgrading the Intel MPI Library from 3.1 to 3.2.
For the same source code, there's an issue when linking against the 3.1 library.
The error log is below:

hpc-p-19:18677: open_hca: getaddr_netdev ERROR: Connection refused. Is ib0 configured?
hpc-p-17:18094: open_hca: getaddr_netdev ERROR: Connection refused. Is ib0 configured?
hpc-p-5:18168: open_hca: getaddr_netdev ERROR: Connection refused. Is ib0 configured?
hpc-p-6:18157: open_hca: getaddr_netdev ERROR: Connection refused. Is ib0 configured?
hpc-p-15:18059: open_hca: getaddr_netdev ERROR: Connection refused. Is ib0 configured?
.............
[12] MPI startup(): DAPL provider OpenIB-mthca0-1 specified in DAPL configuration file /etc/dat.conf
[98] MPI startup(): DAPL provider OpenIB-mthca0-1 specified in DAPL configuration file /etc/dat.conf
[77] MPI startup(): DAPL provider OpenIB-mthca0-1 specified in DAPL configuration file /etc/dat.conf

Thanks!
4 Replies
bamboo7413
Beginner

Quoting - bamboo7413
Hi,

Sorry for the typo. It should be:
For the same source code, there's an issue when linking against the 3.2 library.
But it's OK when linking against the 3.1 library.

Thanks!

Dmitry_K_Intel2
Employee

Hi bamboo7413,

Thanks for your interest in the Intel MPI Library.
It seems to me that something is wrong with your environment or settings. The message "open_hca: getaddr_netdev ERROR" comes from the DAPL library, not from MPI, and should not depend on the MPI version.

Could you provide your dat.conf and your command line? You can also try running your application with '-genv I_MPI_DEBUG 2' to get additional debug information from the MPI library.

Regards!
Dmitry
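
The DAPL message asks whether ib0 is configured, so a quick first check is whether that interface exists on each node. This is a minimal sketch, assuming a Linux node where network interfaces appear under /sys/class/net. The helper name iface_present and the SYSFS_NET override are hypothetical and exist only for illustration.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the named network interface is present.
# SYSFS_NET is an assumed override for testing; it defaults to /sys/class/net.
iface_present() {
    iface="$1"
    root="${SYSFS_NET:-/sys/class/net}"
    [ -e "$root/$iface" ]
}

# Report on the interface named in the DAPL error message.
if iface_present ib0; then
    echo "ib0 is present"
else
    echo "ib0 is missing"
fi
```

If the interface is present, `ip addr show ib0` shows whether an address is assigned to it.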

bamboo7413
Beginner



Thanks, here is the information you asked for. Could you please help me locate the issue?
1) dat.conf
OpenIB-cma u1.2 nonthreadsafe default libdaplcma.so.1 dapl.1.2 "ib0 0" ""
OpenIB-cma-1 u1.2 nonthreadsafe default libdaplcma.so.1 dapl.1.2 "ib1 0" ""
OpenIB-mthca0-1 u1.2 nonthreadsafe default libdaplscm.so.1 dapl.1.2 "mthca0 1" ""
OpenIB-mthca0-2 u1.2 nonthreadsafe default libdaplscm.so.1 dapl.1.2 "mthca0 2" ""
OpenIB-mlx4_0-1 u1.2 nonthreadsafe default libdaplscm.so.1 dapl.1.2 "mlx4_0 1" ""
OpenIB-mlx4_0-2 u1.2 nonthreadsafe default libdaplscm.so.1 dapl.1.2 "mlx4_0 2" ""
OpenIB-ipath0-1 u1.2 nonthreadsafe default libdaplscm.so.1 dapl.1.2 "ipath0 1" ""
OpenIB-ipath0-2 u1.2 nonthreadsafe default libdaplscm.so.1 dapl.1.2 "ipath0 2" ""
OpenIB-ehca0-1 u1.2 nonthreadsafe default libdaplscm.so.1 dapl.1.2 "ehca0 1" ""
OpenIB-iwarp u1.2 nonthreadsafe default libdaplcma.so.1 dapl.1.2 "eth2 0" ""

2) command line
time mpirun -f $LSB_DJOB_HOSTFILE -r ssh -env I_MPI_DEBUG 3 -np $LSB_DJOB_NUMPROC _our_mpi_program_

Any comments?
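
In the dat.conf listing above, the first quoted field of each entry names the device or interface that the provider uses (for example "ib0 0" for OpenIB-cma). A small sketch, assuming that layout holds for every entry, lists each provider next to its device so the names can be compared with what exists on the node. The function name list_dat_devices is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print "<provider> <device>" for each dat.conf entry.
# Assumes the first double-quoted field holds "<device> <port>", as in the
# listing above. Comment lines and lines without a quoted field are skipped.
list_dat_devices() {
    awk -F'"' '!/^[ \t]*#/ && NF > 1 {
        split($1, head, " ")   # head[1] is the provider name
        split($2, dev, " ")    # dev[1] is the device or interface name
        print head[1], dev[1]
    }' "$1"
}
```

For example, `list_dat_devices /etc/dat.conf` would print one line per provider, which makes it easy to see which entries refer to ib0.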
Dmitry_K_Intel2
Employee

Hi bamboo7413,

Could you try commenting out the first two lines in your dat.conf file?
Like this:

#OpenIB-cma u1.2 nonthreadsafe default libdaplcma.so.1 dapl.1.2 "ib0 0" ""
#OpenIB-cma-1 u1.2 nonthreadsafe default libdaplcma.so.1 dapl.1.2 "ib1 0" ""
OpenIB-mthca0-1 u1.2 nonthreadsafe default libdaplscm.so.1 dapl.1.2 "mthca0 1" ""
OpenIB-mthca0-2 u1.2 nonthreadsafe default libdaplscm.so.1 dapl.1.2 "mthca0 2" ""


and let me know if it helps.

Regards!
Dmitry
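
The suggested change can be tried on a copy of the file rather than by editing /etc/dat.conf directly. This is a minimal sketch, assuming the two entries to disable are the ones whose provider name starts with OpenIB-cma, as in the listing above. The function name comment_out_cma is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: write a copy of a dat.conf file in which every entry
# whose name starts with "OpenIB-cma" is commented out with a leading '#'.
# $1: input dat.conf, $2: output file. The input file is not modified.
comment_out_cma() {
    sed 's/^OpenIB-cma/#&/' "$1" > "$2"
}
```

For example, `comment_out_cma /etc/dat.conf ./dat.conf.nocma` produces a modified copy to inspect. Whether the DAPL library can be pointed at that copy (for instance through the DAT_OVERRIDE environment variable) depends on the installed DAPL version, so treat that part as an assumption to verify.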