Hello, I am using Intel MPI version 2018.5.24 for some CFD applications, running over RDMA with the DAPL fabric.
Some applications run fine for about 30 minutes and then fail with the error below.
node08:CMA:14822:c3697b40: 1954025166 us(18158 us): DAPL ERR create_qp Address family not supported by protocol
[1443:node08][../../src/mpid/ch3/channels/nemesis/netmod/dapl/dapl_conn_rc.c:502] error(0x120063): ofa-v2-cma-roe-enp65s0np0: could not create DAPL endpoint: DAT_INVALID_ADDRESS(DAT_INVALID_ADDRESS_MALFORMED)
My environment is as below.
I_MPI_DAPL_PROVIDER=ofa-v2-cma-roe-enp65s0np0
I_MPI_DAT_LIBRARY=/usr/lib64/libdat2.so.2.0.0
I_MPI_DEBUG=5
I_MPI_FABRICS=shm:dapl
I_MPI_FALLBACK=0
DAT_override=/etc/rdma/dat.conf
where /etc/rdma/dat.conf contains
ofa-v2-cma-roe-enp65s0np0 u2.0 nonthreadsafe default libdaplofa.so.2 dapl.2.0 "enp65s0np0 1" ""
Output from ibdev2netdev:
mlx5_0 port 1 ==> enp65s0np0 (Up)
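To rule out a plain naming mismatch between these three pieces of configuration, here is a minimal sketch in Python that cross-checks them. It assumes the dat.conf entry has eight whitespace-separated fields with quoted parameters, as in the line above. The field labels (`ia_name`, `ia_params`, and so on) and the function names are my own, not taken from any DAPL or Intel MPI API. It only checks that the names and port agree, so it would not by itself explain a failure that appears after 30 minutes.

```python
import re
import shlex

# Labels for the eight dat.conf fields; these names are illustrative assumptions.
DAT_FIELDS = ["ia_name", "api_version", "thread_safety", "default",
              "library", "provider_version", "ia_params", "platform_params"]


def parse_dat_conf_line(line):
    """Split one dat.conf entry into a dict, honouring the quoted fields."""
    fields = shlex.split(line)
    if len(fields) != len(DAT_FIELDS):
        raise ValueError(f"expected {len(DAT_FIELDS)} fields, got {len(fields)}")
    return dict(zip(DAT_FIELDS, fields))


def parse_ibdev2netdev(output):
    """Map each netdev name to (rdma_device, port, state)."""
    pattern = re.compile(r"^(\S+) port (\d+) ==> (\S+) \((\w+)\)")
    links = {}
    for raw in output.splitlines():
        match = pattern.match(raw.strip())
        if match:
            rdma_dev, port, netdev, state = match.groups()
            links[netdev] = (rdma_dev, port, state)
    return links


def check_consistency(dat_conf_line, provider, ibdev2netdev_output):
    """Return a list of human-readable mismatches (empty if none found)."""
    problems = []
    entry = parse_dat_conf_line(dat_conf_line)
    if entry["ia_name"] != provider:
        problems.append(f"provider {provider!r} != dat.conf name {entry['ia_name']!r}")
    parts = entry["ia_params"].split()
    if len(parts) != 2:
        problems.append(f"unexpected ia_params: {entry['ia_params']!r}")
        return problems
    netdev, port = parts
    links = parse_ibdev2netdev(ibdev2netdev_output)
    if netdev not in links:
        problems.append(f"netdev {netdev!r} not listed by ibdev2netdev")
    else:
        _rdma_dev, link_port, state = links[netdev]
        if link_port != port:
            problems.append(f"port {port} in dat.conf, port {link_port} in ibdev2netdev")
        if state != "Up":
            problems.append(f"link state for {netdev} is {state}")
    return problems


if __name__ == "__main__":
    dat_line = ('ofa-v2-cma-roe-enp65s0np0 u2.0 nonthreadsafe default '
                'libdaplofa.so.2 dapl.2.0 "enp65s0np0 1" ""')
    found = check_consistency(dat_line,
                              "ofa-v2-cma-roe-enp65s0np0",
                              "mlx5_0 port 1 ==> enp65s0np0 (Up)")
    print(found or "no mismatches found")
```

With the values from this post the three pieces agree, so the problem does not look like a simple typo in the provider or interface name.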
I don't understand why only some applications hit this error, or why they run for 30 minutes to an hour without any problem before failing. I would appreciate any hint on this issue.
Thank you very much in advance.
@Kyungrae, sorry to say, but Intel MPI 2018.5.24 is very old and no longer supported.
