When using Intel MPI with I_MPI_FABRICS set to shm:ofa on EDR with OFED 4.6-22.214.171.124, the RDMA pull hangs.
In another incident, with a different version of the software and the same combination, memory corruption occurs.
Is there a known problem with the latest OFED (4.6) and Intel MPI (OFA)?
Note: The same software runs fine with the DAPL fabric selection.
Thanks in advance.
We tried reproducing your issue. We were unable to use OFA, as it is deprecated in the latest version of Intel MPI. Can you try it using OFI instead?
Please let us know if it is working.
Please find the environment details:
MPI Version: Intel 5.0.3
OFED Version: MLNX_OFED_LINUX-4.6-126.96.36.199 (OFED-4.6-1.0.1)
CA type: MT4119
Number of ports: 1
Firmware version: 16.25.1020
Hardware version: 0
Node GUID: 0xb8599f03001230cc
System image GUID: 0xb8599f03001230cc
Physical state: LinkUp
Base lid: 2
SM lid: 3
Capability mask: 0x2651e84a
Port GUID: 0xb8599f03001230cc
Link layer: InfiniBand
MPI Launch Command:
mpirun -l -print-all-exitcodes -cleanup -genv I_MPI_EAGER_THRESHOLD=524288 -genv I_MPI_MPIRUN_CLEANUP 1 -genv I_MPI_OFA_USE_XRC 0 -genv I_MPI_FABRICS=shm:ofa -n 1 -env I_MPI_OFA_NUM_ADAPTERS 1 -env I_MPI_OFA_ADAPTER_NAME mlx5_0 -host 192.168.1.5 <application> : -n 1 -env I_MPI_OFA_NUM_ADAPTERS 1 -env I_MPI_OFA_ADAPTER_NAME mlx4_0 -host 192.168.1.10 <application> : -n 1 -env I_MPI_OFA_NUM_ADAPTERS 1 -env I_MPI_OFA_ADAPTER_NAME mlx4_0 -host 192.168.1.10 <application> : -n 1 -env I_MPI_OFA_NUM_ADAPTERS 1 -env I_MPI_OFA_ADAPTER_NAME mlx5_0 -host 192.168.1.10 <application> : etc....
 [1#17091:17093@IMCNode001] MPI startup(ofa_utility.c:495): Start 1 ports per adapter
 [1#17091:17093@IMCNode001] MPID_nem_ofacm_init(ofa_init_cm.c:126): Init
 [1#17091:17093@IMCNode001] MPI startup(mpid_nem_init.c:481): shm and ofa data transfer modes
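The launch command above repeats the same per-rank argument set for every host/adapter pair, with the sets separated by colons. As a rough sketch of that structure, the following shows how such an argument list could be assembled from a list of pairs. The helper name `build_mpirun_args`, the `placements` list, and the `./application` stand-in are hypothetical; only options that already appear in the command above are used.

```python
import shlex

# Global options copied from the launch command in this thread.
GLOBAL_OPTS = [
    "-l", "-print-all-exitcodes", "-cleanup",
    "-genv", "I_MPI_EAGER_THRESHOLD=524288",
    "-genv", "I_MPI_MPIRUN_CLEANUP", "1",
    "-genv", "I_MPI_OFA_USE_XRC", "0",
    "-genv", "I_MPI_FABRICS=shm:ofa",
]


def build_mpirun_args(placements, application):
    """Return the mpirun argument list for one rank per (host, adapter) pair.

    `placements` and this helper are illustrative names, not part of Intel MPI.
    """
    args = ["mpirun"] + GLOBAL_OPTS
    for index, (host, adapter) in enumerate(placements):
        if index > 0:
            args.append(":")  # a colon separates per-rank argument sets
        args += [
            "-n", "1",
            "-env", "I_MPI_OFA_NUM_ADAPTERS", "1",
            "-env", "I_MPI_OFA_ADAPTER_NAME", adapter,
            "-host", host,
            application,
        ]
    return args


if __name__ == "__main__":
    # Host/adapter pairs taken from the first entries of the command above.
    placements = [
        ("192.168.1.5", "mlx5_0"),
        ("192.168.1.10", "mlx4_0"),
        ("192.168.1.10", "mlx5_0"),
    ]
    # "./application" stands in for the <application> placeholder.
    args = build_mpirun_args(placements, "./application")
    print(" ".join(shlex.quote(a) for a in args))
```

Keeping the pairs in a list makes it easier to spot which rank is pinned to which adapter (mlx4_0 versus mlx5_0) when comparing runs.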
The details below may help:
We are getting the following errors from almost all nodes in the cluster:
Errors for "Node001 HCA-2"
GUID 0xb8599f030005e1c0 port 1: [PortXmitWait == 4294967295]
Errors for "Node000 HCA-1"
GUID 0xe41d2d0300105351 port 1: [PortXmitWait == 4294967295]
Errors for "Node001 HCA-1"
GUID 0xe41d2d0300105481 port 1: [PortXmitWait == 4294967295]
One node shows the following:
GUID 0xcc47affff5f27e5 port 1: [VL15Dropped == 1] [PortXmitWait == 4956673]
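One detail in the counters above: 4294967295 equals 2^32 - 1, so those PortXmitWait values look pegged at the maximum of a 32-bit field rather than holding an exact count. That reading is an inference from the number alone. A minimal sketch for separating pegged counters from ordinary ones, assuming the line format shown above (the function names are hypothetical):

```python
import re

UINT32_MAX = 2**32 - 1  # 4294967295

# Matches lines such as:
#   GUID 0xb8599f030005e1c0 port 1: [PortXmitWait == 4294967295]
LINE_RE = re.compile(r"GUID\s+(0x[0-9a-fA-F]+)\s+port\s+(\d+):\s*(.*)")
COUNTER_RE = re.compile(r"\[(\w+)\s*==\s*(\d+)\]")


def parse_counters(line):
    """Return (guid, port, {counter: value}) or None if the line does not match."""
    match = LINE_RE.search(line)
    if match is None:
        return None
    guid, port, rest = match.groups()
    counters = {name: int(value) for name, value in COUNTER_RE.findall(rest)}
    return guid, int(port), counters


def saturated(counters):
    """Return the names of counters sitting exactly at the 32-bit maximum."""
    return sorted(name for name, value in counters.items() if value == UINT32_MAX)


if __name__ == "__main__":
    report = [
        "GUID 0xb8599f030005e1c0 port 1: [PortXmitWait == 4294967295]",
        "GUID 0xcc47affff5f27e5 port 1: [VL15Dropped == 1] [PortXmitWait == 4956673]",
    ]
    for line in report:
        guid, port, counters = parse_counters(line)
        print(guid, port, "pegged:", saturated(counters))
```

A pegged counter only says the limit was reached at some point since the last reset, so comparing values before and after a single run is more informative than the absolute number.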
Please also find the call stack from MPI level:
#0 0x00007f3f93dc7428 in pthread_mutex_lock () from /lib64/libpthread.so.0
#1 0x00007f3f94155e80 in dapl_evd_dequeue () from /home/klac/mount_path_Tool/Intel/mpi/lib-IB/libmpi.so.12
#2 0x00007f3f942c389b in ofacm_query_conn_evd () from /home/klac/mount_path_Tool/Intel/mpi/lib-IB/libmpi.so.12
#3 0x00007f3f942cf8f5 in MPID_nem_gen2_module_poll () from /home/klac/mount_path_Tool/Intel/mpi/lib-IB/libmpi.so.12
#4 0x00007f3f9429e6e0 in MPID_nem_network_poll () from /home/klac/mount_path_Tool/Intel/mpi/lib-IB/libmpi.so.12
#5 0x00007f3f940ec317 in PMPIDI_CH3I_Progress () from /home/klac/mount_path_Tool/Intel/mpi/lib-IB/libmpi.so.12
#6 0x00007f3f94112e17 in MPIDI_Win_unlock () from /home/klac/mount_path_Tool/Intel/mpi/lib-IB/libmpi.so.12
#7 0x00007f3f94394e46 in PMPI_Win_unlock () from /home/klac/mount_path_Tool/Intel/mpi/lib-IB/libmpi.so.12
The setting -genv I_MPI_FABRICS=shm:ofa will work only with the later version of IMPI, which is 2019. If you have to stay on IMPI 5.0.3, please use DAPL or OFI.
Please let me know if it solves your problem.
Could you please confirm the below points?
1. Does IMPI 5.0.3 support OFI? I believe OFI was introduced in IMPI 2018.
2. I have a stable system working with the following combination:
IMPI : 5.0.3
OFED : 2.6
3. I have an issue (MPI hang) with the combination below:
IMPI : 5.0.3
OFED : 4.6
DAPL is working fine, but with it we are not able to achieve automatic multi-rail functionality. If we can get OFA working with OFED 4.6 + IMPI 5.0.3, that will help.
Can you clarify why you need to stay on 5.0 Update 3? This version is no longer supported, and we have made many stability and performance improvements since then.
I_MPI_FABRICS=shm:ofi should not work in 5.0 Update 3; this option was not implemented in that version.
Do you have a small reproducer code you can share?
We tried switching to IMPI 2018 and had issues, and Intel recommended that we try IMPI 2019.
We are in the process of evaluating IMPI 2019, but this will take a little while; until then we have to stay on IMPI 5.
So we need a quick turnaround to avoid this issue.
You indicated that you are looking for multi-rail support. Have you tried the simplified launch settings in https://software.intel.com/en-us/articles/tuning-the-intel-mpi-library-basic-techniques#inpage-nav-3... and let mpirun assign ranks to adapters?