I was trying to run benchmarks/imb/src_c/IMB-MPI1 with Intel MPI 2021.3.0 (as well as 2021.4.0) and got the following error:
[wxhao@dec0100pl5app src_c]$ mpirun -np 4 ./IMB-MPI1
dec0100pl5app:rank1.IMB-MPI1: Unable to create send CQ of size 5080 on mlx5_0: Cannot allocate memory
dec0100pl5app:rank0.IMB-MPI1: Unable to create send CQ of size 5080 on mlx5_0: Cannot allocate memory
dec0100pl5app:rank1.IMB-MPI1: Unable to initialize verbs
dec0100pl5app:rank1: PSM3 can't open nic unit: 0 (err=23)
Abort(1615503) on node 1 (rank 1 in comm 0): Fatal error in PMPI_Init: Other MPI error, error stack:
MPIR_Init_thread(138)........:
MPID_Init(1169)..............:
MPIDI_OFI_mpi_init_hook(1807):
create_endpoint(2473)........: OFI endpoint open failed (ofi_init.c:2473:create_endpoint:Invalid argument)
The code ran fine with Intel MPI 2021.2.0.
I am wondering what I need to do to make it run. Thanks.
Hi,
Thanks for reaching out to us.
Could you please confirm which libfabric provider (mlx, psm3, or verbs) you are using?
Could you please try the steps below:
ibv_devinfo -v
You can find the value of max_cq in the output of the above command.
If the value of max_cq is less than 5080, then try setting:
export UCX_RC_TX_CQ_LEN=<value-of-max_cq>
Now, try to run the IMB-MPI1 benchmark again.
Could you please try the above steps and let us know whether it works as expected?
If you still face any issues, could you please let us know whether you are able to run a sample MPI "hello world" program?
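For that sanity check, a minimal MPI "hello world" session might look like the following (the file name and process count are illustrative, and this assumes mpicc/mpirun from the Intel MPI installation are on your PATH):

```shell
# Write a minimal MPI hello-world program to a file (name is illustrative)
cat > hello_mpi.c <<'EOF'
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);  /* this process's rank */
    MPI_Comm_size(MPI_COMM_WORLD, &size);  /* total number of ranks */
    printf("Hello from rank %d of %d\n", rank, size);
    MPI_Finalize();
    return 0;
}
EOF

# Compile with the Intel MPI compiler wrapper and run on 4 ranks
mpicc hello_mpi.c -o hello_mpi
mpirun -np 4 ./hello_mpi
```

If this fails with the same CQ/PSM3 error, the problem is in the fabric setup rather than in the benchmark itself.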
Could you please provide us with the OS details along with the output of the command below? Also, please let us know how many nodes you are using to run the MPI benchmark.
I_MPI_DEBUG=30 mpirun -v -n <total-no-of-processes> -ppn <no-of-processes-per-node> IMB-MPI1
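Taken together, the diagnostic steps above might look like this session (the device limit and CQ value shown are illustrative; use the numbers your own card reports):

```shell
# Inspect the HCA device attributes; the max_cq / max_cqe fields
# in the output are the completion-queue limits of interest
ibv_devinfo -v | grep -i max_cq

# If the reported limit is below 5080, cap the UCX send CQ length to it
# (4096 is an example value, not a recommendation)
export UCX_RC_TX_CQ_LEN=4096

# Re-run the benchmark with verbose Intel MPI debug output enabled
I_MPI_DEBUG=30 mpirun -v -n 4 -ppn 4 ./IMB-MPI1
```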
Thanks & Regards,
Santosh
Hi Santosh,
Thanks for the help.
My issue was an incorrect libfabric provider. I did not select a provider and just used the default. We are using IPoIB on a Mellanox card. With Intel MPI 2021.2.0, the default is TCP, and I was able to run my code. With Intel MPI 2021.3.0 and 2021.4.0, the default is PSM3, and I got the error reported earlier. By setting I_MPI_OFI_PROVIDER to TCP, the code ran fine.
Winston
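For reference, the workaround described above can be applied like this (using the variable named in this thread; the FI_PROVIDER line is the equivalent libfabric-level setting, shown for completeness):

```shell
# Force Intel MPI to use the TCP libfabric provider instead of PSM3
export I_MPI_OFI_PROVIDER=tcp

# Equivalent knob at the libfabric level
export FI_PROVIDER=tcp

mpirun -np 4 ./IMB-MPI1
```

Pinning the provider explicitly also protects against further default changes in future Intel MPI releases.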
Hi,
Thanks for accepting our solution. Glad to know that your issue is resolved. If you need any additional information, please post a new question, as this thread will no longer be monitored by Intel.
Thanks & Regards,
Santosh
