Intel® oneAPI HPC Toolkit

Loopback traffic with libfabrics-verbs

eburrows
Beginner

I'm seeing a ton of loopback traffic on my Intel MPI / OpenFabrics installation using a Broadcom RoCE NIC, even with single-host tests.

 

I'm setting the fabric, provider, and device parameters as:

-genv I_MPI_FABRICS=shm:ofi -genv FI_PROVIDER=verbs -genv FI_VERBS_DEVICE_NAME=bnxt_re0

 

When I run a single-node test with I_MPI_FABRICS=shm, the job runs fine, but when shm:ofi is used, it seems to send all traffic through the RNIC. I believe this is causing congestion in multi-host tests. Any advice for enabling shared-memory communications for intra-node communications, and verbs for inter-node communications?
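For reference, this is roughly how I'm verifying which provider Intel MPI selects and watching the NIC counters during a single-node run (the Ethernet interface name `ens1f0` and the process count are illustrative; substitute your Broadcom port and binary):

```shell
# List the verbs providers/domains libfabric can see:
fi_info -p verbs

# Ask Intel MPI and libfabric to report the selected fabric/provider at startup:
I_MPI_DEBUG=5 FI_LOG_LEVEL=info \
I_MPI_FABRICS=shm:ofi FI_PROVIDER=verbs FI_VERBS_DEVICE_NAME=bnxt_re0 \
  mpirun -np 4 ./a.out

# Watch hardware counters while the job runs; byte counters rising during
# a single-node job suggest intra-node traffic is looping through the NIC:
watch -n1 'ethtool -S ens1f0 | grep -iE "bytes"'
```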

 

Intel MPI Version: Intel(R) MPI Library, Version 2019 Update 8 Build 20200624 (id: 4f16ad915)

 

4 Replies
ShivaniK_Intel
Moderator

Hi,


Thanks for reaching out to us.


>>>I'm seeing a ton of loopback traffic on my IntelMPI-openfabrics installation using a Broadcom RoCE NIC, even with single-host tests.


 Can you please elaborate more on this issue?


 Does this affect the performance of your application? If yes, can you please provide the performance details and steps to reproduce?


>>>Any advice for enabling shared-memory communications for intra-node communications, and verbs for inter-node communications?


The environment variables you are using are correct, but as a better practice we recommend passing them on the mpirun command line as below:



I_MPI_FABRICS=shm:ofi FI_PROVIDER=verbs FI_VERBS_DEVICE_NAME=bnxt_re0 mpirun -np <no. of processes> ./a.out


For more details, refer to the link below.


https://software.intel.com/content/www/us/en/develop/articles/intel-mpi-library-2019-over-libfabric....


Thanks & Regards

Shivani



eburrows
Beginner

Hi Shivani,

 

> Can you please elaborate more on this issue? Does this affect the performance of your application? If yes, can you please provide the performance details and steps to reproduce?

 

This is the Ansys Fluent tool, running the "Aircraftwing" benchmark. Performance is OK, but about 20% of the time the job hangs on startup, with each thread busy-waiting in poll_cq() and writing to /dev/infiniband/rdma_cm.

I do not expect to see local traffic in the NIC counters, and I suspect this is causing congestion and possible flooding of RDMACM.  I understand MPI_THREAD_SPLIT mode is intended to work this way, but I do not believe I am running the release_mt or debug_mt platforms required for THREAD_SPLIT.
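In case it helps, this is roughly how I'm checking which Intel MPI variant the binary actually loads (the binary name and PID are placeholders; a `release_mt` component in the resolved library path would indicate the multi-threaded, MPI_THREAD_SPLIT-capable library):

```shell
# Inspect which libmpi the dynamic linker resolves for the binary:
ldd ./fluent_binary | grep libmpi

# Or, for an already-running (hung) process, check its mapped libraries:
grep libmpi /proc/<pid>/maps
```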

 

> The environment variables you have been using are correct but for better practice, we recommend you use the command as below

 

Thanks, I switched to this way of invoking mpirun but, as expected, saw no change in behavior.

James_T_Intel
Moderator

In some versions, ANSYS does link with release_mt. ANSYS usually forces linkage against the specific Intel MPI version they ship; please check whether your version uses the release_mt or release path.


James_T_Intel
Moderator

I apologize for the delay in updating this thread. In phone discussions, this appears to be resolved by using Intel® MPI Library 2021.3 with I_MPI_OFI_PROVIDER=psm3.
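For anyone finding this thread later, the resolved configuration looks roughly like the following (invocation sketch only; the setvars.sh path, process count, and binary name are placeholders that depend on your installation):

```shell
# Set up the Intel MPI Library 2021.3 environment (path may differ per install):
source /opt/intel/oneapi/setvars.sh

# Select the psm3 libfabric provider, as discussed above:
I_MPI_OFI_PROVIDER=psm3 mpirun -np <nprocs> ./a.out
```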


I will be closing this thread for Intel support. Any further replies will be considered community only.

