With Intel MPI 2019.6.166 (IPSXE 2020.0.166, Mellanox HDR, MLNX_OFED_LINUX-4.7-3.2.9.0) I am getting 2.5x slower performance compared to another cluster with Intel MPI 2019.1 (Mellanox EDR, MLNX_OFED_LINUX-5.0-2.1.8.0).
I suspect that Intel MPI 2019.6.166 is not picking the right IB transport. Which of the values below should be set in the UCX_TLS env variable for mpiexec? (The command line I have been experimenting with is shown after the transport list.)
$ ucx_info -d | grep Transport
# Transport: posix
# Transport: sysv
# Transport: self
# Transport: tcp
# Transport: tcp
# Transport: rc
# Transport: rc_mlx5
# Transport: dc_mlx5
# Transport: ud
# Transport: ud_mlx5
# Transport: cm
# Transport: cma
# Transport: knem
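For reference, this is roughly the kind of command line I have been experimenting with. The transport list is only my guess, taken from the accelerated mlx5 entries in the ucx_info output above, and the rank count and application name (./my_app) are placeholders:
$ mpiexec -n 64 -genv UCX_TLS rc_mlx5,ud_mlx5,sysv,posix,self ./my_app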
Also, Open MPI has the env variable UCX_NET_DEVICES=mlx5_0:1 to select which IB interface to use (my Open MPI invocation is shown after the ibstat output below). Please let me know the equivalent variable for Intel MPI 2020.
# ibstat
CA 'mlx5_0'
CA type: MT4123
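For comparison, this is how I pin the HCA with Open MPI (mpirun's -x exports the variable to all ranks; ./my_app is again a placeholder). I am looking for the equivalent with Intel MPI's mpiexec:
$ mpirun -np 64 -x UCX_NET_DEVICES=mlx5_0:1 ./my_app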
Hi,
Could you please share the benchmark program/code that you are using to compare Intel MPI 2019.6.166 with Intel MPI 2019.1?
For selecting the transport, we suggest you go through the following link.
Regarding UCX_NET_DEVICES in Intel MPI, we will get back to you.
Thanks
Prasanth
Hi,
Intel MPI uses UCX in the back end for InfiniBand, so the UCX environment variables are not specific to Open MPI.
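Assuming the mlx libfabric provider is in use, you should be able to pass the same UCX variables through mpiexec with -genv. A rough sketch (the device name is taken from your ibstat output; the rank count and application name are placeholders):
$ mpiexec -n 64 -genv FI_PROVIDER mlx -genv UCX_NET_DEVICES mlx5_0:1 ./my_app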
Also, regarding the slower performance of IMPI 2019u6, could you check the performance once more after changing the libfabric provider to verbs:
FI_PROVIDER=verbs
We have also made some improvements to mlx support in 2019u7. If possible, please upgrade to the latest version and check whether the performance improves.
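As a rough example (rank count and application name are placeholders), the verbs run could look like the line below; adding I_MPI_DEBUG should make Intel MPI print the libfabric provider it actually selected at startup, so you can confirm the change took effect:
$ mpiexec -n 64 -genv FI_PROVIDER verbs -genv I_MPI_DEBUG 5 ./my_app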
Regards
Prasanth
Thanks, Prasanth.
The performance issue is resolved now. The issue may have been with the firmware or the InfiniBand drivers. We tested performance with Intel MPI 2018u5 and Intel MPI 2019u6; Intel MPI 2018u5 is slightly faster than Intel MPI 2019u6.
Hi Sangamesh,
Glad to hear that your issue has been resolved.
We suggest using the latest version of Intel MPI (2019u7) instead of IMPI 2018u5.
Shall we close this thread considering your issue has been resolved?
Regards
Prasanth
Hi Sangamesh,
We are closing this thread.
Please raise a new thread for further queries.
Regards
Prasanth