Intel® oneAPI HPC Toolkit
Get help with building, analyzing, optimizing, and scaling high-performance computing (HPC) applications.

Intel MPI maximum tag value is much smaller with MLX provider than some other providers.

John_Young
New Contributor I

Since moving to more recent Intel MPI versions that use the mlx provider (FI_PROVIDER=mlx) and the shm:ofi fabric, we began noticing our code crashing for larger problems that used to run without issue. We tracked the problem down to the fact that the maximum MPI tag value when using the mlx provider is much smaller than when using the dapl fabric (Intel MPI 2018 and before) or the sockets provider with the shm:ofi fabric (Intel MPI 2019 and later).

A test case and output are attached for Intel MPI 2018 through 2021. For 2018 with the dapl fabric, the maximum tag value is 2147483647. For Intel 2019 and later with the sockets provider, the maximum tag value is 1073741823. For Intel 2019 and later with the mlx provider, the maximum tag value drops all the way to 1048575.

We realize that the maximum tag value is only guaranteed to be at least 32767, but this is a drastic drop, and Intel has suggested that we use the mlx provider in other threads. We understand that the mlx provider may be outside of the Intel MPI library, but is there any way to increase the maximum tag value when using the mlx provider?

Thanks,
John

PrasanthD_intel
Moderator

Hi John,


Thanks for providing us with the sample code and script. Yes, we have also observed that the maximum tag value for the mlx provider has been reduced.

However, regarding a way to increase the tag size, we will check with the internal team and let you know whether it is possible.

We will get back to you soon.


Regards

Prasanth


PrasanthD_intel
Moderator

Hi John,

 

Thanks for being patient.

I am escalating this thread to the internal team. They will look into the issue and get back to you soon.

 

Regards

Prasanth

Klaus-Dieter_O_Intel

Hi John,


The maximum tag value was changed from Intel MPI 2018 to 2019 for implementation reasons. I am sorry, but there is no option to increase the value.


You already noticed that the MPI 3.1 specification only guarantees a value of at least 32767 (see page 27 of https://www.mpi-forum.org/docs/mpi-3.1/mpi31-report.pdf). The actual value has to be queried with MPI_Comm_get_attr; not doing so creates a portability risk.

If it is difficult for an application to stay within the tag value range by reusing values, the suggested solution is to use more MPI communicators. For example, in a hybrid MPI/OpenMP code with message exchange between threads (MPI_THREAD_MULTIPLE), a communicator per thread pair could be used, with up to the value of the attribute MPI_TAG_UB different tags per communicator.


Regards

Klaus-Dieter


John_Young
New Contributor I
Hi,

Thank you for looking into this. Although a bit disappointing, we understand that it is out of Intel's control. We are going to rewrite the problematic communication routines to lump all sends/receives between a single pair of MPI processes into a single buffer, to avoid having to rely on the tag values.

John