Intel® MPI Library
Get help with building, analyzing, optimizing, and scaling high-performance computing (HPC) applications.

Intel MPI hangs with empty sent derived datatypes

gkarpa
Beginner

Hello all,

In our project, we have encountered a behavior that seems a bit strange so I would like to request some feedback.

We are using a collective operation (MPI_Gatherv) in which each process sends 1 element of a derived datatype. Depending on the application's calculations, this datatype may contain no data on some processes. The root then adjusts accordingly and specifies that it will receive 0 elements from those processes.

Running such an example with Intel MPI 2019.7.216 on Windows 10 x64 results in a hang and the operation never completes. To my understanding, this scenario should work, since the type signatures implied on both sides match: the sender sends 1 derived type made up of 0 elements, i.e. 0 bytes, and the receiver expects 0 elements of a primitive datatype, which is 0 bytes as well.

Changing the sendcount argument to 0 for the processes that have the empty datatype works, but that doesn't seem to me like the correct way to go.

I have managed to reproduce this behavior in a minimal C example (2 processes, a simple contiguous datatype and integers), which I'm pasting here. I should also note that the same example seems to run normally with Open MPI on a Linux machine. Any insight would be greatly appreciated.

Thanks a lot in advance!

#include <stdio.h>
#include <mpi.h>

int main(int argc, char** argv) {
  MPI_Init(&argc, &argv);

  int comm_rank, comm_size;
  MPI_Comm_rank(MPI_COMM_WORLD, &comm_rank);
  MPI_Comm_size(MPI_COMM_WORLD, &comm_size);
  if (comm_size != 2) {
    printf("Must be run with 2 processors!\n");
    MPI_Finalize();
    return 0;
  }

  int sdata[] = {1};

  // How many elements will be sent by each process (rank 0: 1 element, rank 1: 0 elements) and received by rank 0
  int srcounts[] = {1, 0};

  MPI_Datatype newtype;
  MPI_Type_contiguous(srcounts[comm_rank], MPI_INT, &newtype);
  MPI_Type_commit(&newtype);

  int rdispls[] = {0, 1}, rdata[] = {-1, -1};
  MPI_Gatherv(sdata, 1, newtype, rdata, srcounts, rdispls, MPI_INT, 0, MPI_COMM_WORLD);

  // Substituting the above MPI_Gatherv call with the following should result in a proper execution.
  // if (comm_rank == 0) {
  //   MPI_Gatherv(sdata, 1, newtype, rdata, srcounts, rdispls, MPI_INT, 0, MPI_COMM_WORLD);
  // } else {
  //   MPI_Gatherv(sdata, 0, newtype, rdata, srcounts, rdispls, MPI_INT, 0, MPI_COMM_WORLD);
  // }

  MPI_Type_free(&newtype);

  if (comm_rank == 0) {
    for (int i = 0; i < 2; ++i) {
      printf("rdata[%d]=%d\n", i, rdata[i]);
    }
  }
  MPI_Finalize();
  return 0;
}
6 Replies
PrasanthD_intel
Moderator

Hi George,


Thanks for reporting the issue to us.

We ran the reproducer code at our end and observed that it hangs when the sendcount in MPI_Gatherv is set to 1 on rank 1.

We then ran the reproducer with ITAC (Intel Trace Analyzer and Collector), which checks the correctness of MPI code.

ITAC reported a GLOBAL:COLLECTIVE:SIZE_MISMATCH at local rank 1.

Below are the details of the ITAC output:


$ mpiicc -check_mpi gather.c -o out

$ mpirun -np 2 ./out

...

for rank 0 srcounts=1, rdispls=0
for rank 1 srcounts=0, rdispls=1

[0] ERROR: GLOBAL:COLLECTIVE:SIZE_MISMATCH: error

[0] ERROR:  Mismatch found in local rank [1] (global rank [1]),

[0] ERROR:  other processes may also be affected.

[0] ERROR:  No problem found in local rank [0] (same as global rank):

[0] ERROR:    MPI_Gatherv(*sendbuf=0x7ffd46deaf98, sendcount=1, sendtype=MPI_INT, *recvbuf=0x7ffd46deaf90, *recvcounts=0x7ffd46deaf80, *displs=0x7ffd46deaf88, recvtype=MPI_INT, root=0, comm=MPI_COMM_WORLD)

[0] ERROR:    main (/home/u29999/mpi/out)

[0] ERROR:    __libc_start_main (/lib/x86_64-linux-gnu/libc-2.27.so)

[0] ERROR:    _start (/home/u29999/mpi/out)

[0] ERROR:  Root expects 0 items but 1 sent by local rank [1] (same as global rank):

[0] ERROR:    MPI_Gatherv(*sendbuf=0x7ffdce945518, sendcount=1, sendtype=MPI_INT, *recvbuf=0x7ffdce945510, *recvcounts=0x7ffdce945500, *displs=0x7ffdce945508, recvtype=MPI_INT, root=0, comm=MPI_COMM_WORLD)

[0] ERROR:    main (/home/u29999/mpi/out)

[0] ERROR:    __libc_start_main (/lib/x86_64-linux-gnu/libc-2.27.so)

[0] ERROR:    _start (/home/u29999/mpi/out)

[0] INFO: 1 error, limit CHECK-MAX-ERRORS reached => aborting

[0] WARNING: starting premature shutdown


[0] INFO: GLOBAL:COLLECTIVE:SIZE_MISMATCH: found 1 time (1 error + 0 warnings), 0 reports were suppressed

[0] INFO: Found 1 problem (1 error + 0 warnings), 0 reports were suppressed.


Here you can observe that the root expects 0 items, but the gather is sending 1 item.

We think the MPI program hangs because it is expecting 1 item from local rank 1 and never gets it.


We are investigating this further and will get back to you soon.


Regards

Prasanth


gkarpa
Beginner

Hi Prasanth,

Thanks a lot for reaching out and for the timely response!

Indeed, there is a count mismatch between the sender and the receiver. However, I think this should not be a problem, since the derived datatype itself has no data inside. So the number of bytes on both sides does match (rank 1 sends 1 contiguous datatype that consists of 0 MPI_INTs, and rank 0 expects 0 MPI_INTs from rank 1).

This seems to be in accordance with the MPI 3.1 standard specification (page 152), which states that:

The type signature implied by sendcount, sendtype on process i must be equal to the type signature implied by recvcounts[i], recvtype at the root.
This implies that the amount of data sent must be equal to the amount of data received, pairwise between each process and the root.
Distinct type maps between sender and receiver are still allowed, as illustrated in Example 5.6.
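
For what it's worth, a quick way to check the byte counts in the reproducer is to query the committed datatype with MPI_Type_size. The snippet below is only an illustrative check (not part of my original example), added right after MPI_Type_commit:

  /* Illustrative check (not in the original reproducer): MPI_Type_size reports
     how many bytes one element of the committed datatype occupies. On rank 1 the
     contiguous type holds 0 MPI_INTs, so its size is 0 bytes, which matches the
     0 elements the root expects to receive from that rank. */
  int type_size;
  MPI_Type_size(newtype, &type_size);
  printf("rank %d: newtype size = %d bytes\n", comm_rank, type_size);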

Please feel free to correct me if I have misunderstood something. Thanks in advance; I'm looking forward to the results of your investigation.

Regards,

George

PrasanthD_intel
Moderator

Hi George,


Sorry about the late reply.

From the excerpt of the MPI standard you mentioned, it can be implied that the type signatures of sendtype and recvtype should be equal, which is valid in your code.

However, the standard says nothing about the sendcount for an empty datatype.

According to the standard, sendcount is defined as the number of elements in the send buffer (non-negative integer).

So in this case, since we are sending an empty datatype, the sendcount has to be 0.

We suggest deriving the sendcount from the rank (i.e. from the per-rank counts), so there is no need for an if clause for each rank.
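
For example, with the srcounts array from your reproducer, a sketch of the suggested change could look like the following (illustrative only):

  // Sketch of the suggested change: derive the sendcount per rank, so a rank
  // whose datatype is empty passes sendcount = 0 and no if clause is needed.
  int sendcount = (srcounts[comm_rank] > 0) ? 1 : 0;
  MPI_Gatherv(sdata, sendcount, newtype, rdata, srcounts, rdispls, MPI_INT, 0, MPI_COMM_WORLD);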


Regards

Prasanth


PrasanthD_intel
Moderator

Hi George,


Are you satisfied with the given explanation?

Reach out to us if you still have any doubts regarding the specification.


Regards

Prasanth


gkarpa
Beginner

Hi Prasanth,

Yes, I get your reasoning and it makes sense. I will update the sendcounts in the code accordingly. Thanks again for your time and explanations.

Regards,

George

PrasanthD_intel
Moderator

Hi George,


Thanks for the confirmation.

We are closing this issue and will no longer respond to this thread. If you require additional assistance from Intel, please start a new thread. Any further interaction in this thread will be considered community only.


Regards

Prasanth

