Intel® MPI Library
Get help with building, analyzing, optimizing, and scaling high-performance computing (HPC) applications.

Received data has zeros...

seongyun_k_
Beginner

I have a few megabytes of data and a fixed-size buffer (4 MB).

The sender thread calls MPI_Send repeatedly, sending one buffer's worth of data at a time until all the data has been sent.
The receiver knows in advance how many bytes it will receive and calls MPI_Recv repeatedly, using a buffer of the same fixed size (4 MB).

Sometimes the receiver gets all zeros, but only for the last chunk (the remaining data, which may be smaller than 4 MB).

 - I checked that the sender sends the correct data ('CheckContents' in the code below).
 - I checked that the receiver receives the correct number of bytes.

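void Sender(int bytes_to_send) {
   const int kBufSize = 4 * 1024 * 1024;   // 4 MB
   char* buffer = new char[kBufSize];
   int remaining = bytes_to_send;
   while (remaining > 0) {
      // Send 4 MB, or whatever is left for the last chunk.
      int chunk = remaining < kBufSize ? remaining : kBufSize;
      memcpy_from_some_other_buffer_into(buffer, chunk);
      CheckContents(buffer, chunk);
      // dest and tag are placeholders for the actual destination rank and message tag.
      MPI_Send(buffer, chunk, MPI_CHAR, dest, tag, MPI_COMM_WORLD);
      CheckContents(buffer, chunk);
      remaining -= chunk;
   }
   delete [] buffer;
}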

The receiver has exactly the same loop structure, except that it uses 'MPI_Recv'.
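
The chunking scheme described above can be sketched without MPI. This is only a minimal illustration, assuming both sides derive the chunk sizes from the same total byte count and buffer size; `ChunkSizes` is a hypothetical helper name, not part of the original code. The point it shows is that sender and receiver should agree on every chunk size, including the shorter last one.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper: split total_bytes into chunks of at most buf_size bytes.
// Sender and receiver can both call it with the same arguments, so the count
// passed to each MPI_Send matches the count expected by the matching MPI_Recv.
std::vector<std::size_t> ChunkSizes(std::size_t total_bytes, std::size_t buf_size) {
    std::vector<std::size_t> sizes;
    if (buf_size == 0) {
        return sizes;  // a zero-sized buffer cannot carry any data
    }
    std::size_t remaining = total_bytes;
    while (remaining > 0) {
        // Full buffer for every chunk except possibly the last one.
        std::size_t chunk = std::min(remaining, buf_size);
        sizes.push_back(chunk);
        remaining -= chunk;
    }
    return sizes;
}
```

For 10 MB of data and a 4 MB buffer this yields two full chunks followed by one 2 MB chunk, which is the case where the zeros were observed.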

1. Is it possible that MPI_Send() still has the reference to the 'buffer' that I delete right after the last call?
2. Is there any method that I can use to debug this problem?

 

Can the following flags affect the program's correctness?
I_MPI_FABRICS
I_MPI_FALLBACK
MPICH_ASYNC_PROGRESS
I_MPI_PIN
I_MPI_DYNAMIC_CONNECTION

Michael_Intel
Moderator

Hello,

1. Is it possible that MPI_Send() still has the reference to the 'buffer' that I delete right after the last call?

No. Internally, the blocking MPI_Send() may be handled much like the non-blocking MPI_Isend(), but in this particular case (the so-called eager protocol) the MPI library creates a copy of the send buffer before returning control to the user. In any case, MPI_Send() returns only once the send buffer can safely be reused or freed.

2. Is there any method that I can use to debug this problem?

Yes, in cases where your code violates the MPI standard or where the data transmission gets corrupted, the Intel® Trace Analyzer and Collector provides some correctness checking functionality. Please see the related reference manual for further information: https://software.intel.com/en-us/node/561293
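
Independently of that tool, one lightweight check is to compare a per-chunk checksum on both sides. This is not something suggested in the reply above; it is a minimal sketch, assuming the sender transmits the checksum alongside each chunk and the receiver recomputes it over the bytes it actually received. `Fnv1a64` and `AllZeros` are hypothetical helper names; the hash is the standard 64-bit FNV-1a.

```cpp
#include <cstddef>
#include <cstdint>

// 64-bit FNV-1a hash over a byte range (hypothetical helper for debugging).
std::uint64_t Fnv1a64(const char* data, std::size_t len) {
    std::uint64_t h = 14695981039346656037ULL;  // FNV-1a 64-bit offset basis
    for (std::size_t i = 0; i < len; ++i) {
        h ^= static_cast<unsigned char>(data[i]);
        h *= 1099511628211ULL;  // FNV 64-bit prime
    }
    return h;
}

// Returns true if every byte in the range is zero (the symptom reported above).
bool AllZeros(const char* data, std::size_t len) {
    for (std::size_t i = 0; i < len; ++i) {
        if (data[i] != 0) {
            return false;
        }
    }
    return true;
}
```

A mismatch between the sender's and receiver's checksum for the last chunk would suggest the data changed in transit or in the receive path; matching checksums would point toward the data already being zero before the send.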

Can the following flags affect the program's correctness?

No.

Best regards,

Michael
