I recently tried the APS tool to capture the message details (size and amount) for the WRF application on the intel 8280 (opa).
We launch 1 process per core, and here is the data -
It seems that significant amount of time is spent in the transfer of these 0 byte messages and with more number of nodes, the amount of messages increases. Could you please help me in understanding following-
Q: The significance of these 0 bytes messages and How are they related to MPI communication protocol?
I guess aps collects messages transferred between all processes (inter node + intra node), so
Q:Is there a way to check (from aps) that how much of these messages were transmitted to the network? (inter node messages - for 2 and 4 node runs)
Messages to a target include a Tag in addition to data (if any). Thus you can pass status (information) via Tag as opposed to in the data blob. If I were to guess, I suspect that this is the cause of 0-byte messages. A guessed-at example might be a SYNC or Watchdog tickle, but the user application can do this as well.
Actually APS just shows correct message sizes for senders only. For receiving ranks APS currently shows 0 byte messages. Engineering is working on a fix.