<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Incorrect program or MPI implementation bug? in Intel® MPI Library</title>
    <link>https://community.intel.com/t5/Intel-MPI-Library/Incorrect-program-or-MPI-implementation-bug/m-p/1055989#M4451</link>
    <description>&lt;P&gt;Hi,&lt;/P&gt;

&lt;P&gt;Below is a simple reproduction case for the issue we're facing:&lt;/P&gt;

&lt;PRE class="brush:cpp;"&gt;#include "stdio.h"
#include "mpi.h"
#include "stdlib.h"

int main(int argc, char* argv[]) {
    int rank;
    MPI_Group group;

    MPI_Init(&amp;amp;argc, &amp;amp;argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &amp;amp;rank);
    MPI_Comm_group(MPI_COMM_WORLD, &amp;amp;group);

    if (rank == 0) {
        printf("rank 0: about to send\n");
        MPI_Ssend(NULL, 0, MPI_INT, 1, 0, MPI_COMM_WORLD);
        printf("rank 0: send completed\n");
    } else {
        MPI_Request req[2];
        int which;

        MPI_Isend(NULL, 0, MPI_INT, 0, 0, MPI_COMM_WORLD, &amp;amp;req[0]);
        MPI_Irecv(NULL, 0, MPI_INT, 0, 0, MPI_COMM_WORLD, &amp;amp;req[1]);

        MPI_Waitany(2, req, &amp;amp;which, MPI_STATUS_IGNORE);

        if (which == 0) {
            printf("rank 1: send succeeded; cancelling receive request\n");
            MPI_Cancel(&amp;amp;req[1]);
            MPI_Wait(&amp;amp;req[1], MPI_STATUS_IGNORE);
        } else {
            printf("rank 1: receive succeeded; cancelling send request\n");
            MPI_Cancel(&amp;amp;req[0]);
            MPI_Wait(&amp;amp;req[0], MPI_STATUS_IGNORE);
        }
    }

    MPI_Finalize();
    return 0;
}
&lt;/PRE&gt;

&lt;P&gt;This program outputs the following, after which it hangs indefinitely:&lt;/P&gt;

&lt;PRE class="brush:plain;"&gt;rank 0: about to send
rank 1: send succeeded; cancelling receive request
&lt;/PRE&gt;

&lt;P&gt;I understand that this is caused by the "eager completion" of MPI_Isend() on rank 1. I also understand that the behaviour of a program that initiates an unmatched operation is undefined. However, I don't believe that is the case here, as I do eventually call MPI_Cancel() on the request. If that were not enough, wouldn't it imply that a program that simply does MPI_Isend(...); MPI_Cancel(...); MPI_Wait(...); is also incorrect?&lt;/P&gt;

&lt;P&gt;I also noticed that changing the MPI_Isend() into MPI_Issend() makes the program work as expected:&lt;/P&gt;

&lt;PRE class="brush:plain;"&gt;rank 0: about to send
rank 0: send completed
rank 1: receive succeeded; cancelling send request
&lt;/PRE&gt;

&lt;P&gt;So, to keep it short, my questions are:&lt;/P&gt;

&lt;OL&gt;
	&lt;LI&gt;Is the initial (MPI_Isend()) version of my program an incorrect MPI program, whose behaviour is undefined?&lt;/LI&gt;
	&lt;LI&gt;If so, then could you please explain why and point me to the relevant section of the MPI standard or any other resources that would clarify these matters for me?&lt;/LI&gt;
	&lt;LI&gt;Is the MPI_Issend() version of my program also incorrect?&lt;/LI&gt;
	&lt;LI&gt;If MPI_Issend() still doesn't make the program correct, can I at least be sure that, with the Intel implementation, it will always work as expected? Or is it just a coincidence that it does?&lt;/LI&gt;
&lt;/OL&gt;

&lt;P&gt;Many thanks to anyone willing to help me with this!&lt;/P&gt;

&lt;P&gt;- Adrian&lt;/P&gt;</description>
    <pubDate>Sun, 06 Jul 2014 17:25:21 GMT</pubDate>
    <dc:creator>Adrian_I_</dc:creator>
    <dc:date>2014-07-06T17:25:21Z</dc:date>
    <item>
      <title>Incorrect program or MPI implementation bug?</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/Incorrect-program-or-MPI-implementation-bug/m-p/1055989#M4451</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;

&lt;P&gt;Below is a simple reproduction case for the issue we're facing:&lt;/P&gt;

&lt;PRE class="brush:cpp;"&gt;#include "stdio.h"
#include "mpi.h"
#include "stdlib.h"

int main(int argc, char* argv[]) {
    int rank;
    MPI_Group group;

    MPI_Init(&amp;amp;argc, &amp;amp;argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &amp;amp;rank);
    MPI_Comm_group(MPI_COMM_WORLD, &amp;amp;group);

    if (rank == 0) {
        printf("rank 0: about to send\n");
        MPI_Ssend(NULL, 0, MPI_INT, 1, 0, MPI_COMM_WORLD);
        printf("rank 0: send completed\n");
    } else {
        MPI_Request req[2];
        int which;

        MPI_Isend(NULL, 0, MPI_INT, 0, 0, MPI_COMM_WORLD, &amp;amp;req[0]);
        MPI_Irecv(NULL, 0, MPI_INT, 0, 0, MPI_COMM_WORLD, &amp;amp;req[1]);

        MPI_Waitany(2, req, &amp;amp;which, MPI_STATUS_IGNORE);

        if (which == 0) {
            printf("rank 1: send succeeded; cancelling receive request\n");
            MPI_Cancel(&amp;amp;req[1]);
            MPI_Wait(&amp;amp;req[1], MPI_STATUS_IGNORE);
        } else {
            printf("rank 1: receive succeeded; cancelling send request\n");
            MPI_Cancel(&amp;amp;req[0]);
            MPI_Wait(&amp;amp;req[0], MPI_STATUS_IGNORE);
        }
    }

    MPI_Finalize();
    return 0;
}
&lt;/PRE&gt;

&lt;P&gt;This program outputs the following, after which it hangs indefinitely:&lt;/P&gt;

&lt;PRE class="brush:plain;"&gt;rank 0: about to send
rank 1: send succeeded; cancelling receive request
&lt;/PRE&gt;

&lt;P&gt;I understand that this is caused by the "eager completion" of MPI_Isend() on rank 1. I also understand that the behaviour of a program that initiates an unmatched operation is undefined. However, I don't believe that is the case here, as I do eventually call MPI_Cancel() on the request. If that were not enough, wouldn't it imply that a program that simply does MPI_Isend(...); MPI_Cancel(...); MPI_Wait(...); is also incorrect?&lt;/P&gt;

&lt;P&gt;I also noticed that changing the MPI_Isend() into MPI_Issend() makes the program work as expected:&lt;/P&gt;

&lt;PRE class="brush:plain;"&gt;rank 0: about to send
rank 0: send completed
rank 1: receive succeeded; cancelling send request
&lt;/PRE&gt;

&lt;P&gt;So, to keep it short, my questions are:&lt;/P&gt;

&lt;OL&gt;
	&lt;LI&gt;Is the initial (MPI_Isend()) version of my program an incorrect MPI program, whose behaviour is undefined?&lt;/LI&gt;
	&lt;LI&gt;If so, then could you please explain why and point me to the relevant section of the MPI standard or any other resources that would clarify these matters for me?&lt;/LI&gt;
	&lt;LI&gt;Is the MPI_Issend() version of my program also incorrect?&lt;/LI&gt;
	&lt;LI&gt;If MPI_Issend() still doesn't make the program correct, can I at least be sure that, with the Intel implementation, it will always work as expected? Or is it just a coincidence that it does?&lt;/LI&gt;
&lt;/OL&gt;

&lt;P&gt;Many thanks to anyone willing to help me with this!&lt;/P&gt;

&lt;P&gt;- Adrian&lt;/P&gt;</description>
      <pubDate>Sun, 06 Jul 2014 17:25:21 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/Incorrect-program-or-MPI-implementation-bug/m-p/1055989#M4451</guid>
      <dc:creator>Adrian_I_</dc:creator>
      <dc:date>2014-07-06T17:25:21Z</dc:date>
    </item>
    <item>
      <title>Hi Adrian,</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/Incorrect-program-or-MPI-implementation-bug/m-p/1055990#M4452</link>
      <description>&lt;P&gt;Hi Adrian,&lt;/P&gt;

&lt;P&gt;The MPI_Issend version is a correct program. The original MPI_Isend version is incorrect, because whether it works depends on implementation details. Here is what is going on, based on the MPI standard (see section 3.4, Communication Modes, for full details). A send can use one of four communication modes.&lt;/P&gt;

&lt;UL&gt;
	&lt;LI&gt;Standard. MPI_Send, MPI_Isend. This lets the implementation decide whether the outgoing message is buffered, so the send may behave like a buffered send or like a synchronous one. I'll need to confirm with our developers, but in every instance I've watched, the Intel® MPI Library has behaved as Buffered.&lt;/LI&gt;
	&lt;LI&gt;Buffered. MPI_Bsend, MPI_Ibsend. In this mode, the send is allowed to start at any time, and completes as soon as the data has been copied to a buffer, whether or not a matching receive has been posted.&lt;/LI&gt;
	&lt;LI&gt;Synchronous. MPI_Ssend, MPI_Issend. In this mode, the send is allowed to start at any time, but cannot complete until a matching receive has been posted and has started to receive the message.&lt;/LI&gt;
	&lt;LI&gt;Ready. MPI_Rsend, MPI_Irsend. In this mode, the send may be started only if the matching receive has already been posted; otherwise the program is incorrect.&lt;/LI&gt;
&lt;/UL&gt;

&lt;P&gt;In your program, MPI_Isend is able to complete as soon as the data is in a buffer. MPI_Waitany can therefore return either request, and here the MPI_Isend is the first one detected as complete. The MPI_Irecv is then cancelled, so the MPI_Ssend on rank 0 no longer has a matching receive and can never complete. The data from the MPI_Isend stays in the buffer, because rank 0 never posts a receive for it, and the program hangs.&lt;/P&gt;

&lt;P&gt;When you switch to MPI_Issend, the send cannot complete until a matching receive is posted, and rank 0 never posts one. MPI_Waitany will therefore always return the MPI_Irecv, which matches the MPI_Ssend and lets it complete. The program then cancels the MPI_Issend and is able to finish.&lt;/P&gt;
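
&lt;P&gt;For illustration, here is a minimal sketch of the MPI_Issend variant that also checks whether the cancellation actually took effect. The size check, the "other" index, and the call to MPI_Test_cancelled are additions for this example and are not part of the original program. It assumes the job is started with at least two ranks.&lt;/P&gt;

&lt;PRE class="brush:cpp;"&gt;#include &amp;lt;stdio.h&amp;gt;
#include "mpi.h"

int main(int argc, char* argv[]) {
    int rank, size;

    MPI_Init(&amp;amp;argc, &amp;amp;argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &amp;amp;rank);
    MPI_Comm_size(MPI_COMM_WORLD, &amp;amp;size);

    if (size &amp;lt; 2) {
        /* Assumption for this sketch: two ranks are needed for the exchange. */
        printf("run this example with at least 2 ranks\n");
    } else if (rank == 0) {
        printf("rank 0: about to send\n");
        MPI_Ssend(NULL, 0, MPI_INT, 1, 0, MPI_COMM_WORLD);
        printf("rank 0: send completed\n");
    } else if (rank == 1) {
        MPI_Request req[2];
        MPI_Status status;
        int which, other, cancelled;

        /* Synchronous-mode send: cannot complete before a matching receive is posted. */
        MPI_Issend(NULL, 0, MPI_INT, 0, 0, MPI_COMM_WORLD, &amp;amp;req[0]);
        MPI_Irecv(NULL, 0, MPI_INT, 0, 0, MPI_COMM_WORLD, &amp;amp;req[1]);

        MPI_Waitany(2, req, &amp;amp;which, MPI_STATUS_IGNORE);

        /* Cancel the request that is still pending, then wait for it. */
        other = 1 - which;
        MPI_Cancel(&amp;amp;req[other]);
        MPI_Wait(&amp;amp;req[other], &amp;amp;status);

        /* Report whether the pending request was cancelled or completed normally. */
        MPI_Test_cancelled(&amp;amp;status, &amp;amp;cancelled);
        printf("rank 1: request %d completed first; request %d %s\n",
               which, other,
               cancelled ? "was cancelled" : "completed before the cancel took effect");
    }

    MPI_Finalize();
    return 0;
}
&lt;/PRE&gt;

&lt;P&gt;Keep in mind that cancelling send requests was deprecated in MPI 4.0 and may not be supported by every implementation, so it is worth checking this pattern against the MPI version you target.&lt;/P&gt;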

&lt;P&gt;Does this make sense?&lt;/P&gt;

&lt;P&gt;James.&lt;/P&gt;</description>
      <pubDate>Tue, 08 Jul 2014 14:30:47 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/Incorrect-program-or-MPI-implementation-bug/m-p/1055990#M4452</guid>
      <dc:creator>James_T_Intel</dc:creator>
      <dc:date>2014-07-08T14:30:47Z</dc:date>
    </item>
    <item>
      <title>Hi James,</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/Incorrect-program-or-MPI-implementation-bug/m-p/1055991#M4453</link>
      <description>&lt;P&gt;Hi James,&lt;/P&gt;

&lt;P&gt;It makes total sense - in fact, it's just what I thought. Still, it's great to have the confirmation that, with MPI_Issend, the program is correct.&lt;BR /&gt;
	Many thanks for your detailed answer!&lt;/P&gt;

&lt;P&gt;Regards,&lt;BR /&gt;
	Adrian&lt;/P&gt;</description>
      <pubDate>Tue, 08 Jul 2014 15:31:05 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/Incorrect-program-or-MPI-implementation-bug/m-p/1055991#M4453</guid>
      <dc:creator>Adrian_I_</dc:creator>
      <dc:date>2014-07-08T15:31:05Z</dc:date>
    </item>
  </channel>
</rss>

