Intel® MPI Library
Get help with building, analyzing, optimizing, and scaling high-performance computing (HPC) applications.

MPI_Recv blocks for a long time

wang_b_1
Beginner

hello:
    I have run into trouble using MPI_Recv in my program.
    My program starts 3 subprocesses and binds them to CPUs 1-3 respectively. Each subprocess first disables interrupts, then sends a message to every other process and receives one from each of them. This is repeated about a billion times.
    I expect MPI_Recv to return within a fixed time, without switching to MPI_Irecv instead.
    To achieve that, I disabled interrupts and cancelled the ticks on CPUs 1-3, moved all other processes from CPUs 1-3 to CPU 0, and bound the interrupts to CPU 0.
    But I found that very rarely (roughly once in a billion calls) MPI_Recv blocks for more than 600 ms, whereas normally it takes less than 10 ms.
    I don't know why MPI_Recv sometimes blocks for so long. Is there any method to find the reason and solve the problem?
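    This is roughly how the per-call time is measured inside the receive loop (a minimal sketch using MPI_Wtime; the 0.1 s threshold and the fprintf logging are just illustrative, not part of the real program):

double t0, t1;

t0 = MPI_Wtime();
MPI_Recv(buf22, EMT_COMM_NUM, MPI_DOUBLE_PRECISION, i,
         ProcInfo.Id+1, ProcInfo.CommEMTCAL, &MpiStt);
t1 = MPI_Wtime();
/* Log the rare slow calls; the 0.1 s threshold is arbitrary. */
if (t1 - t0 > 0.1)
    fprintf(stderr, "rank %d: MPI_Recv from %d took %.3f s\n",
            ProcInfo.Id, i, t1 - t0);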


   The program is launched with mpirun -n 3, using Hydra and the shared-memory fabric.
   Environment: parallel_studio_xe_2015_update2, Linux 3.10
=====  Processor composition  =====
Processor name    : Intel(R) Core(TM) i5-4590  
Packages(sockets) : 1
Cores             : 4
Processors(CPUs)  : 4
Cores per package : 4
Threads per core  : 1

void emt_comm()
{
    ......
    /* Send one message to every other rank (tag = receiver's rank + 1). */
    for (i=0; i<ProcInfo.NumProc; i++)
    {
        if (i != ProcInfo.Id)
            MPI_Send(SendBuf, EMT_COMM_NUM, MPI_DOUBLE_PRECISION, i, i+1, ProcInfo.CommEMTCAL);
    }

    /* Then receive one message from every other rank (tag = own rank + 1). */
    for (i=0; i<ProcInfo.NumProc; i++)
    {
        if (i != ProcInfo.Id)
            MPI_Recv(buf22, EMT_COMM_NUM, MPI_DOUBLE_PRECISION, i, ProcInfo.Id+1, ProcInfo.CommEMTCAL, &MpiStt);
    }
}

void *thread_emt(__attribute__((unused)) void *arg)
{
    ......
    /* Pin this thread to its dedicated core and sync all ranks. */
    set_thread_affinity(core_id);
    MPI_Barrier(ProcInfo.CommCAL);
    disabled_inter();    /* disable interrupts on this core */
    for(step=1; step<=10000000; step++)
    {
        emt_comm();
        MPI_Barrier(ProcInfo.CommCAL);
    }
    open_inter();        /* re-enable interrupts */
    return NULL;
}
int main(int argc, char *argv[])
{
    ......
    isCalculationOver = 0;
    set_thread_affinity(0);    /* main thread stays on CPU 0 */
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &ProcInfo.Id);
    MPI_Comm_size(MPI_COMM_WORLD, &ProcInfo.NumProc);
    core = ProcInfo.Id+1;      /* communication thread runs on CPU rank+1 */
    MPI_Barrier(MPI_COMM_WORLD);
    ......
    pthread_create(&thread, NULL, thread_emt, &core);
    ......
    /* Wait until the communication thread signals completion. */
    while(1 != isCalculationOver)
        usleep(100*1000);

    MPI_Barrier(MPI_COMM_WORLD);
    MPI_Finalize();

    return 0;
}


 

2 Replies
Victor_E_1
Beginner

Your code is semantically deadlocked: every rank starts with a send. Since this is a blocking send, no rank progresses to the receives that follow, and you have a deadlock.

The reason your code actually works most of the time is that small messages are usually sent eagerly, without going through the rendez-vous protocol. However, that is at the mercy of the network and of the availability of buffer space on the NIC. Since you are repeating this exchange a huge number of times, I imagine that every once in a while this "eager" send does not succeed.
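For illustration, here is a minimal sketch of the usual fix, reusing the names from your code (the fixed request-array bound of 16 is just an assumption for the sketch): post the sends as non-blocking MPI_Isend, do the blocking receives, and only then complete the sends.

void emt_comm(void)
{
    MPI_Request reqs[16];    /* sized for illustration; assumes few ranks */
    int i, nreq = 0;

    /* Non-blocking sends cannot block on a missing receiver. */
    for (i = 0; i < ProcInfo.NumProc; i++)
        if (i != ProcInfo.Id)
            MPI_Isend(SendBuf, EMT_COMM_NUM, MPI_DOUBLE_PRECISION,
                      i, i+1, ProcInfo.CommEMTCAL, &reqs[nreq++]);

    for (i = 0; i < ProcInfo.NumProc; i++)
        if (i != ProcInfo.Id)
            MPI_Recv(buf22, EMT_COMM_NUM, MPI_DOUBLE_PRECISION,
                     i, ProcInfo.Id+1, ProcInfo.CommEMTCAL, MPI_STATUS_IGNORE);

    /* SendBuf must not be modified before the sends complete. */
    MPI_Waitall(nreq, reqs, MPI_STATUSES_IGNORE);
}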

 

Victor.

 

James_T_Intel
Moderator

To expand on what Victor said: how MPI_Send (and MPI_Recv) is handled internally is up to the implementation, and there are multiple valid strategies. I would highly recommend switching to an explicitly non-blocking MPI_Isend instead, as in the sketch above.

Also, are you intending to use multiple threads for MPI calls? If so, you should use MPI_Init_thread instead of MPI_Init.
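A minimal sketch of what that change looks like in your main (since both the main thread and thread_emt make MPI calls in the posted code, MPI_THREAD_MULTIPLE is the level to request; the fprintf/MPI_Abort fallback is just illustrative):

int provided;

/* Replaces the plain MPI_Init call; needs <stdio.h> for the logging. */
MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
if (provided < MPI_THREAD_MULTIPLE) {
    /* The library cannot support concurrent MPI calls from several threads. */
    fprintf(stderr, "requested MPI_THREAD_MULTIPLE, got level %d\n", provided);
    MPI_Abort(MPI_COMM_WORLD, 1);
}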
