Intel® MPI Library
Get help with building, analyzing, optimizing, and scaling high-performance computing (HPC) applications.

simple MPI code generating deadlock

conor_p_
Beginner

Hello everyone,

I hope this is the appropriate forum for this question. I have recently started learning MPI, and I can't figure out why the following code generates a deadlock; the hang occurs in subroutine try_comm. I compiled and ran it as follows:

mpiifort global.f90 try.f90 new.f90 -o new.out

mpirun -n 2 ./new.out

My output:

hello from rank 0 2

entering 0 0

waiting for receive from rank 1  1

entering 1 0

after send

finished 1 0
And here is the code (global.f90, try.f90, and new.f90):

module global

  implicit none
  integer :: size,rank
  integer,allocatable :: p2n(:),n2p(:)
end module global
module try
  use mpi
  use global
  implicit none
contains
  
  subroutine try_comm(loop,n0)
    implicit none
    integer :: n0,loop
    integer :: ierr,msgtag
    integer :: n0temp
    integer :: status(MPI_STATUS_SIZE)
    

    call mpi_barrier(MPI_COMM_WORLD,ierr)

    print*,'entering',rank,loop
    if(rank.ne.0)then
       call mpi_send(n0,1,mpi_int,0,msgtag,MPI_COMM_WORLD,ierr)
       print*,'after send'
    endif
    
    if(rank.eq.0)then
       do loop = 1,size-1
          print*,'waiting for receive from rank',loop,size-1
          call mpi_recv(n0temp,1,mpi_int,loop,msgtag,MPI_COMM_WORLD,status,ierr)
          n0 = n0temp
          print*,'received 0:',n0
       enddo
       
    endif

    print*,'finished',rank,loop

    call mpi_barrier(MPI_COMM_WORLD,ierr)
    
  end subroutine try_comm
  
end module try
program new
  use mpi
  use global
  use try
  implicit none
  integer :: n0
  integer :: ierr,msgtag
  integer :: loop
  integer :: n0temp
  integer :: status(MPI_STATUS_SIZE)
  
  call mpi_init(ierr)
  call mpi_comm_size(MPI_COMM_WORLD,size,ierr)
  call mpi_comm_rank(MPI_COMM_WORLD,rank,ierr)
  
  
  if(rank.eq.0)then
     print*,'hello from rank',rank,size
     allocate(p2n(0:size-1),n2p(0:size-1))
  endif
  
  n0 = rank*10

 
  do loop = 1,20
     call try_comm(loop,n0)
  enddo

  call mpi_finalize(ierr)

  stop
end program new

 

However, if I change new.f90 to the following, where the body of subroutine try_comm is written directly inside the do loop instead of being called, I do not get a deadlock.

program new
  use mpi
  use global
  use try
  implicit none
  integer :: n0
  integer :: ierr,msgtag
  integer :: loop,loopa
  integer :: n0temp
  integer :: status(MPI_STATUS_SIZE)
  
  call mpi_init(ierr)
  call mpi_comm_size(MPI_COMM_WORLD,size,ierr)
  call mpi_comm_rank(MPI_COMM_WORLD,rank,ierr)
  
  
  if(rank.eq.0)then
     print*,'hello from rank',rank,size
     allocate(p2n(0:size-1),n2p(0:size-1))
  endif
  
  n0 = rank*10


  do loopa = 1,20
     !call try_comm(loop,n0)

     call mpi_barrier(MPI_COMM_WORLD,ierr)
     
     print*,'entering',rank,loopa
     if(rank.ne.0)then
        call mpi_send(n0,1,mpi_int,0,msgtag,MPI_COMM_WORLD,ierr)
        print*,'after send'
     endif
     
     if(rank.eq.0)then
        do loop = 1,size-1
           print*,'waiting for receive from rank',loop,size-1
           call mpi_recv(n0temp,1,mpi_int,loop,msgtag,MPI_COMM_WORLD,status,ierr)           
           n0 = n0temp
           print*,'received 0:',n0
        enddo
        
     endif
     
     print*,'finished',rank,loopa
     call mpi_barrier(MPI_COMM_WORLD,ierr)

     
  enddo


  call mpi_finalize(ierr)

  stop
end program new

 

3 Replies
Michael_R_2
Beginner

I see that you use an uninitialized value for msgtag. It is an input argument for both MPI_SEND and MPI_RECV.

Thus the actual value of msgtag is random, and it might differ between the two code variants. What happens if the sender and the receiver end up with different tag values? The blocking receive will never match the send.

So make sure that you do not use uninitialized values.

By the way, I would not use a variable named size, because size is also an intrinsic function in Fortran.
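
For example, something along these lines, a minimal sketch with an explicit tag and the process count stored in a variable called nprocs instead of size (not your exact program), behaves the same no matter how the send/receive pair is wrapped:

program tag_example
  use mpi
  implicit none
  integer, parameter :: msgtag = 1                 ! explicit tag, identical on sender and receiver
  integer :: ierr,rank,nprocs,loop
  integer :: n0,n0temp
  integer :: status(MPI_STATUS_SIZE)

  call mpi_init(ierr)
  call mpi_comm_size(MPI_COMM_WORLD,nprocs,ierr)   ! nprocs instead of size (size is an intrinsic)
  call mpi_comm_rank(MPI_COMM_WORLD,rank,ierr)

  n0 = rank*10

  if(rank.ne.0)then
     ! every non-zero rank sends its value to rank 0 with the agreed tag
     call mpi_send(n0,1,MPI_INTEGER,0,msgtag,MPI_COMM_WORLD,ierr)
  else
     do loop = 1,nprocs-1
        ! rank 0 receives from each rank in turn, expecting the same tag
        call mpi_recv(n0temp,1,MPI_INTEGER,loop,msgtag,MPI_COMM_WORLD,status,ierr)
        print*,'rank 0 received',n0temp,'from rank',loop
     enddo
  endif

  call mpi_finalize(ierr)
end program tag_example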

Greetings Michael R.

diedro
Beginner

dear Conor,

I suggest using only two processes, one to send and one to receive. That makes it easier to catch the error.

Cheers,

Diego

 

 

conor_p_
Beginner

Thanks guys! That first response actually answered the question for me. For some reason my understanding after reading a couple of MPI examples was that MPI handled msgtag behind the scenes, but evidently that is very, very wrong. Thanks again.
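
In case it helps anyone else who trips over this, the fix is essentially just giving the tag a defined value that both sides use (and switching to MPI_INTEGER, the standard Fortran datatype). Roughly, the subroutine in module try now looks like this (a sketch, not my exact code; it also uses a local index for the receive loop instead of reusing the dummy argument loop):

  subroutine try_comm(loop,n0)
    implicit none
    integer :: loop,n0
    integer :: ierr,src
    integer, parameter :: msgtag = 1   ! defined tag, identical on sender and receiver
    integer :: n0temp
    integer :: status(MPI_STATUS_SIZE)

    call mpi_barrier(MPI_COMM_WORLD,ierr)

    print*,'entering',rank,loop
    if(rank.ne.0)then
       call mpi_send(n0,1,MPI_INTEGER,0,msgtag,MPI_COMM_WORLD,ierr)
    endif

    if(rank.eq.0)then
       ! local index src, so the caller's loop counter is not modified
       do src = 1,size-1
          ! the receiver could instead use MPI_ANY_TAG to accept any tag value
          call mpi_recv(n0temp,1,MPI_INTEGER,src,msgtag,MPI_COMM_WORLD,status,ierr)
          n0 = n0temp
       enddo
    endif

    print*,'finished',rank,loop
    call mpi_barrier(MPI_COMM_WORLD,ierr)

  end subroutine try_comm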
