Intel® Fortran Compiler

## MPI runtime error on repeated "form team" statements

Beginner

I am trying to distribute coarray images across teams according to a condition that changes at runtime. To do this, I invoke "form team" repeatedly. However, this results in an MPI runtime error after about 200 to 2000 invocations, depending on the exact configuration.

My best guess is that there is a communicator leak involved here. I have included a minimal working example, which obviously requires coarray support to be enabled. Maybe this is a coding error and there is a better way to switch images across teams?

I have also attached the output from running the below example with and without the "change team" block.

program fortran_team_debug

  use iso_fortran_env, only: input_unit, output_unit, error_unit, team_type
  implicit none

  integer, parameter :: team_1 = 1
  integer, parameter :: team_2 = 2
  type(team_type) :: main_team
  integer :: team_num
  integer :: step
  integer :: nstep = 100000000
  integer :: seed_size
  integer, allocatable :: seed(:)

  ! Ensure differing random seeds across images.
  ! The PUT array must have the size reported by RANDOM_SEED(size = ...).
  call RANDOM_SEED(size = seed_size)
  allocate(seed(seed_size))
  seed = 2390598 + this_image()
  call RANDOM_SEED(put = seed)

  ! Main loop
  do step = 1, nstep
    ! Team number is assigned according to condition evaluated at runtime
    team_num = assign_team()
    form team(team_num, main_team)
    change team(main_team)
      ! Output number of images in each team
      if(this_image() == 1) then
        write (*, '(a)', advance = 'no') 'Team '
        write (*, '(i0)', advance = 'no') team_number()
        write (*, '(a)', advance = 'no') ' containing '
        write (*, '(i0)', advance = 'no') num_images()
        write (*, '(a)', advance = 'no') ' images at step '
        write (*, '(i0)', advance = 'yes') step
      end if
    end team
  end do

contains

  ! Randomly assign images to team 1 or 2
  integer function assign_team()
    real :: rand
    call RANDOM_NUMBER(rand)
    if(rand > 0.5) then
      assign_team = team_1
    else
      assign_team = team_2
    end if
  end function
end program fortran_team_debug


9 Replies
Moderator

What version of the compiler are you using?

What is the OS?

Beginner

Thanks for your reply. I've tried it on two different machines, both using ifort version 2021.5.0 Build 20211109_000000:

• Windows 10 (10.0.19044) on an Intel Core i9-9900k
• CentOS Linux (kernel-3.10.0-1127.8.2.el7.x86_64) on 2 Intel Xeon Gold 6240

The output above is from the Windows machine, but it's practically identical for the Linux machine. Please let me know if you need any more information.

Moderator

Thank you. Just to be sure I am duplicating your problem correctly, I have some more questions.

What compiler options are you using?

How many images?

What did you set to get that MPI output?

Beginner

Here are the compiler flags, using 8 images for the Windows machine (pretty much the default x64 Debug profile for Visual Studio 2019 with coarray options added):

/nologo /debug:full /Od /Qcoarray:shared /Qcoarray-config-file:"mpi_config.txt" /Qcoarray-num-images:8 /warn:interfaces /module:"x64\Debug\\" /object:"x64\Debug\\" /Fd"x64\Debug\vc160.pdb" /traceback /check:bounds /check:stack /libs:dll /threads /dbglibs /c

And here is the MPI configuration file, mpi_config.txt:

-genvall -genv I_MPI_DEBUG=5 -genv I_MPI_FABRICS=shm -genv I_MPI_SILENT_ABORT=0 -genv I_MPI_FAULT_CONTINUE=0 C:\path\to\executable.exe

Thank you for looking into it.

Moderator

Thanks for the additional information. I get the same failure. I filed a bug report, CMPLRLIBS-33803. I'll let you know when it's fixed.

Beginner

Thank you very much! I'll make sure to accept it as a solution as soon as it's fixed.

Novice

I don’t think it’s a bug in the underlying MPI or the compilers here. I am getting similar runtime failures with ifort, with OpenCoarrays/gfortran using Intel oneAPI MPI, and with OpenCoarrays/gfortran using MPICH. The error message from MPICH is more descriptive: ‘Too many communicators’.

Your code example is probably not how coarray teams should be applied, especially the loop. I will try to explain this briefly.

It is important to understand the underlying APGAS model. Coarray Fortran implements the PGAS model at two levels, SPMD and APGAS. With FORM TEAM and CHANGE TEAM we control the execution flow and data allocations at the APGAS level. The APGAS model is an extension of the PGAS model that allows for parallel programming on heterogeneous hardware with different types of accelerators. It was originally developed at IBM and elsewhere, and has also been described with respect to Coarray Fortran.

In PGAS programming it is the programmer’s job to minimize the PGAS cost function. With respect to execution flow as well as allocations at the APGAS level, the programmer should minimize usage of FORM/CHANGE TEAM in favor of an execution flow (especially loops) at the SPMD level as much as possible:

program Main
  use, intrinsic :: ISO_FORTRAN_ENV, only: team_type
  implicit none

  ! enum for coarray team handling:
  type :: TeamNumbers_EnumDef
    ! with 4 heterogeneous accelerators:
    integer :: Nvidia_GPU = 1
    integer :: Intel_GPU = 2
    integer :: AMD_FPGA = 3
    integer :: Intel_CSA = 4
    integer :: RemainingImages = 5
  end type TeamNumbers_EnumDef
  ! enum type:
  type (TeamNumbers_EnumDef), parameter :: enum_TeamNumber &
                                  = TeamNumbers_EnumDef ()

  integer :: i_NumberOfTeams, i_NumberOfImagesPerTeam
  integer :: i_UnusedImages, i_TeamNumber
  type (team_type) :: BaseTeam

  i_NumberOfTeams = 4
  if (num_images() < 4) error stop
  i_NumberOfImagesPerTeam = num_images() / i_NumberOfTeams
  i_UnusedImages = mod(num_images(), i_NumberOfTeams) ! these images are not used

  ! split the available images into child teams:
  if (this_image() <= i_NumberOfImagesPerTeam) then
    i_TeamNumber = enum_TeamNumber % Nvidia_GPU
  else if ((this_image() > i_NumberOfImagesPerTeam) .and. &
           (this_image() <= (i_NumberOfImagesPerTeam * 2))) then
    i_TeamNumber = enum_TeamNumber % Intel_GPU
  else if ((this_image() > i_NumberOfImagesPerTeam * 2) .and. &
           (this_image() <= (i_NumberOfImagesPerTeam * 3))) then
    i_TeamNumber = enum_TeamNumber % AMD_FPGA
  else if ((this_image() > i_NumberOfImagesPerTeam * 3) .and. &
           (this_image() <= (i_NumberOfImagesPerTeam * 4))) then
    i_TeamNumber = enum_TeamNumber % Intel_CSA
  else
    i_TeamNumber = enum_TeamNumber % RemainingImages
  end if

  form team (i_TeamNumber, BaseTeam)
  change team (BaseTeam)
    ! APGAS level execution control:
    BaseTeam_select: select case (team_number())
    case (enum_TeamNumber % Nvidia_GPU)
      if (this_image() == 1) write(*,*)'in team',team_number()
    case (enum_TeamNumber % Intel_GPU)
      if (this_image() == 1) write(*,*)'in team',team_number()
    case (enum_TeamNumber % AMD_FPGA)
      if (this_image() == 1) write(*,*)'in team',team_number()
    case (enum_TeamNumber % Intel_CSA)
      if (this_image() == 1) write(*,*)'in team',team_number()
    case default
      ! unused images
      if (this_image() == 1) write(*,*)'in team',team_number()
    end select BaseTeam_select

  end team

end program Main

Beginner

Thank you for your explanation. My code example was meant as a quick and simple way to reproduce the error. As you say, I would also consider it bad practice to re-form the teams in every iteration of the loop. In my actual application, part of my calculation gets simpler as the program proceeds, so it requires fewer images and leaves the remaining images free to do other work (I use the teams mostly to restrict participation in collective subroutines). Think of it as trying to re-form the teams every hour or so of a multi-day calculation.

However, I think you would agree that an analogous example, one that repeatedly allocates and deallocates an allocatable variable in a loop, should work even though it might not be good coding practice. I know little about MPI programming (which is why coarrays and teams are convenient for me), but as far as I know there is a way to free communicators that are no longer in use. Is it not possible to free or replace the communicators used by the teams?

Although this might not be a bug, I would prefer the compiler to throw a warning (if this is indeed a restriction of the current standard) or produce a more easily understandable error message at runtime.
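For readers who need to re-form teams only occasionally, one possible way to lower the number of "form team" invocations is to execute the statement only when at least one image's assignment has actually changed. The following is a minimal sketch under stated assumptions, not a fix for the runtime failure discussed in this thread: `pick_team`, the 250-step rule inside it, and all variable names are hypothetical, and whether skipping redundant calls avoids the error depends on how the runtime manages its resources, which this thread does not establish. Because FORM TEAM is collective, the images agree on the decision through `co_max` before any of them executes it.

```fortran
program reform_when_needed
  use, intrinsic :: iso_fortran_env, only: team_type
  implicit none

  type(team_type) :: work_team
  integer, parameter :: nstep = 1000
  integer :: step, new_team, old_team, changed, nforms

  old_team = 0   ! never a team number used below, so step 1 always forms teams
  nforms = 0

  do step = 1, nstep
    new_team = pick_team(step)   ! hypothetical runtime condition

    ! FORM TEAM is collective: all images must take the same branch,
    ! so the per-image "changed" flags are reduced with co_max first.
    changed = merge(1, 0, new_team /= old_team)
    call co_max(changed)

    if (changed > 0) then
      form team (new_team, work_team)
      old_team = new_team
      nforms = nforms + 1
    end if

    change team (work_team)
      ! ... work restricted to the current team, e.g. collectives ...
    end team
  end do

  if (this_image() == 1) then
    write (*, '(a,i0,a,i0,a)') 'FORM TEAM executed ', nforms, ' times in ', nstep, ' steps'
  end if

contains

  ! Hypothetical rule: team 1 shrinks every 250 steps, freeing images for team 2.
  integer function pick_team(istep)
    integer, intent(in) :: istep
    integer :: cutoff
    cutoff = max(1, num_images() / (1 + istep / 250))
    pick_team = merge(1, 2, this_image() <= cutoff)
  end function pick_team

end program reform_when_needed
```

The extra `co_max` adds one collective per step, which is the price for keeping all images in agreement about when to re-form.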

Novice

My mistake: ‘data allocations’ should be ‘coarray allocations’. More precisely, anything that involves collective blocking synchronization at the APGAS execution level (i.e. CHANGE TEAM, END TEAM, reallocating or newly allocating coarrays across teams) should be avoided as much as possible, because it pauses or delays further execution on all involved accelerators. In effect, that means reducing fork-join at the APGAS level (see the APGAS paper) as much as possible.

The (simple) solution is to reuse the already allocated coarrays (across teams but also within teams) for different tasks at the SPMD level and also to change tasks at the SPMD level without changing teams at the APGAS level.
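The idea of changing tasks at the SPMD level without changing teams could be sketched roughly as follows. This is an illustrative sketch only: the task identifiers, the rule that selects a task, and the two-team split are assumptions made for the example and do not come from the posts above. Teams are formed once before the loop, and each image then picks its task inside a single CHANGE TEAM block.

```fortran
program tasks_within_fixed_teams
  use, intrinsic :: iso_fortran_env, only: team_type
  implicit none

  type(team_type) :: fixed_team
  integer, parameter :: nstep = 1000
  integer, parameter :: task_heavy = 1, task_light = 2   ! hypothetical task ids
  integer :: step, task, heavy_count, light_count

  heavy_count = 0
  light_count = 0

  ! Teams are formed exactly once: odd images in team 1, even images in team 2.
  form team (1 + mod(this_image() - 1, 2), fixed_team)

  change team (fixed_team)
    do step = 1, nstep
      ! Task selection is local to each image; no team statement is involved.
      ! The selection rule is a placeholder for a real runtime condition.
      task = merge(task_heavy, task_light, mod(step + this_image(), 3) /= 0)

      select case (task)
      case (task_heavy)
        heavy_count = heavy_count + 1   ! ... heavy work would go here ...
      case (task_light)
        light_count = light_count + 1   ! ... other work would go here ...
      end select
    end do

    if (this_image() == 1) then
      write (*, '(a,i0,a,i0,a,i0)') 'team ', team_number(), ': heavy steps = ', &
        heavy_count, ', light steps = ', light_count
    end if
  end team

end program tasks_within_fixed_teams
```

Note that this pattern does not by itself restrict collective subroutines to a changing subset of images, since a collective called inside the block still involves the whole fixed team.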

I am already working on this with some success, but high performance requires some effort: a new type of parallel programming model, a new type of non-blocking synchronization method, and a new type of channel-coroutine system that integrates both the SPMD and the APGAS levels, to allow reuse of the allocated coarrays and to further reduce the PGAS cost function for data transfers across teams (accelerators). Of course, this is still at an early stage, and so far I am only preparing for heterogeneous CAF programming.