pmpi_f08 doesn't support mpi_f08 interface


My team is working on some code that makes use of the mpi_f08 Fortran interface, and needs to profile it to identify where to prioritise optimisation. Since ITAC produces no output when run on code compiled with `use mpi_f08`, it is necessary to `use pmpi_f08` instead. However, the Intel implementation of this seems to be broken, not providing any of the `mpi_f08` functionality.

For example, this no-op program segfaults:

program test_pmpi_f08
  use pmpi_f08
  implicit none

  call MPI_Init

  call MPI_Finalize

end program test_pmpi_f08

But adding the supposedly optional `ierr` argument to `MPI_Init` allows it to run, as does swapping `pmpi_f08` for `mpi_f08`.

This second program gives the wrong result for the reduction, and then segfaults:

program test_pmpi_f08_collectives
  use pmpi_f08
  implicit none

  integer :: ierr, i, ip, np

  call MPI_Init(ierr)
  call MPI_Comm_Rank(MPI_Comm_World, ip, ierr)
  call MPI_Comm_Size(MPI_Comm_World, np, ierr)

  i = 1
  call MPI_AllReduce(MPI_In_Place, i, 1, MPI_Integer, MPI_Sum, MPI_Comm_World, ierr)

  if (ip .eq. 0) then
     if (i .eq. np) then
        print *, "Reduction succeeded"
     else
        print *, "FAILED:", i, "!=", np
     end if
  end if

  call MPI_Finalize

end program test_pmpi_f08_collectives

As with the previous example, this runs fine if `mpi_f08` is used instead of `pmpi_f08`. Removing `MPI_In_Place` and using a separate send buffer instead also works, even with `pmpi_f08`.
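
For reference, here is a minimal sketch of the separate-buffer variant described above. The variable names `isend` and `irecv` are illustrative only. Passing `ierr` to `MPI_Finalize` is an assumption based on the `MPI_Init` behaviour in the first example, not something verified on its own.

```fortran
program test_pmpi_f08_separate_buffer
  use pmpi_f08
  implicit none

  ! isend/irecv are illustrative names for the separate send and receive buffers
  integer :: ierr, isend, irecv, ip, np

  call MPI_Init(ierr)
  call MPI_Comm_Rank(MPI_Comm_World, ip, ierr)
  call MPI_Comm_Size(MPI_Comm_World, np, ierr)

  ! Each rank contributes 1, so the sum should equal the number of ranks
  isend = 1
  call MPI_AllReduce(isend, irecv, 1, MPI_Integer, MPI_Sum, MPI_Comm_World, ierr)

  if (ip .eq. 0) then
     if (irecv .eq. np) then
        print *, "Reduction succeeded"
     else
        print *, "FAILED:", irecv, "!=", np
     end if
  end if

  ! ierr passed explicitly here as a precaution (assumption, see above)
  call MPI_Finalize(ierr)

end program test_pmpi_f08_separate_buffer
```

The only functional difference from the failing program is that the input comes from `isend` rather than `MPI_In_Place`, which suggests the problem lies in how `pmpi_f08` handles the `MPI_In_Place` sentinel.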

This looks like a bug, as if the `pmpi_f08` implementation falls back on the `use mpi` interface rather than the `mpi_f08` one. Obviously, if we have to rewrite all of our code in `use mpi` style in order to profile it, there's little point in writing it with `mpi_f08` syntax to begin with. Has anyone else encountered this?



