Intel® oneAPI HPC Toolkit
Get help with building, analyzing, optimizing, and scaling high-performance computing (HPC) applications.

Crash in MPI_COMM_SPAWN_MULTIPLE

Mark_C_

I am trying to manually spawn MPI processes. The code works fine when compiled with gfortran and linked against the OpenMPI libraries, but when I compile it with ifort and the Intel MPI libraries, it crashes inside the MPI_COMM_SPAWN_MULTIPLE subroutine. The relevant code is:

         ALLOCATE(commands(num_processes - 1))
         ALLOCATE(args(num_processes - 1,2))
         ALLOCATE(info(num_processes - 1))
         ALLOCATE(max_procs(num_processes - 1))
         ALLOCATE(error_array(num_processes - 1))

         commands = TRIM(context%cl_parser%command)

         args(:,1) = temp_string
         IF (num_threads .lt. 0) THEN
            args(:,2) = '-para=-1'
         ELSE IF (num_threads .lt. 10) THEN
            WRITE (temp_string, 1001) num_threads
            args(:,2) = TRIM(temp_string)
         ELSE
            WRITE (temp_string, 1002) num_threads
            args(:,2) = TRIM(temp_string)
         END IF

         max_procs = 1

         DO i = 2, num_processes
            CALL MPI_INFO_CREATE(info(i - 1), error)
            CALL MPI_INFO_SET(info(i - 1), "wdir", process_dir(i),             &
     &                        error)
         END DO

         CALL MPI_COMM_SPAWN_MULTIPLE(num_processes - 1, commands, args,       &
     &                                max_procs, info, 0,                      &
     &                                MPI_COMM_WORLD, child_comm,              &
     &                                error_array, error)

         CALL MPI_INTERCOMM_MERGE(child_comm, .false.,                         &
     &                            context%intra_comm, error)

         DO i = 1, num_processes - 1
            CALL MPI_INFO_FREE(info(i), error)
         END DO

         DEALLOCATE(info)
         DEALLOCATE(max_procs)
         DEALLOCATE(args)
         DEALLOCATE(commands)
         DEALLOCATE(error_array)

When run, this results in a segmentation fault inside MPI_COMM_SPAWN_MULTIPLE, with the following error:

Program received signal SIGSEGV, Segmentation fault.
0x00002aaaaadeb304 in pmpi_comm_spawn_multiple__ () from /usr/local/intel/impi/3.2.0.011/lib64/libmpiif.so.3.2

Is there something that I'm doing wrong here to cause this?
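
One thing that may be worth checking, offered as a possibility rather than a confirmed diagnosis: in the Fortran binding described by the MPI standard, the argument list for each command in array_of_argv ends with an all-blank string. The args array above is allocated with exactly two columns, both filled, so there is no terminator column. The sketch below shows only the array layout with one extra blank column; it makes no MPI calls, and num_processes, arg_len, num_args, and the '-child' argument are hypothetical placeholders, not values from the original code.

```fortran
program argv_layout_sketch
   implicit none
   ! Hypothetical values for illustration; the original post does not show them.
   integer, parameter :: num_processes = 4
   integer, parameter :: arg_len = 32
   integer, parameter :: num_args = 2
   character(len=arg_len), allocatable :: args(:,:)
   integer :: i, j

   ! One extra column holds the all-blank terminator for each command.
   allocate(args(num_processes - 1, num_args + 1))

   args(:,1) = '-child'         ! placeholder for the first real argument
   args(:,2) = '-para=-1'       ! second argument, as in the post
   args(:,num_args + 1) = ' '   ! blank string marks the end of each row

   ! Print every entry in brackets so the blank terminator is visible.
   do i = 1, num_processes - 1
      do j = 1, num_args + 1
         write (*, '(A,I0,A,I0,A,A,A)') 'args(', i, ',', j, ') = [',        &
            trim(args(i,j)), ']'
      end do
   end do

   deallocate(args)
end program argv_layout_sketch
```

If the missing terminator is the cause, allocating args(num_processes - 1, 3) and setting args(:,3) = ' ' before the spawn call would be the corresponding change in the code above. A different MPI library may tolerate the missing blank by chance, which would be consistent with the code running under OpenMPI, though that is a guess.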
