Intel® MPI Library
Get help with building, analyzing, optimizing, and scaling high-performance computing (HPC) applications.

Simple mpirun example SEGFAULTs using intel/oneapi-hpckit:2022.1.1-devel-ubuntu18.04

EugeneW403

I'm trying to run a simple MPI Hello World example inside a Docker container and am getting a segmentation fault.

 

I'm using the Docker container image:

intel/oneapi-hpckit:2022.1.1-devel-ubuntu18.04
https://hub.docker.com/r/intel/oneapi-hpckit

 

$> docker run -it --rm intel/oneapi-hpckit:2022.1.1-devel-ubuntu18.04

root@c1ea2a0c8961:/# cat <<EOF >hello.c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char** argv) {
    // Initialize the MPI environment
    MPI_Init(NULL, NULL);

    // Get the number of processes
    int world_size;
    MPI_Comm_size(MPI_COMM_WORLD, &world_size);

    // Get the rank of the process
    int world_rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);

    // Get the name of the processor
    char processor_name[MPI_MAX_PROCESSOR_NAME];
    int name_len;
    MPI_Get_processor_name(processor_name, &name_len);

    // Print off a hello world message
    printf("Hello world from processor %s, rank %d out of %d processors\n",
           processor_name, world_rank, world_size);

    // Finalize the MPI environment.
    MPI_Finalize();
}
EOF

root@c1ea2a0c8961:/# which mpicc
/opt/intel/oneapi/mpi/2021.5.0//bin/mpicc

root@c1ea2a0c8961:/# mpicc hello.c -o hello
root@c1ea2a0c8961:/# export I_MPI_DEBUG=5
root@c1ea2a0c8961:/# mpirun -n 1 ./hello
[0] MPI startup(): Intel(R) MPI Library, Version 2021.5 Build 20211102 (id: 9279b7d62)
[0] MPI startup(): Copyright (C) 2003-2021 Intel Corporation. All rights reserved.
[0] MPI startup(): library kind: release
[0] MPI startup(): libfabric version: 1.13.2rc1-impi
libfabric:259:core:core:ofi_hmem_init():209<info> Hmem iface FI_HMEM_CUDA not supported
libfabric:259:core:core:ofi_hmem_init():209<info> Hmem iface FI_HMEM_ROCR not supported
libfabric:259:core:core:ofi_hmem_init():214<warn> Failed to initialize hmem iface FI_HMEM_ZE: Input/output error
libfabric:259:core:mr:ofi_default_cache_size():78<info> default cache size=2815754389
libfabric:259:core:core:ofi_register_provider():474<info> registering provider: tcp (113.20)
libfabric:259:core:core:ofi_register_provider():474<info> registering provider: sockets (113.20)
libfabric:259:core:core:ofi_hmem_init():209<info> Hmem iface FI_HMEM_CUDA not supported
libfabric:259:core:core:ofi_hmem_init():209<info> Hmem iface FI_HMEM_ROCR not supported
libfabric:259:core:core:ofi_hmem_init():209<info> Hmem iface FI_HMEM_ZE not supported
libfabric:259:core:core:ofi_register_provider():474<info> registering provider: shm (113.20)
libfabric:259:core:core:ofi_hmem_init():209<info> Hmem iface FI_HMEM_CUDA not supported
libfabric:259:core:core:ofi_hmem_init():209<info> Hmem iface FI_HMEM_ROCR not supported

===================================================================================
= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= RANK 0 PID 259 RUNNING AT c1ea2a0c8961
= KILLED BY SIGNAL: 11 (Segmentation fault)
===================================================================================

 

 

EugeneW403

I was able to resolve this by setting I_MPI_FABRICS=shm and launching the container with increased shared memory:

docker run --shm-size=512m ...
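
Putting the two changes together, the sequence looks roughly like this (a minimal sketch: the container prompt is a placeholder, hello.c has to be recreated inside the new container as in the original post, and the I_MPI_FABRICS variable could equally be passed in with docker run -e):

$> docker run -it --rm --shm-size=512m intel/oneapi-hpckit:2022.1.1-devel-ubuntu18.04

root@<container>:/# export I_MPI_FABRICS=shm
root@<container>:/# mpicc hello.c -o hello
root@<container>:/# mpirun -n 1 ./hello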

 

Thank you!
