
Running MPI jobs from inside Singularity container with Intel MPI 2019.5

Ball__Keith

Hi All,

As per the recent webinar introducing the new Intel MPI 2019 Update 5 features, it is now in theory possible to include the Intel MPI libraries inside a Singularity container and call mpirun for a multi-node MPI job entirely from within the container, with no need to have Intel MPI installed outside it. So instead of launching an MPI job in a container using an external MPI stack, like so:

     mpirun -n <nprocs> -perhost <procs_per_node> -hosts <hostlist> singularity exec <container_name> <path_to_executable_inside_container>

one should now be able to do:

    singularity exec <container_name> mpirun -n <nprocs> -perhost <procs_per_node> -hosts <hostlist> <path_to_executable_inside_container>
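
To confirm which copy of mpirun each form actually resolves to, I check both on the host and inside the container (the image name here is just my test image; the exact path it resolves to inside the container may of course differ on other setups):

    # On the host: Intel MPI is not installed, so this should find nothing
    which mpirun
    # Inside the container: this should resolve to the Intel MPI 2019.5 copy
    singularity exec image.sif which mpirun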

I have the Intel MPI 2019.5 libraries (as well as the Intel run-time libraries for C++) plus libfabric inside my container, and the container sources the following at startup:

cat /.singularity.d/env/90-environment.sh 
#!/bin/sh
# Custom environment shell code should follow
    source /opt/intel/bin/compilervars.sh intel64
    source /opt/intel/impi/2019.5.281/intel64/bin/mpivars.sh -ofi_internal=1 release
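
To check that this script actually takes effect when I enter the container, I do a quick sanity check along these lines (fi_info is the utility that ships with the libfabric build I put in the image):

    # Inside the container: confirm the Intel MPI environment and libfabric provider
    echo "$I_MPI_ROOT"    # should point at /opt/intel/impi/2019.5.281
    which mpirun          # should resolve under that same tree
    fi_info -p verbs      # should list the verbs provider if libfabric can see the IB interface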

This is not working so far. Below I illustrate with a simple test run from inside the container (shell mode); the command hangs with no output for about 20-30 seconds and then produces the following error messages:

Singularity image.sif:~/singularity/fv3-upp-apps> export I_MPI_DEBUG=500
Singularity image.sif:~/singularity/fv3-upp-apps> export FI_PROVIDER=verbs
Singularity image.sif:~/singularity/fv3-upp-apps> export FI_VERBS_IFACE="ib0"
Singularity image.sif:~/singularity/fv3-upp-apps> export I_MPI_FABRICS=shm:ofi
Singularity image.sif:~/singularity/fv3-upp-apps> mpirun -n 78 -perhost 20 -hosts appro07,appro08,appro09,appro10 hostname 
[mpiexec@appro07.internal.redlineperf.com] check_exit_codes (../../../../../src/pm/i_hydra/libhydra/demux/hydra_demux_poll.c:114): unable to run proxy on appro07 (pid 109898)
[mpiexec@appro07.internal.redlineperf.com] poll_for_event (../../../../../src/pm/i_hydra/libhydra/demux/hydra_demux_poll.c:152): check exit codes error
[mpiexec@appro07.internal.redlineperf.com] HYD_dmx_poll_wait_for_proxy_event (../../../../../src/pm/i_hydra/libhydra/demux/hydra_demux_poll.c:205): poll for event error
[mpiexec@appro07.internal.redlineperf.com] HYD_bstrap_setup (../../../../../src/pm/i_hydra/libhydra/bstrap/src/intel/i_hydra_bstrap.c:731): error waiting for event
[mpiexec@appro07.internal.redlineperf.com] main (../../../../../src/pm/i_hydra/mpiexec/mpiexec.c:1919): error setting up the boostrap proxies
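
For reference, the host-driven form of the same test (the pattern from the webinar, filled in with my hosts and process counts) would be:

    # Launch from the host: mpirun and the MPI libraries all come from inside the container
    singularity exec image.sif mpirun -n 78 -perhost 20 -hosts appro07,appro08,appro09,appro10 hostname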

I also tried calling mpirun with just one host (and only enough processes to fit on one host), with the same result.
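
That single-host attempt was essentially the following (the process count is from memory, so treat it as approximate):

    # One host only, with no more processes than fit on a single node
    mpirun -n 20 -perhost 20 -hosts appro07 hostname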

Is there a specific list of dependencies (e.g., do I need openssh-clients installed?) for using this all-inside-the-container approach? I do not see anything in the Intel MPI 2019 Update 5 Developer Reference about running with Singularity containers.
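
In case it is relevant, this is how I have been checking for what I guess are the likely bootstrap prerequisites inside the container (the dependency list itself is my assumption, not something I found in the documentation):

    # Inside the container: is an ssh client present?
    which ssh
    # Are the Hydra proxy helpers present alongside mpirun?
    ls /opt/intel/impi/2019.5.281/intel64/bin | grep -i hydra
    # Force the bootstrap mechanism explicitly, in case autodetection is part of the problem
    export I_MPI_HYDRA_BOOTSTRAP=ssh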

 

Thanks, Keith
