Intel® MPI Library

New MPI error with Intel 2019.1, unable to run MPI hello world

Paul_K_2
Beginner

After upgrading to Update 1 of Intel 2019 we are no longer able to run even an MPI hello world example. This is new behavior; for comparison, a Spack-installed GCC 8.2.0 with OpenMPI has no trouble on this system. This is a single workstation, so only shm needs to work. For non-MPI use the compilers work correctly. Presumably dependencies have changed slightly in this new update?

$ cat /etc/redhat-release
Red Hat Enterprise Linux Workstation release 7.5 (Maipo)
$ source /opt/intel2019/bin/compilervars.sh intel64
$ mpiicc -v
mpiicc for the Intel(R) MPI Library 2019 Update 1 for Linux*
Copyright 2003-2018, Intel Corporation.
icc version 19.0.1.144 (gcc version 4.8.5 compatibility)
$ cat mpi_hello_world.c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char** argv) {
  // Initialize the MPI environment
  MPI_Init(NULL, NULL);

  // Get the number of processes
  int world_size;
  MPI_Comm_size(MPI_COMM_WORLD, &world_size);

  // Get the rank of the process
  int world_rank;
  MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);

  // Get the name of the processor
  char processor_name[MPI_MAX_PROCESSOR_NAME];
  int name_len;
  MPI_Get_processor_name(processor_name, &name_len);

  // Print off a hello world message
  printf("Hello world from processor %s, rank %d out of %d processors\n",
	 processor_name, world_rank, world_size);

  // Finalize the MPI environment.
  MPI_Finalize();
}
$ mpiicc ./mpi_hello_world.c
$ ./a.out
Abort(1094543) on node 0 (rank 0 in comm 0): Fatal error in PMPI_Init: Other MPI error, error stack:
MPIR_Init_thread(639)......:
MPID_Init(860).............:
MPIDI_NM_mpi_init_hook(689): OFI addrinfo() failed (ofi_init.h:689:MPIDI_NM_mpi_init_hook:No data available)
$ export I_MPI_FABRICS=shm:ofi
$ export I_MPI_DEBUG=666
$ ./a.out
[0] MPI startup(): Imported environment partly inaccesible. Map=0 Info=0
[0] MPI startup(): libfabric version: 1.7.0a1-impi
Abort(1094543) on node 0 (rank 0 in comm 0): Fatal error in PMPI_Init: Other MPI error, error stack:
MPIR_Init_thread(639)......:
MPID_Init(860).............:
MPIDI_NM_mpi_init_hook(689): OFI addrinfo() failed (ofi_init.h:689:MPIDI_NM_mpi_init_hook:No data available)
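
For completeness, here is what I plan to try next on this single workstation. This is only a sketch: I_MPI_FABRICS=shm is the documented single-node setting (though I am not sure it avoids the OFI initialization in this update), and FI_PROVIDER / FI_LOG_LEVEL are the libfabric knobs for pinning a provider and seeing why addrinfo() fails. I am assuming the sockets provider is actually bundled with this 1.7.0a1-impi libfabric, which I have not verified.

$ # intra-node only: try shared memory without picking an OFI provider
$ export I_MPI_FABRICS=shm
$ ./a.out
$ # if an OFI provider is still required, pin one explicitly and enable libfabric logging
$ unset I_MPI_FABRICS
$ export FI_PROVIDER=sockets    # assumed to be present in the bundled libfabric
$ export FI_LOG_LEVEL=debug
$ ./a.out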

 

drMikeT
New Contributor I

Hello,

 

I am encountering the same issues as those mentioned above; the only way IMPI 2019U5 works for me is when I set FI_PROVIDER=tcp.

We are using ConnectX-4 EDR cards, though, and the performance of MPI over EDR with FI_PROVIDER=tcp is unacceptable.

How can we get Intel MPI 2019.X to work on a RHEL7 system with Mellanox IB (ConnectX-4 and above)?

Note:

  1. We have not defined IPoIB on the IB interfaces 
  2. $ ofed_info -s
    OFED-internal-4.0-2.0.0:
  3. Using FI_LOG_LEVEL=debug, I noticed the following:
    ...
libfabric:core:core:ofi_register_provider():391<info> registering provider: ofi_rxm (1.0)
libfabric:core:core:ofi_reg_dl_prov():528<warn> dlopen(/chv/houston/tcrome/vend/intel/parallel_studio_xe_2019_update5/compilers_and_libraries_2019.5.281/linux/mpi/intel64/libfabric/lib/prov/libpsmx2-fi.so): libpsm2.so.2: cannot open shared object file: No such file or directory
libfabric:core:core:ofi_reg_dl_prov():528<warn> dlopen(/chv/houston/tcrome/vend/intel/parallel_studio_xe_2019_update5/compilers_and_libraries_2019.5.281/linux/mpi/intel64/libfabric/lib/prov/libmlx-fi.so): /chv/houston/tcrome/vend/intel/parallel_studio_xe_2019_update5/compilers_and_libraries_2019.5.281/linux/mpi/intel64/libfabric/lib/prov/libmlx-fi.so: undefined symbol: ucm_global_opts
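
Those two dlopen warnings look like the psm2 and mlx providers are being dropped because their runtime dependencies (libpsm2 and UCX, respectively) cannot be resolved, so libfabric falls back to tcp;ofi_rxm. Below is a rough sketch of what I am checking next; the UCX install prefix is only a placeholder, and I have not confirmed which UCX version this libmlx-fi.so expects.

$ # see which UCX libraries the mlx provider links against and whether they resolve
$ ldd $I_MPI_ROOT/intel64/libfabric/lib/prov/libmlx-fi.so
$ # point the runtime at a UCX installation (placeholder path) and retry with the mlx provider
$ export LD_LIBRARY_PATH=/opt/ucx/lib:$LD_LIBRARY_PATH
$ FI_PROVIDER=mlx FI_LOG_LEVEL=debug I_MPI_DEBUG=5 mpirun -hosts tc2022,tc2023 -np 2 -ppn 1 IMB-MPI1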

$ FI_PROVIDER=tcp  I_MPI_DEBUG=5 mpirun -hosts tc2022,tc2023 -np 2 -ppn 1  /vend/intel/parallel_studio_xe_2019_update5/compilers_and_libraries/linux/mpi/intel64/bin/IMB-MPI1
[0] MPI startup(): libfabric version: 1.7.2a-impi
[0] MPI startup(): libfabric provider: tcp;ofi_rxm
[0] MPI startup(): Rank    Pid      Node name  Pin cpu
[0] MPI startup(): 0       188113   tc2022     0
[0] MPI startup(): 1       168972   tc2023     0
[0] MPI startup(): I_MPI_ROOT=/chv/houston/tcrome/vend/intel/parallel_studio_xe_2019_update5/compilers_and_libraries_2019.5.281/linux/mpi
[0] MPI startup(): I_MPI_MPIRUN=mpirun
[0] MPI startup(): I_MPI_HYDRA_TOPOLIB=hwloc
[0] MPI startup(): I_MPI_PIN_PROCESSOR_LIST=allcores:map=scatter
[0] MPI startup(): I_MPI_INTERNAL_MEM_POLICY=default
[0] MPI startup(): I_MPI_DEBUG=5
#------------------------------------------------------------
#    Intel(R) MPI Benchmarks 2019 Update 4, MPI-1 part
#------------------------------------------------------------
# Date                  : Fri Dec 13 10:11:04 2019
# Machine               : x86_64
# System                : Linux
# Release               : 3.10.0-514.2.2.el7.x86_64
# Version               : #1 SMP Tue Dec 6 23:06:41 UTC 2016
# MPI Version           : 3.1
# MPI Thread Environment:


# Calling sequence was:

# /vend/intel/parallel_studio_xe_2019_update5/compilers_and_libraries/linux/mpi/intel64/bin/IMB-MPI1

# Minimum message length in bytes:   0
# Maximum message length in bytes:   4194304
#
# MPI_Datatype                   :   MPI_BYTE
# MPI_Datatype for reductions    :   MPI_FLOAT
# MPI_Op                         :   MPI_SUM
#
#

# List of Benchmarks to run:

...

#---------------------------------------------------
# Benchmarking PingPong
# #processes = 2
#---------------------------------------------------
       #bytes #repetitions      t[usec]   Mbytes/sec
            0         1000        38.18         0.00
            1         1000        42.98         0.02
            2         1000        30.86         0.06
            4         1000        38.79         0.10
            8         1000        47.98         0.17
           16         1000        50.12         0.32
           32         1000        59.39         0.54
           64         1000        51.04         1.25
          128         1000        47.52         2.69
          256         1000        97.66         2.62
          512         1000        98.95         5.17
         1024         1000       124.67         8.21
         2048         1000        97.37        21.03
         4096         1000       109.87        37.28
         8192         1000       109.65        74.71
        16384         1000       167.28        97.94
        32768         1000       215.32       152.18
        65536          640       318.87       205.52
       131072          320       479.31       273.46
       262144          160       998.56       262.52
       524288           80      1766.08       296.87
      1048576           40      2856.11       367.13
      2097152           20      4871.59       430.49
      4194304           10      9875.57       424.72
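
The 30-50 usec small-message latencies above are roughly what I would expect from the tcp provider running over the regular Ethernet interfaces (since, per note 1, no IPoIB is configured) rather than over EDR. As a sketch of the alternative I want to test, assuming the verbs provider is bundled: note that verbs;ofi_rxm normally needs IPoIB addresses for connection setup, so this likely requires configuring IPoIB first.

$ FI_PROVIDER=verbs FI_LOG_LEVEL=debug I_MPI_DEBUG=5 mpirun -hosts tc2022,tc2023 -np 2 -ppn 1 IMB-MPI1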

 

drMikeT
New Contributor I

The question is whether Intel MPI 2019.x can use InfiniBand and Mellanox gear. Since OPA is gone, I expect IB hardware to be fully supported.

 

Regards

Michael

Uwe_Wolfram
Beginner
I can confirm the problem for MPI 2019.6 on a machine with E5-2670 (v1, Sandy Bridge) and ConnectX-3 FDR running SLES 12 SP4. Jobs will run only if FI_PROVIDER is set.

Regards, Uwe