Intel® MPI Library
Get help with building, analyzing, optimizing, and scaling high-performance computing (HPC) applications.

Issue with MPI 2019U6 and MLX provider

Ade_F_
Beginner

Hi

We have two clusters that are almost identical except that one is now running Mellanox OFED 4.6 and the other 4.5.

With MPI 2019 U6 from the Studio 2020 distribution, one cluster (4.5) works OK; the other (4.6) does not, and throws some UCX errors:

$ cat slurm-151351.out
I_MPI_F77=ifort
I_MPI_PORT_RANGE=60001:61000
I_MPI_F90=ifort
I_MPI_CC=icc
I_MPI_CXX=icpc
I_MPI_DEBUG=999
I_MPI_FC=ifort
I_MPI_HYDRA_BOOTSTRAP=slurm
I_MPI_ROOT=/apps/compilers/intel/2020.0/compilers_and_libraries_2020.0.166/linux/mpi
MPI startup(): Imported environment partly inaccesible. Map=0 Info=0
[0] MPI startup(): libfabric version: 1.9.0a1-impi
[0] MPI startup(): libfabric provider: mlx
[0] MPI startup(): detected mlx provider, set device name to "mlx"
[0] MPI startup(): max_ch4_vcis: 1, max_reg_eps 1, enable_sep 0, enable_shared_ctxs 0, do_av_insert 1
[0] MPI startup(): addrname_len: 512, addrname_firstlen: 512
[0] MPI startup(): val_max: 4096, part_len: 4095, bc_len: 1030, num_parts: 1
[1578327353.181131] [scs0027:247642:0]         select.c:410  UCX  ERROR no active messages transport to <no debug data>: mm/posix - Destination is unreachable, mm/sysv - Destination is unreachable, self/self - Destination is unreachable
[1578327353.180508] [scs0088:378614:0]         select.c:410  UCX  ERROR no active messages transport to <no debug data>: mm/posix - Destination is unreachable, mm/sysv - Destination is unreachable, self/self - Destination is unreachable
Abort(1091471) on node 0 (rank 0 in comm 0): Fatal error in PMPI_Init: Other MPI error, error stack:
MPIR_Init_thread(703)........:
MPID_Init(958)...............:
MPIDI_OFI_mpi_init_hook(1382): OFI get address vector map failed
Abort(1091471) on node 2 (rank 2 in comm 0): Fatal error in PMPI_Init: Other MPI error, error stack:
MPIR_Init_thread(703)........:
MPID_Init(958)...............:
MPIDI_OFI_mpi_init_hook(1382): OFI get address vector map failed
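
For reference, the transports UCX can actually see on a node can be listed with the ucx_info utility that ships with UCX/MLNX_OFED (a quick check, assuming ucx_info is on the PATH):

$ ucx_info -v                      # UCX version installed by MLNX_OFED
$ ucx_info -d | grep -i transport  # transports/devices UCX detects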


Is this possibly an Intel MPI issue, or something at our end (given that the 2018 and early 2019 versions worked OK)?

Thanks
A

Shubham_C_Intel
Employee

Hi Ade,

Thanks for reaching out to us. We are working on your issue and will get back to you soon.

-Shubham

James_T_Intel
Moderator

Are you encountering this error with every program you are running, or only with certain programs?

Also, if you have installed Intel® Cluster Checker, please run

clck -f ./<nodefile> -F mpi_prereq_user

This will run diagnostic checks related to Intel® MPI Library functionality and help verify that the cluster is configured as expected.
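
For example, if the nodefile is a plain-text file listing one hostname per line (the hostnames here are placeholders):

$ cat ./nodefile
node01
node02
$ clck -f ./nodefile -F mpi_prereq_user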

Ade_F_
Beginner

It seems to be with every program, although admittedly I'm only trying noddy examples: 'hello world' and a prime-counting example.

All seem to work on the OFED 4.5 cluster but fail on the OFED 4.6 cluster when Studio 2020 is used.

Cluster Checker is happy except for the logical processor count, as we have hyper-threading enabled in the BIOS but take the extra logical cores offline at boot on all our systems:

SUMMARY
  Command-line:   clck -F mpi_prereq_user
  Tests Run:      mpi_prereq_user
  ERROR:          2 tests encountered errors. Information may be incomplete. See
                  clck_results.log and search for "ERROR" for more information.
  Overall Result: 1 issue found - FUNCTIONALITY (1)
--------------------------------------------------------------------------------
2 nodes tested:         cdcs[0003-0004]
0 nodes with no issues:
2 nodes with issues:    cdcs[0003-0004]
--------------------------------------------------------------------------------
FUNCTIONALITY
The following functionality issues were detected:
  1. There is a mismatch between number of available logical cores and maximum
     logical cores. Cores '40-79' are offline.
       2 nodes: cdcs[0003-0004]

HARDWARE UNIFORMITY
No issues detected.

PERFORMANCE
No issues detected.

SOFTWARE UNIFORMITY
No issues detected.

See clck_results.log for more information.

drMikeT
New Contributor I

Hello Ade,


Have you tried to measure the performance of the "mlx" provider with MOFED 4.5? Can you run the standard IMB or OSU benchmarks?

Have you tried any other MPI stacks? OpenMPI is available with MOFED distributions and you can quickly try any of these benchmarks that come prebuilt.
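
For example, the IMB binaries bundled with Intel MPI can be run across two nodes along these lines (a sketch, assuming the Intel MPI environment is sourced; the hostnames are placeholders):

$ mpirun -n 2 -ppn 1 -hosts node01,node02 IMB-MPI1 PingPong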


regards

Michael


Ade_F_
Beginner

Hi Michael et al.

We only have this problem with 2020; 2019, 2018, Open MPI, MPICH, and Mellanox's HPC-X Open MPI are all OK.

I have now, I think, isolated it to an interaction between the mlx FI_PROVIDER and the MLNX_OFED 4.6 we have. Setting the provider to verbs appears to cure the problem, although it is perhaps less than ideal. Equally, the mlx provider has no issue on the MLNX_OFED 4.5 deployments we have.
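
For anyone hitting the same thing, the workaround amounts to forcing the libfabric provider before the launch, along these lines (a sketch; the binary name is a placeholder):

$ export FI_PROVIDER=verbs
$ mpirun ./my_app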

Michael, if you are interested in performance separately, rather than just making it work, I can provide some IMB output.

Cheers

Ade

drMikeT
New Contributor I

Ade, 

In my tests, the verbs provider offers 2-3 GB/s at best, which is really not good (6x below line speed for EDR).

Is your CPU Zen 2 or Intel based?

Sure, I'd be glad to see some numbers :)

regards

Michael

0__Dops0
Beginner

I have the same problem; my architecture is the AMD EPYC 7002 series (same behavior with the EPYC 7000 series too when using more than 45 PPN), running CentOS 7.6. The MLX provider doesn't work with 2019 U6. With 2019 U5 and the default provider (I believe it is RxM), it crashes when using more than 80 PPN, i.e. with 80 or fewer PPN and 9 nodes it works without errors. Not sure what is going on.


Error with 2019 U5 when using more than 80 PPN on 7002 series or 45 PPN on 7000 series:


MPIDI_OFI_send_lightweight_request:
(unknown)(): Other MPI error


Error with 2019 U6 on 7002 series with MLX FI_PROVIDER:

MPIDI_OFI_send_lightweight_request:
(unknown)(): Other MPI error

and an ADDR_INFO error

Furthermore, when using the MLX provider, fi_info returns error -61.
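
The providers libfabric can actually bring up can be double-checked with the fi_info utility from libfabric (standard flags):

$ fi_info -l      # list providers that initialize successfully
$ fi_info -p mlx  # details for the mlx provider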

Dmitry_S_Intel
Moderator

Please try the following workaround:

export UCX_TLS=ud,sm,self

AThar2
Beginner

@Dmitry_S_Intel That works for me, thanks.

For me, the problem only occurred when I launched on more than 10 nodes.

But what does your suggestion mean? The last thing I want is my nodes communicating over the Ethernet connection. Can you please explain whether that is the case?

solaremg
Beginner

Hello, how was this variable implemented in the script? As seen below? I am also receiving the "OFI get address vector map failed" error

export UCX_TLS=ud,sm,self
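
i.e. would it just sit in the batch script before the launch line, something like this (a rough sketch; the Slurm options and binary name are placeholders)?

#!/bin/bash
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=40

export UCX_TLS=ud,sm,self
mpirun ./my_app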
