Intel® oneAPI Base Toolkit

I need help enabling MPI multi-ep support

dlguswo333
Beginner

Hi. I am trying to enable multiple endpoint (multi-ep) support with Intel MPI.

I installed the oneAPI HPC Toolkit with:

wget https://registrationcenter-download.intel.com/akdlm/irc_nas/17427/l_HPCKit_p_2021.1.0.2684_offline.sh

and followed its instructions.

I also set the environment variables with "setvars.sh" and "source ./env/vars.sh -i_mpi_library_kind=release_mt",

so I can run the mpiicc compiler and mpirun.

However, I am having trouble enabling multiple endpoints.

I compiled the example code listed under "OpenMP Runtime - Implicit Submodel",

which can be found at:

https://software.intel.com/content/www/us/en/develop/documentation/mpi-developer-guide-linux/top/add...

The code compiled, and when I ran it with "mpirun -n 2 ./a.out" and debug info enabled, I got the following output:

 

[0] MPI startup(): Intel(R) MPI Library, Version 2021.1  Build 20201112 (id: b9c9d2fc5)
[0] MPI startup(): Copyright (C) 2003-2020 Intel Corporation.  All rights reserved.
[0] MPI startup(): library kind: release_mt
[0] MPI startup(): libfabric version: 1.11.0-impi
[0] MPI startup(): libfabric provider: mlx
[0] MPI startup(): Rank    Pid      Node name  Pin cpu
[0] MPI startup(): 0       2800     lhjtb      {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,40,41,42,43,44,45,46,47,48,49,
                                 50,51,52,53,54,55,56,57,58,59}
[0] MPI startup(): 1       2801     lhjtb      {20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,60,61,62,63,64,65,66
                                 ,67,68,69,70,71,72,73,74,75,76,77,78,79}
[0] MPI startup(): WARNING: release_mt library was used but no multi-ep feature was enabled. Please use release library instead.
[0] MPI startup(): threading: mode: direct
[0] MPI startup(): threading: vcis: 1
[0] MPI startup(): threading: app_threads: 1
[0] MPI startup(): threading: runtime: generic
[0] MPI startup(): threading: is_threaded: 1
[0] MPI startup(): threading: async_progress: 0
[0] MPI startup(): threading: num_pools: 64
[0] MPI startup(): threading: lock_level: global
[0] MPI startup(): threading: enable_sep: 0
[0] MPI startup(): threading: direct_recv: 0
[0] MPI startup(): threading: zero_op_flags: 1
[0] MPI startup(): threading: num_am_buffers: 1
[0] MPI startup(): threading: library is built with per-vci thread granularity
Thread 0: allreduce returned 0
Thread 1: allreduce returned 2
Thread 0: allreduce returned 0
Thread 1: allreduce returned 2

 

As you can see, the program runs normally, except that it reports that multi-ep is not enabled.

As I understand it, multi-ep support comes with psm2 and "thread_split_model".

So I set the environment variable "I_MPI_THREAD_SPLIT" to 1,

and then I got the following unknown error:

 

[0] MPI startup(): Intel(R) MPI Library, Version 2021.1  Build 20201112 (id: b9c9d2fc5)
[0] MPI startup(): Copyright (C) 2003-2020 Intel Corporation.  All rights reserved.
[0] MPI startup(): library kind: release_mt
MPI startup(): tcp:tcp fabric is unknown or has been removed from the product, please use ofi or shm:ofi instead.
[0] MPI startup(): libfabric version: 1.11.0-impi
[0] MPI startup(): libfabric provider: mlx
[cli_1]: write_line: message string doesn't end in newline: :cmd=put kvsname=kvs_2811_0 key=bc-1-seg-1/2 value=mpi#0017FDBB3959648C7040083E167DE5C93ABD60A7D377CC2B32004C3E5077CCAB33004F430090000B0000030000D01B0A00F00000000041083E167DE5C93ABD608D5377CC2B32004C3E5077CCAB33004F43008801003300000000002200637577CC2B3200F8D74F00000000004F0300885D1EA069BEB8618424883E167DE5C93ABD60478B95BFD63400242E5077CCAB330092010090000B0080000000001B0A00F0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000

 

I tried to find a solution for this, but I could find very little information on the internet.

I also noticed that the provider listed in the output was not psm2.

So I set the environment variable "I_MPI_OFI_PROVIDER" to PSM2, and then I got the following error:

 

[0] MPI startup(): Intel(R) MPI Library, Version 2021.1  Build 20201112 (id: b9c9d2fc5)
[0] MPI startup(): Copyright (C) 2003-2020 Intel Corporation.  All rights reserved.
[0] MPI startup(): library kind: release_mt
[0] MPI startup(): libfabric version: 1.11.0-impi
Abort(1091215) on node 1 (rank 1 in comm 0): Fatal error in PMPI_Init_thread: Other MPI error, error stack:
MPIR_Init_thread(138)........:
MPID_Init(1141)..............:
MPIDI_OFI_mpi_init_hook(1167): OFI addrinfo() failed (ofi_init.c:1167:MPIDI_OFI_mpi_init_hook:No data available)

 

 

To sum up, my questions are:

1. How can I enable multi-ep support?

2. If I need to enable the thread split model, how can I solve the errors above?

My environment is an Ubuntu 18.04.5 Docker container running on Ubuntu 20.04,

and the Docker container cannot see the InfiniBand device. I did this on purpose,

since multi-node MPI jobs slow down when the InfiniBand device is visible. I don't know why.

Anyway, since the outputs above were generated by MPI jobs running on a single node, I guess that is not the problem?

Could you please help me with this? Thank you.

PrasanthD_intel
Moderator

Hi Lee,


Thanks for reaching out to us.

PSM2 is the provider for the Omni-Path interconnect. It requires Omni-Path hardware to work, which is why you got that error.

We are currently not sure whether the Multiple Endpoints feature is supported on InfiniBand hardware.

We will let you know soon.

If you have Omni-Path hardware, please try it on that cluster.


Regards

Prasanth


dlguswo333
Beginner

Thank you for your reply, Prasanth.

After some trial and error, I found that setting the following two variables works:

export I_MPI_THREAD_SPLIT=1
export I_MPI_OFI_PROVIDER=TCP

I chose these two because the Intel documentation says explicitly that multi-ep cannot be used with the MLX provider, but implies that TCP can be used with multi-ep enabled.

 

With these set, the following output is printed.

 

[0] MPI startup(): Intel(R) MPI Library, Version 2021.1  Build 20201112 (id: b9c9d2fc5)
[0] MPI startup(): Copyright (C) 2003-2020 Intel Corporation.  All rights reserved.
[0] MPI startup(): library kind: release_mt
[0] MPI startup(): libfabric version: 1.11.0-impi
[0] MPI startup(): libfabric provider: tcp;ofi_rxm
[0] MPI startup(): THREAD_SPLIT mode is switched on, 40 endpoints in use
...
[0] MPI startup(): I_MPI_ROOT=/opt/intel/oneapi/mpi/2021.1.1
[0] MPI startup(): I_MPI_MPIRUN=mpirun
[0] MPI startup(): I_MPI_HYDRA_TOPOLIB=hwloc
[0] MPI startup(): I_MPI_INTERNAL_MEM_POLICY=default
[0] MPI startup(): I_MPI_OFI_PROVIDER=TCP
[0] MPI startup(): I_MPI_THREAD_SPLIT=1
[0] MPI startup(): I_MPI_DEBUG=10
[0] MPI startup(): threading: mode: direct
[0] MPI startup(): threading: vcis: 40
[0] MPI startup(): threading: app_threads: 40
[0] MPI startup(): threading: runtime: openmp
[0] MPI startup(): threading: is_threaded: 1
[0] MPI startup(): threading: async_progress: 0
[0] MPI startup(): threading: num_pools: 64
[0] MPI startup(): threading: lock_level: nolock
[0] MPI startup(): threading: enable_sep: 0
[0] MPI startup(): threading: direct_recv: 0
[0] MPI startup(): threading: zero_op_flags: 0
[0] MPI startup(): threading: num_am_buffers: 8
[0] MPI startup(): threading: library is built with per-vci thread granularity
Thread 0: allreduce returned 0
Thread 0: allreduce returned 0
Thread 1: allreduce returned 2
Thread 1: allreduce returned 2

 

This looks great, since it reports that 40 endpoints are in use and that the lock level is nolock.
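The check described above can be scripted against the I_MPI_DEBUG startup output. The following is only a minimal sketch, not an Intel tool: the function name multi_ep_status is made up for illustration, and it matches only the exact message strings quoted in this thread (Intel MPI 2021.1), which may differ in other versions.

```shell
#!/bin/sh
# Hypothetical helper: classify I_MPI_DEBUG startup output read from stdin.
# It only matches the message strings quoted in this thread.
multi_ep_status() {
    log=$(cat)
    if printf '%s\n' "$log" | grep -q 'THREAD_SPLIT mode is switched on'; then
        # Pull the endpoint count out of "..., 40 endpoints in use".
        n=$(printf '%s\n' "$log" \
            | sed -n 's/.*switched on, \([0-9][0-9]*\) endpoints in use.*/\1/p' \
            | head -n 1)
        echo "enabled endpoints=$n"
    elif printf '%s\n' "$log" | grep -q 'no multi-ep feature was enabled'; then
        echo "disabled"
    else
        echo "unknown"
    fi
}
```

It could be used by piping a run into it, for example: I_MPI_DEBUG=10 mpirun -n 2 ./a.out 2>&1 | multi_ep_status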

 

However, I am now having trouble running MPI programs across multiple nodes.

If I try to run a program across two nodes, the program hangs for some time, and then the following output appears.

 

[mpiexec@lhjtb] check_exit_codes (../../../../../src/pm/i_hydra/libhydra/demux/hydra_demux_poll.c:117): unable to run bstrap_proxy on tb3 (pid 4191, exit code 768)
[mpiexec@tb2] poll_for_event (../../../../../src/pm/i_hydra/libhydra/demux/hydra_demux_poll.c:159): check exit codes error
[mpiexec@tb2] HYD_dmx_poll_wait_for_proxy_event (../../../../../src/pm/i_hydra/libhydra/demux/hydra_demux_poll.c:212): poll for event error
[mpiexec@tb2] HYD_bstrap_setup (../../../../../src/pm/i_hydra/libhydra/bstrap/src/intel/i_hydra_bstrap.c:772): error waiting for event
[mpiexec@tb2] main (../../../../../src/pm/i_hydra/mpiexec/mpiexec.c:1955): error setting up the boostrap proxies

 

I don't know what is causing this problem.

Maybe it is because I hid the InfiniBand device inside the Docker containers? But I am using the TCP provider.

Or maybe it is because I am using a non-standard ssh port for the Docker container?

But I have configured the host in ~/.ssh/config with the correct port number.

 

May I ask for your help once more? I appreciate your reply and concern. Thank you.

dlguswo333
Beginner

After more trial and error, I was able to run with the TCP provider

after telling mpirun which network interface to use with "-iface".

I still do not understand why this was needed, because I thought mpirun would automatically find the right network interface.
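For reference, here is a sketch of the kind of launch line this refers to. The interface name eth0 and the rank count are placeholders, not values from my setup; tb2 and tb3 are the host names that appear in the error output above. The interface listing relies on the Linux path /sys/class/net, and the helper names are made up for illustration.

```shell
#!/bin/sh
# List candidate network interfaces (Linux-specific path).
list_ifaces() {
    ls /sys/class/net 2>/dev/null
}

# Build, but do not run, an mpirun command pinned to one interface.
# $1 = interface name, $2 = comma-separated host list, $3 = rank count.
build_launch() {
    echo "mpirun -iface $1 -hosts $2 -n $3 ./a.out"
}

list_ifaces || true
build_launch eth0 tb2,tb3 2   # eth0 is a placeholder interface name
```

The script only prints the command, so it can be inspected before anything is actually launched.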

 

Anyway, the problems are solved for now, but I am not sure whether multi-ep is actually enabled with the TCP provider, since, as I understood Prasanth, multi-ep support is available with the PSM2 library.

So I will run some tests and get back to you later. Thank you.

PrasanthD_intel
Moderator

Hi Lee,


Glad your issue is resolved.

There seems to be a miscommunication here: I did not say that only PSM2 supports the Multiple Endpoints feature.

I explained why you got the error in your initial question when you tried to use the PSM2 provider with an InfiniBand interconnect.

Are your systems also connected through Ethernet?


Regards

Prasanth


dlguswo333
Beginner

Hello Prasanth.

 

The posts I uploaded after your reply were meant as notes for myself, in the hope that they would help me or others on the internet.

Since you said it was not certain whether multi-ep can be enabled with InfiniBand,

I tried to find the answer to that question myself, and those posts are the results.

 

It seems that the feature can be used even without OFI, even though the multi-ep feature is backed by the psm2 library.

And yes, the hosts are also connected via Ethernet.

So far MPI runs great, except that it hangs when the amount of data to send or receive is too large, for example when sending 1,073,741,824 MPI_INTs.
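To give a sense of scale, here is a quick calculation of that message size. It assumes a 4-byte MPI_INT, which is typical but not guaranteed, and the chunk size is an arbitrary choice for illustration. I have not confirmed that the message size is what causes the hang, so treat this as arithmetic only.

```shell
#!/bin/sh
COUNT=1073741824        # number of elements, from the case above (2^30)
INT_BYTES=4             # assumption: MPI_INT is 4 bytes
INT_MAX=2147483647      # largest value of a 32-bit signed int

TOTAL=$((COUNT * INT_BYTES))
echo "total bytes: $TOTAL"
if [ "$TOTAL" -gt "$INT_MAX" ]; then
    echo "byte size exceeds the 32-bit signed range"
fi

# Hypothetical workaround sketch: split the transfer into fixed-size chunks.
CHUNK=268435456         # elements per chunk (2^28), arbitrary choice
NCHUNKS=$(( (COUNT + CHUNK - 1) / CHUNK ))
echo "chunks needed: $NCHUNKS"
```

The element count itself still fits in a 32-bit int, but the total size in bytes (4 GiB) does not, which is one reason splitting a transfer into smaller pieces may be worth trying.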

 

I am thinking about closing this thread soon.

I will make a new thread if I encounter another problem.

 

Thanks again, Prasanth. 

Heinrich_B_Intel
Employee

Hi Hyeonjae,


Sorry for the late reply. The multiple endpoint feature is not supported with the mlx InfiniBand provider. You may be able to use it with the verbs provider, which should work fine on no more than 32 nodes.


Please set:


$ export FI_PROVIDER=verbs


and try again.
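As a rough sketch of how that setting might be combined with the thread-split variable used earlier in this thread: the fi_info check is optional and relies on the libfabric utility of that name being on the PATH. Note that the verbs provider needs the InfiniBand device to be visible, so it may not load inside a container that hides the device.

```shell
#!/bin/sh
export FI_PROVIDER=verbs        # provider suggested above
export I_MPI_THREAD_SPLIT=1     # thread-split setting used earlier in this thread
export I_MPI_DEBUG=10           # print startup diagnostics to confirm the provider

# Optional sanity check: ask libfabric whether the verbs provider is usable.
if command -v fi_info >/dev/null 2>&1; then
    if fi_info -p verbs >/dev/null 2>&1; then
        echo "verbs provider available"
    else
        echo "verbs provider not available"
    fi
else
    echo "fi_info not found; skipping provider check"
fi
```

After running the application, the "libfabric provider:" line in the debug output should show which provider was actually selected.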


best regards,

Heinrich


