Hello,
We are developing a project that uses MPI for distributed execution, and we need executables for both Windows and Linux. On Windows we were using Microsoft MPI and decided to switch to Intel's implementation. Unfortunately, we saw a performance drop in some specific cases. After some investigation we found that the problem lies in MPI_Alltoallw().
After searching, I found that Intel's MPI_Alltoallw() is a naive Isend/Irecv implementation and, unlike the other collectives, offers no alternative algorithms for tuning.
To reproduce our results, I created a C++ demo program using MPI_Alltoallw(). Every process holds a rows x cols buffer and sends cm_rows x cols of it to each of the others, so at the end every process has a cm_rows x cols x comm_size buffer filled. cm_rows is derived from a block-cyclic distribution (for example, with rows = 2^20, a block size of 64, and comm_size = 4, each process gets cm_blks = 4096 blocks, i.e. cm_rows = 262144 rows).
I compiled and ran it with both Intel MPI 2019.7.216 and Microsoft MPI. The execution times on an Intel Core i5-4460 are:
#procs    MS-MPI    Intel MPI
2         1.69 s    4.01 s
4         3.27 s    7.15 s
I know that this specific demo could also be written with alltoallv, or maybe even alltoall. The problem is that we use alltoallw a lot and its performance is really important to us.
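For reference, here is a minimal sketch of how the same exchange could be done with alltoallv by first packing the strided blocks into a contiguous staging buffer (it reuses the variables from the demo code below; this is an illustration, not what our production code does):

// Pack the chunk destined for each peer p into a contiguous, column-major
// staging buffer, then exchange plain MPI_DOUBLEs with MPI_Alltoallv.
std::vector<double> packed((size_t)cm_rows * cols * comm_size);
for (int p = 0; p < comm_size; ++p)
    for (int j = 0; j < cols; ++j)
        for (int k = 0; k < cm_blks; ++k)
            std::copy_n(val.get() + (size_t)j * rows + 64 * p + (size_t)k * 64 * comm_size, 64,
                        packed.data() + ((size_t)p * cols + j) * cm_rows + (size_t)k * 64);
std::vector<int> counts(comm_size, cm_rows * cols), displs(comm_size);
for (int p = 0; p < comm_size; ++p)
    displs[p] = p * cm_rows * cols;  // alltoallv displacements are in elements, not bytes
MPI_Alltoallv(packed.data(), counts.data(), displs.data(), MPI_DOUBLE,
              b_val.get(), counts.data(), displs.data(), MPI_DOUBLE, MPI_COMM_WORLD);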
We didn't expect Intel MPI to be slower than MS-MPI. Do you have any tips? Is there any chance the MPI developers will improve Alltoallw()?
Thank you in advance guys!
For some reason I cannot upload my .cpp file, so here is the code:
#include <cstdio>
#include <random>
#include <mpi.h>
#include <memory>
#include <algorithm>
#include <vector>   // needed for std::vector below (was missing originally)

int main(int argc, char* argv[]) {
    MPI_Init(&argc, &argv);
    int comm_size, comm_rank;
    MPI_Comm_size(MPI_COMM_WORLD, &comm_size);
    MPI_Comm_rank(MPI_COMM_WORLD, &comm_rank);

    // Allocate the initial (column-major) rows x cols buffer and fill it with random doubles.
    int rows = 1 << 20;
    int cols = 50;
    size_t size = (size_t)rows * (size_t)cols;
    auto val = std::make_unique<double[]>(size);
    std::uniform_real_distribution<double> unif;
    std::default_random_engine re;
    std::generate(val.get(), val.get() + size, [&]() { return unif(re); });

    // Block-cyclic distribution: rows are split into blocks of 64,
    // and each rank owns every comm_size-th block.
    int total_blks = rows / 64;
    int cm_blks = total_blks / comm_size;
    int cm_rows = cm_blks * 64;

    // Final buffer: each rank receives cm_rows * cols doubles from every rank.
    int cols_f = cols * comm_size;
    auto b_val = std::make_unique<double[]>((size_t)cm_rows * cols_f);

    // Send type: per destination, cm_blks strided blocks of 64 doubles per column,
    // resized so consecutive columns start rows doubles apart.
    MPI_Datatype scol, scol_res, sblock, rblock;
    MPI_Type_vector(cm_blks, 64, 64 * comm_size, MPI_DOUBLE, &scol);
    MPI_Type_create_resized(scol, 0, (MPI_Aint)rows * sizeof(double), &scol_res);
    MPI_Type_contiguous(cols, scol_res, &sblock);
    // Receive type: one contiguous cm_rows * cols chunk per source rank.
    MPI_Type_contiguous(cm_rows * cols, MPI_DOUBLE, &rblock);
    MPI_Type_commit(&sblock);
    MPI_Type_commit(&rblock);

    // One element of the derived type per peer; Alltoallw displacements are in bytes.
    std::vector<int> scounts(comm_size, 1);
    std::vector<int> rcounts(comm_size, 1);
    std::vector<int> sdispls(comm_size);
    std::vector<int> rdispls(comm_size);
    std::vector<MPI_Datatype> stypes(comm_size);
    std::vector<MPI_Datatype> rtypes(comm_size);
    for (int i = 0; i < comm_size; i++) {
        sdispls[i] = 64 * i * sizeof(double);
        rdispls[i] = cm_rows * cols * i * sizeof(double);
        stypes[i] = sblock;
        rtypes[i] = rblock;
    }

    MPI_Barrier(MPI_COMM_WORLD);
    double str = MPI_Wtime();
    for (int i = 0; i < 10; i++)
        MPI_Alltoallw(val.get(), scounts.data(), sdispls.data(), stypes.data(),
                      b_val.get(), rcounts.data(), rdispls.data(), rtypes.data(),
                      MPI_COMM_WORLD);
    MPI_Barrier(MPI_COMM_WORLD);
    if (!comm_rank) printf("Time: %lf seconds\n", MPI_Wtime() - str);

    MPI_Type_free(&sblock);
    MPI_Type_free(&rblock);
    MPI_Type_free(&scol_res);
    MPI_Type_free(&scol);
    MPI_Finalize();
    return 0;
}
Hi Michail,
Thanks for reaching out to us.
Yes, Alltoallw has only an Isend/Irecv + Waitall implementation in IMPI.
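Schematically, such a fallback looks like the sketch below (this is an illustration using the standard Alltoallw parameter names, not the actual IMPI source):

// Post one nonblocking receive/send pair per peer, then wait on everything at once.
std::vector<MPI_Request> reqs(2 * comm_size);
for (int i = 0; i < comm_size; ++i) {
    MPI_Irecv((char*)recvbuf + rdispls[i], recvcounts[i], recvtypes[i],
              i, 0, comm, &reqs[2 * i]);
    MPI_Isend((const char*)sendbuf + sdispls[i], sendcounts[i], sendtypes[i],
              i, 0, comm, &reqs[2 * i + 1]);
}
MPI_Waitall(2 * comm_size, reqs.data(), MPI_STATUSES_IGNORE);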
For the given program, we also observed timings similar to those you reported for IMPI.
We are forwarding your query to the concerned team and will get back to you at the earliest.
Regards
Prasanth
Thank you!
Hi Michail,
On how many nodes do you observe this behavior?
Best regards,
Amar
Hi DrAmarpal,
We are currently running in SMP mode on one node. Soon we will be using 4-8 nodes at most.
Best regards,
Michail
Hi Michail,
Thanks for confirming. Please hold on for a solution to this.
Best regards,
Amar
Hi Michail,
Could you please rerun your experiments with Intel MPI Library 2019 Update 8, which was recently released? With this version, please set FI_PROVIDER=netdir and report your findings.
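For example, from the command prompt (the executable name below is just a placeholder for your demo):

set FI_PROVIDER=netdir
mpiexec -n 4 alltoallw_demo.exe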
Best regards,
Amar
Hi DrAmarpal,
I downloaded Intel MPI Library 2019 U8 and compiled my code with it. Using
I_MPI_FABRICS=ofi
FI_PROVIDER=netdir
I get the following error:
[0] MPI startup(): Intel(R) MPI Library, Version 2019 Update 8 Build 20200624
[0] MPI startup(): Copyright (C) 2003-2020 Intel Corporation. All rights reserved.
[0] MPI startup(): library kind: release
[0] MPI startup(): libfabric version: 1.10.1a1-impi
Abort(1091215) on node 1 (rank 1 in comm 0): Fatal error in PMPI_Init: Other MPI error, error stack:
MPIR_Init_thread(136)........:
MPID_Init(1138)..............:
MPIDI_OFI_mpi_init_hook(1061): OFI addrinfo() failed (netmod\ofi\ofi_init.c:1061:MPIDI_OFI_mpi_init_hook:Unknown error)
If I use FI_PROVIDER=tcp, it works, but the execution time is still large. With FI_PROVIDER=shm the execution time is noticeably better, but still not what we want compared to MS-MPI.
I am running on a single node. Were the changes focused on inter-node communication?
Best Regards,
Jason
Hi Jason,
Thanks for reporting your findings. To understand what the problem is, could you please source the debug version of the Intel MPI library by running,
mpivars.bat debug
and set FI_LOG_LEVEL=debug before running your test. Please share the additional output that gets generated during this run.
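For example, something along these lines (the executable and log file names are placeholders):

mpivars.bat debug
set FI_LOG_LEVEL=debug
mpiexec -n 2 alltoallw_demo.exe > fi_debug.log 2>&1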
Best regards,
Amar
Dear Amar,
I followed your instructions. Running with 2 or 4 processes, the output is pretty short:
Abort(1091215) on node 1 (rank 1 in comm 0): Fatal error in PMPI_Init: Other MPI error, error stack:
MPIR_Init_thread(136)........:
MPID_Init(1138)..............:
MPIDI_OFI_mpi_init_hook(1061): OFI addrinfo() failed (netmod\ofi\ofi_init.c:1061:MPIDI_OFI_mpi_init_hook:Unknown error)
Using 1 process I get the following:
Abort(1091215) on node 0 (rank 0 in comm 0): Fatal error in PMPI_Init: Other MPI error, error stack:
MPIR_Init_thread(136)........:
MPID_Init(1138)..............:
MPIDI_OFI_mpi_init_hook(1061): OFI addrinfo() failed (netmod\ofi\ofi_init.c:1061:MPIDI_OFI_mpi_init_hook:Unknown error)
libfabric:476:core:mr:ofi_default_cache_size():56<info> default cache size=0
libfabric:476:netdir:core:ofi_nd_startup():602<info> ofi_nd_startup: starting initialization
libfabric:476:core:core:ofi_register_provider():418<info> registering provider: netdir (110.10)
libfabric:476:core:core:ofi_register_provider():418<info> registering provider: ofi_rxm (110.10)
libfabric:476:core:core:ofi_register_provider():418<info> registering provider: sockets (110.10)
libfabric:476:core:core:ofi_register_provider():446<info> "sockets" filtered by provider include/exclude list, skipping
libfabric:476:core:core:ofi_register_provider():418<info> registering provider: tcp (110.10)
libfabric:476:core:core:ofi_register_provider():446<info> "tcp" filtered by provider include/exclude list, skipping
libfabric:476:core:core:ofi_register_provider():418<info> registering provider: ofi_hook_perf (110.10)
libfabric:476:core:core:ofi_register_provider():418<info> registering provider: ofi_hook_noop (110.10)
libfabric:476:core:core:fi_getinfo():1066<info> Found provider with the highest priority netdir, must_use_util_prov = 1
libfabric:476:core:core:fi_getinfo():1066<info> Found provider with the highest priority netdir, must_use_util_prov = 1
libfabric:476:core:core:ofi_layering_ok():915<info> Need core provider, skipping ofi_rxm
libfabric:476:core:core:fi_getinfo():1129<info> Now it is being used by netdir provider
libfabric:476:core:core:fi_getinfo():1066<info> Found provider with the highest priority netdir, must_use_util_prov = 1
libfabric:476:core:core:ofi_layering_ok():915<info> Need core provider, skipping ofi_rxm
libfabric:476:core:core:fi_getinfo():1129<info> Now it is being used by netdir provider
libfabric:476:core:core:fi_getinfo():1066<info> Found provider with the highest priority netdir, must_use_util_prov = 1
libfabric:476:core:core:ofi_layering_ok():915<info> Need core provider, skipping ofi_rxm
libfabric:476:core:core:fi_getinfo():1129<info> Now it is being used by netdir provider
libfabric:476:core:core:fi_getinfo():1066<info> Found provider with the highest priority netdir, must_use_util_prov = 1
libfabric:476:core:core:ofi_layering_ok():915<info> Need core provider, skipping ofi_rxm
libfabric:476:core:core:fi_getinfo():1129<info> Now it is being used by netdir provider
libfabric:476:core:core:fi_getinfo():1129<info> Now it is being used by ofi_rxm provider
libfabric:476:core:core:fi_getinfo():1129<info> Now it is being used by netdir provider
Using 4 processes and TCP I get:
libfabric:7532:core:mr:ofi_default_cache_size():56<info> default cache size=0
libfabric:7532:netdir:core:ofi_nd_startup():602<info> ofi_nd_startup: starting initialization
libfabric:7532:core:core:ofi_register_provider():418<info> registering provider: netdir (110.10)
libfabric:7532:core:core:ofi_register_provider():446<info> "netdir" filtered by provider include/exclude list, skipping
libfabric:7532:core:core:ofi_register_provider():418<info> registering provider: ofi_rxm (110.10)
libfabric:7532:core:core:ofi_register_provider():418<info> registering provider: sockets (110.10)
libfabric:7532:core:core:ofi_register_provider():446<info> "sockets" filtered by provider include/exclude list, skipping
libfabric:7532:core:core:ofi_register_provider():418<info> registering provider: tcp (110.10)
libfabric:7532:core:core:ofi_register_provider():418<info> registering provider: ofi_hook_perf (110.10)
libfabric:7532:core:core:ofi_register_provider():418<info> registering provider: ofi_hook_noop (110.10)
libfabric:7532:core:core:fi_getinfo():1066<info> Found provider with the highest priority tcp, must_use_util_prov = 1
libfabric:7532:core:core:fi_getinfo():1066<info> Found provider with the highest priority tcp, must_use_util_prov = 1
libfabric:7532:core:core:ofi_layering_ok():915<info> Need core provider, skipping ofi_rxm
libfabric:7532:tcp:core:ofi_get_list_of_addr():1255<info> Available addr: 10.0.2.15, iface name: eth1, speed: 1000000000
libfabric:7532:tcp:core:ofi_get_list_of_addr():1255<info> Available addr: fe80::6166:bdf8:dbc8:9a1, iface name: eth0, speed: 1000000000
libfabric:7532:tcp:core:ofi_insert_loopback_addr():1100<info> available addr: : fi_sockaddr_in://127.0.0.1:0
libfabric:7532:tcp:core:ofi_insert_loopback_addr():1114<info> available addr: : fi_sockaddr_in6://[::1]:0
libfabric:7532:tcp:core:util_getinfo_ifs():312<info> Chosen addr for using: 10.0.2.15, speed 1000000000
libfabric:7532:core:core:fi_getinfo():1066<info> Found provider with the highest priority tcp, must_use_util_prov = 1
libfabric:7532:core:core:ofi_layering_ok():915<info> Need core provider, skipping ofi_rxm
libfabric:7532:tcp:core:ofi_get_list_of_addr():1255<info> Available addr: 10.0.2.15, iface name: eth1, speed: 1000000000
libfabric:7532:tcp:core:ofi_get_list_of_addr():1255<info> Available addr: fe80::6166:bdf8:dbc8:9a1, iface name: eth0, speed: 1000000000
libfabric:7532:tcp:core:ofi_insert_loopback_addr():1100<info> available addr: : fi_sockaddr_in://127.0.0.1:0
libfabric:7532:tcp:core:ofi_insert_loopback_addr():1114<info> available addr: : fi_sockaddr_in6://[::1]:0
libfabric:7532:tcp:core:util_getinfo_ifs():312<info> Chosen addr for using: 10.0.2.15, speed 1000000000
libfabric:7532:core:core:fi_getinfo():1066<info> Found provider with the highest priority tcp, must_use_util_prov = 1
libfabric:7532:core:core:ofi_layering_ok():915<info> Need core provider, skipping ofi_rxm
libfabric:7532:tcp:core:ofi_get_list_of_addr():1255<info> Available addr: 10.0.2.15, iface name: eth1, speed: 1000000000
libfabric:7532:tcp:core:ofi_get_list_of_addr():1255<info> Available addr: fe80::6166:bdf8:dbc8:9a1, iface name: eth0, speed: 1000000000
libfabric:7532:tcp:core:ofi_insert_loopback_addr():1100<info> available addr: : fi_sockaddr_in://127.0.0.1:0
libfabric:7532:tcp:core:ofi_insert_loopback_addr():1114<info> available addr: : fi_sockaddr_in6://[::1]:0
libfabric:7532:tcp:core:util_getinfo_ifs():312<info> Chosen addr for using: 10.0.2.15, speed 1000000000
libfabric:7532:core:core:fi_getinfo():1066<info> Found provider with the highest priority tcp, must_use_util_prov = 1
libfabric:7532:core:core:ofi_layering_ok():915<info> Need core provider, skipping ofi_rxm
libfabric:7532:tcp:core:ofi_get_list_of_addr():1255<info> Available addr: 10.0.2.15, iface name: eth1, speed: 1000000000
libfabric:7532:tcp:core:ofi_get_list_of_addr():1255<info> Available addr: fe80::6166:bdf8:dbc8:9a1, iface name: eth0, speed: 1000000000
libfabric:7532:tcp:core:ofi_insert_loopback_addr():1100<info> available addr: : fi_sockaddr_in://127.0.0.1:0
libfabric:7532:tcp:core:ofi_insert_loopback_libfabric:2492:core:mr:ofi_default_cache_size():56<info> default cache size=0
libfabric:2492:netdir:core:ofi_nd_startup():602<info> ofi_nd_startup: starting initialization
libfabric:2492:core:core:ofi_register_provider():418<info> registering provider: netdir (110.10)
libfabric:2492:core:core:ofi_register_provider():446<info> "netdir" filtered by provider include/exclude list, skipping
libfabric:2492:core:core:ofi_register_provider():418<info> registering provider: ofi_rxm (110.10)
libfabric:2492:core:core:ofi_register_provider():418<info> registering provider: sockets (110.10)
libfabric:2492:core:core:ofi_register_provider():446<info> "sockets" filtered by provider include/exclude list, skipping
libfabric:2492:core:core:ofi_register_provider():418<info> registering provider: tcp (110.10)
libfabric:2492:core:core:ofi_register_provider():418<info> registering provider: ofi_hook_perf (110.10)
libfabric:2492:core:core:ofi_register_provider():418<info> registering provider: ofi_hook_noop (110.10)
libfabric:2492:core:core:fi_getinfo():1066<info> Found provider with the highest priority tcp, must_use_util_prov = 1
libfabric:2492:core:core:fi_getinfo():1066<info> Found provider with the highest priority tcp, must_use_util_prov = 1
libfabric:2492:core:core:ofi_layering_ok():915<info> Need core provider, skipping ofi_rxm
libfabric:2492:tcp:core:ofi_get_list_of_addr():1255<info> Available addr: 10.0.2.15, iface name: eth1, speed: 1000000000
libfabric:2492:tcp:core:ofi_get_list_of_addr():1255<info> Available addr: fe80::6166:bdf8:dbc8:9a1, iface name: eth0, speed: 1000000000
libfabric:2492:tcp:core:ofi_insert_loopback_addr():1100<info> available addr: : fi_sockaddr_in://127.0.0.1:0
libfabric:2492:tcp:core:ofi_insert_loopback_addr():1114<info> available addr: : fi_sockaddr_in6://[::1]:0
libfabric:2492:tcp:core:util_getinfo_ifs():312<info> Chosen addr for using: 10.0.2.15, speed 1000000000
libfabric:2492:core:core:fi_getinfo():1066<info> Found provider with the highest priority tcp, must_use_util_prov = 1
libfabric:2492:core:core:ofi_layering_ok():915<info> Need core provider, skipping ofi_rxm
libfabric:2492:tcp:core:ofi_get_list_of_addr():1255<info> Available addr: 10.0.2.15, iface name: eth1, speed: 1000000000
libfabric:2492:tcp:core:ofi_get_list_of_addr():1255<info> Available addr: fe80::6166:bdf8:dbc8:9a1, iface name: eth0, speed: 1000000000
libfabric:2492:tcp:core:ofi_insert_loopback_addr():1100<info> available addr: : fi_sockaddr_in://127.0.0.1:0
libfabric:2492:tcp:core:ofi_insert_loopback_addr():1114<info> available addr: : fi_sockaddr_in6://[::1]:0
libfabric:2492:tcp:core:util_getinfo_ifs():312<info> Chosen addr for using: 10.0.2.15, speed 1000000000
libfabric:2492:core:core:fi_getinfo():1066<info> Found provider with the highest priority tcp, must_use_util_prov = 1
libfabric:2492:core:core:ofi_layering_ok():915<info> Need core provider, skipping ofi_rxm
libfabric:2492:tcp:core:ofi_get_list_of_addr():1255<info> Available addr: 10.0.2.15, iface name: eth1, speed: 1000000000
libfabric:2492:tcp:core:ofi_get_list_of_addr():1255<info> Available addr: fe80::6166:bdf8:dbc8:9a1, iface name: eth0, speed: 1000000000
libfabric:2492:tcp:core:ofi_insert_loopback_addr():1100<info> available addr: : fi_sockaddr_in://127.0.0.1:0
libfabric:2492:tcp:core:ofi_insert_loopback_addr():1114<info> available addr: : fi_sockaddr_in6://[::1]:0
libfabric:2492:tcp:core:util_getinfo_ifs():312<info> Chosen addr for using: 10.0.2.15, speed 1000000000
libfabric:2492:core:core:fi_getinfo():1066<info> Found provider with the highest priority tcp, must_use_util_prov = 1
libfabric:2492:core:core:ofi_layering_ok():915<info> Need core provider, skipping ofi_rxm
libfabric:2492:tcp:core:ofi_get_list_of_addr():1255<info> Available addr: 10.0.2.15, iface name: eth1, speed: 1000000000
libfabric:2492:tcp:core:ofi_get_list_of_addr():1255<info> Available addr: fe80::6166:bdf8:dbc8:9a1, iface name: eth0, speed: 1000000000
libfabric:2492:tcp:core:ofi_insert_loopback_addr():1100<info> available addr: : fi_sockaddr_in://127.0.0.1:0
libfabric:2492:tcp:core:ofi_insert_loopback_addr():1114<info> available addr: : fi_sockaddr_in6://[::1]:0
libfabric:2492:tcp:core:util_getinfo_ifs():312<info> Chosen addr for using: 10.0.2.15, speed 1000000000
libfabric:2492:tcp:core:ofi_get_list_of_addr():1255<info> Available addr: 10.0.2.15, iface name: eth1, speed: 1000000000
libfabric:2492:tcp:core:ofi_get_list_of_addr():1255<info> Available addr: fe80::6166:bdf8:dbc8:9a1, iface name: eth0, speed: 1000000000
libfabric:2492:tcp:core:ofi_insert_loopback_addr():1100<info> available addr: : fi_sockaddr_in://127.0.0.1:0
libfabric:2492:tcp:core:ofi_insert_loopback_addr():1114<info> available addr: : fi_sockaddr_in6://[::1]:0
libfabric:2492:tcp:core:util_getinfo_ifs():312<info> Chosen addr for using: 10.0.2.15, speed 1000000000
libfabric:2492:core:core:fi_getinfo():1066<info> Found provider with the highest priority tcp, must_use_util_prov = 1
libfabric:2492:core:core:fi_getinfo():1051<warn> Can't find provider with the highest priority
libfabric:2492:core:core:fi_getinfo():1066<info> Found provider with the highest priority tcp, must_use_util_prov = 1
libfabric:2492:tcp:core:ofi_get_list_of_addr():1255<info> Available addr: 10.0.2.15, iface name: eth1, speed: 1000000000
libfabric:2492:tcp:core:ofi_get_list_of_addr():1255<info> Available addr: fe80::6166:bdf8:dbc8:9a1, iface name: eth0, speed: 1000000000
libfabric:2492:tcp:core:ofi_insert_loopback_addr():1100<info> available addr: : fi_sockaddr_in://127.0.0.1:0
libfabric:2492:tcp:core:ofi_insert_loopback_addr():1114<info> available addr: : fi_sockaddr_in6://[::1]:0
libfabric:2492:tcp:core:util_getinfo_ifs():312<info> Chosen addr for using: 10.0.2.15, speed 1000000000
libfabric:2492:core:core:fi_getinfo():1066<info> Found provider with the highest priority tcp, must_use_util_prov = 1
libfabric:2492:core:core:ofi_layering_ok():915<info> Need core provider, skipping ofi_rxm
libfabric:2492:tcp:core:ofi_get_list_of_addr():1255<info> Available addr: 10.0.2.15, iface name: eth1, speed: 1000000000
libfabric:2492:tcp:core:ofi_get_list_of_addr():1255<info> Available addr: fe80::6166:bdf8:dbc8:9a1, iface name: eth0, speed: 1000000000
libfabric:2492:tcp:core:ofi_insert_loopback_addr():1100<info> available addr: : fi_sockaddr_in://127.0.0.1:0
libfabric:2492:tcp:core:ofi_insert_loopback_addr():1114<info> available addr: : fi_sockaddr_in6://[::1]:0
libfabric:2492:tcp:core:util_getinfo_ifs():312<info> Chosen addr for using: 10.0.2.15, speed 1000000000
libfabric:2492:core:core:fi_getinfo():1066<info> Found provider with the highest priority tcp, must_use_util_prov = 1
libfabric:2492:core:core:ofi_layering_ok():915<info> Need core provider, skipping ofi_rxm
libfabric:2492:tcp:core:ofi_get_list_of_addr():1255<info> Available addr: 10.0.2.15, iface name: eth1, speed: 1000000000
libfabric:2492:tcp:core:ofi_get_list_of_addr():1255<info> Available addr: fe80::6166:bdf8:dbc8:9a1, iface name: eth0, speed: 1000000000
libfabric:2492:tcp:core:ofi_insert_loopback_addr():1100<info> available addr: : fi_sockaddr_in://127.0.0.1:0
libfabric:2492:tcp:core:ofi_insert_loopback_addr():1114<info> available addr: : fi_sockaddr_in6://[::1]:0
libfabric:2492:tcp:core:util_getinfo_ifs():312<info> Chosen addr for using: 10.0.2.15, speed 1000000000
libfabric:2492:tcp:core:ofi_check_ep_type():654<info> unsupported endpoint type
libfabric:2492:tcp:core:ofi_check_ep_type():655<info> Supported: FI_EP_MSG
libfabric:2492:tcp:core:ofi_check_ep_type():655<info> Requested: FI_EP_RDM
libfabric:2492:core:core:fi_getinfo():1129<info> Now it is being used by tcp provider
libfabric:2492:core:core:fi_getinfo():1066<info> Found provider with the highest priority tcp, must_use_util_prov = 1
libfabric:2492:tcp:core:ofi_get_list_of_addr():1255<info> Available addr: 10.0.2.15, iface name: eth1, speed: 1000000000
libfabric:2492:tcp:core:ofi_get_list_of_addr():1255<info> Available addr: fe80::6166:bdf8:dbc8:9a1, iface name: eth0, speed: 1000000000
libfabric:2492:tcp:core:ofi_insert_loopback_addr():1100<info> available addr: : fi_sockaddr_in://127.0.0.1:0
libfabric:2492:tcp:core:ofi_insert_loopback_addr():1114<info> available addr: :addr():1114<info> available addr: : fi_sockaddr_in6://[::1]:0
libfabric:7532:tcp:core:util_getinfo_ifs():312<info> Chosen addr for using: 10.0.2.15, speed 1000000000
libfabric:7532:tcp:core:ofi_get_list_of_addr():1255<info> Available addr: 10.0.2.15, iface name: eth1, speed: 1000000000
libfabric:7532:tcp:core:ofi_get_list_of_addr():1255<info> Available addr: fe80::6166:bdf8:dbc8:9a1, iface name: eth0, speed: 1000000000
libfabric:7532:tcp:core:ofi_insert_loopback_addr():1100<info> available addr: : fi_sockaddr_in://127.0.0.1:0
libfabric:7532:tcp:core:ofi_insert_loopback_addr():1114<info> available addr: : fi_sockaddr_in6://[::1]:0
libfabric:7532:tcp:core:util_getinfo_ifs():312<info> Chosen addr for using: 10.0.2.15, speed 1000000000
libfabric:7532:core:core:fi_getinfo():1066<info> Found provider with the highest priority tcp, must_use_util_prov = 1
libfabric:7532:core:core:fi_getinfo():1051<warn> Can't find provider with the highest priority
libfabric:7532:core:core:fi_getinfo():1066<info> Found provider with the highest priority tcp, must_use_util_prov = 1
libfabric:7532:tcp:core:ofi_get_list_of_addr():1255<info> Available addr: 10.0.2.15, iface name: eth1, speed: 1000000000
libfabric:7532:tcp:core:ofi_get_list_of_addr():1255<info> Available addr: fe80::6166:bdf8:dbc8:9a1, iface name: eth0, speed: 1000000000
libfabric:7532:tcp:core:ofi_insert_loopback_addr():1100<info> available addr: : fi_sockaddr_in://127.0.0.1:0
libfabric:7532:tcp:core:ofi_insert_loopback_addr():1114<info> available addr: : fi_sockaddr_in6://[::1]:0
libfabric:7532:tcp:core:util_getinfo_ifs():312<info> Chosen addr for using: 10.0.2.15, speed 1000000000
libfabric:7532:core:core:fi_getinfo():1066<info> Found provider with the highest priority tcp, must_use_util_prov = 1
libfabric:7532:core:core:ofi_layering_ok():915<info> Need core provider, skipping ofi_rxm
libfabric:7532:tcp:core:ofi_get_list_of_addr():1255<info> Available addr: 10.0.2.15, iface name: eth1, speed: 1000000000
libfabric:7532:tcp:core:ofi_get_list_of_addr():1255<info> Available addr: fe80::6166:bdf8:dbc8:9a1, iface name: eth0, speed: 1000000000
libfabric:7532:tcp:core:ofi_insert_loopback_addr():1100<info> available addr: : fi_sockaddr_in://127.0.0.1:0
libfabric:7532:tcp:core:ofi_insert_loopback_addr():1114<info> available addr: : fi_sockaddr_in6://[::1]:0
libfabric:7532:tcp:core:util_getinfo_ifs():312<info> Chosen addr for using: 10.0.2.15, speed 1000000000
libfabric:7532:core:core:fi_getinfo():1066<info> Found provider with the highest priority tcp, must_use_util_prov = 1
libfabric:7532:core:core:ofi_layering_ok():915<info> Need core provider, skipping ofi_rxm
libfabric:7532:tcp:core:ofi_get_list_of_addr():1255<info> Available addr: 10.0.2.15, iface name: eth1, speed: 1000000000
libfabric:7532:tcp:core:ofi_get_list_of_addr():1255<info> Available addr: fe80::6166:bdf8:dbc8:9a1, iface name: eth0, speed: 1000000000
libfabric:7532:tcp:core:ofi_insert_loopback_addr():1100<info> available addr: : fi_sockaddr_in://127.0.0.1:0
libfabric:7532:tcp:core:ofi_insert_loopback_addr():1114<info> available addr: : fi_sockaddr_in6://[::1]:0
libfabric:7532:tcp:core:util_getinfo_ifs():312<info> Chosen addr for using: 10.0.2.15, speed 1000000000
libfabric:7532:tcp:core:ofi_check_ep_type():654<info> unsupported endpoint type
libfabric:7532:tcp:core:ofi_check_ep_type():655<info> Supported: FI_EP_MSG
libfabric:7532:tcp:core:ofi_check_ep_type():655<info> Requested: FI_EP_RDM
libfabric:7532:core:core:fi_getinfo():1129<info> Now it is being used by tcp provider
libfabric:7532:core:core:fi_getinfo():1066<info> Found provider with the highest priority tcp, must_use_util_prov = 1
libfabric:7532:tcp:core:ofi_get_list_of_addr():1255<info> Available addr: 10.0.2.15, iface name: eth1, speed: 1000000000
libfabric:7532:tcp:core:ofi_get_list_of_addr():1255<info> Available addr: fe80::6166:bdf8:dbc8:9a1, iface name: eth0, speed: 1000000000
libfabric:7532:tcp:core:ofi_insert_loopback_addr():1100<info> available addr: : fi_sockaddr_in://127.0.0.1:0
libfabric:7532:tcp:core:ofi_insert_loopback_addr():1114<info> available addr: :Time: 21.775112 seconds
fi_sockaddr_in6://[::1]:0
libfabric:2492:tcp:core:util_getinfo_ifs():312<info> Chosen addr for using: 10.0.2.15, speed 1000000000
libfabric:2492:core:core:fi_fabric():1346<info> Opened fabric: 10.0.2.0/24
libfabric:2492:core:core:fi_fabric():1346<info> Opened fabric: 10.0.2.0/24
libfabric:2492:core:core:fi_getinfo():1066<info> Found provider with the highest priority tcp, must_use_util_prov = 1
libfabric:2492:tcp:core:ofi_get_list_of_addr():1255<info> Available addr: 10.0.2.15, iface name: eth1, speed: 1000000000
libfabric:2492:tcp:core:ofi_get_list_of_addr():1255<info> Available addr: fe80::6166:bdf8:dbc8:9a1, iface name: eth0, speed: 1000000000
libfabric:2492:tcp:core:ofi_insert_loopback_addr():1100<info> available addr: : fi_sockaddr_in://127.0.0.1:0
libfabric:2492:tcp:core:ofi_insert_loopback_addr():1114<info> available addr: : fi_sockaddr_in6://[::1]:0
libfabric:2492:tcp:core:ofi_check_rx_attr():782<info> Tx only caps ignored in Rx caps
libfabric:2492:tcp:core:ofi_check_tx_attr():880<info> Rx only caps ignored in Tx caps
libfabric:2492:ofi_rxm:av:util_verify_av_attr():474<warn> Shared AV is unsupported
libfabric:2492:ofi_rxm:av:util_av_init():446<info> AV size 1024
libfabric:2492:ofi_rxm:core:ofi_check_fabric_attr():403<info> Requesting provider verbs, skipping tcp;ofi_rxm
libfabric:2492:ofi_rxm:core:ofi_check_rx_attr():782<info> Tx only caps ignored in Rx caps
libfabric:2492:ofi_rxm:core:ofi_check_tx_attr():880<info> Rx only caps ignored in Tx caps
libfabric:2492:ofi_rxm:core:ofi_check_rx_attr():782<info> Tx only caps ignored in Rx caps
libfabric:2492:ofi_rxm:core:ofi_check_tx_attr():880<info> Rx only caps ignored in Tx caps
libfabric:2492:ofi_rxm:core:ofi_check_ep_attr():766<info> Tag size exceeds supported size
libfabric:2492:ofi_rxm:core:ofi_check_ep_attr():767<info> Supported: 6148914691236517205
libfabric:2492:ofi_rxm:core:ofi_check_ep_attr():767<info> Requested: -6148914691236517206
libfabric:2492:core:core:fi_getinfo():1066<info> Found provider with the highest priority tcp, must_use_util_prov = 1
libfabric:2492:tcp:core:ofi_get_list_of_addr():1255<info> Available addr: 10.0.2.15, iface name: eth1, speed: 1000000000
libfabric:2492:tcp:core:ofi_get_list_of_addr():1255<info> Available addr: fe80::6166:bdf8:dbc8:9a1, iface name: eth0, speed: 1000000000
libfabric:2492:tcp:core:ofi_insert_loopback_addr():1100<info> available addr: : fi_sockaddr_in://127.0.0.1:0
libfabric:2492:tcp:core:ofi_insert_loopback_addr():1114<info> available addr: : fi_sockaddr_in6://[::1]:0
libfabric:2492:tcp:core:ofi_check_rx_attr():782<info> Tx only caps ignored in Rx caps
libfabric:2492:tcp:core:ofi_check_tx_attr():880<info> Rx only caps ignored in Tx caps
libfabric:2492:ofi_rxm:core:rxm_ep_settings_init():2440<info> Settings:
MR local: MSG - 0, RxM - 0
Completions per progress: MSG - 1
Buffered min: 0
Min multi recv size: 16320
FI_EP_MSG provider inject size: 64
rxm inject size: 16320
Protocol limits: Eager: 16320, SAR: 131072
libfabric:2492:ofi_rxm:core:rxm_ep_setopt():587<info> FI_OPT_MIN_MULTI_RECV set to 16384
libfabric:2492:tcp:core:ofi_check_rx_attr():782<info> Tx only caps ignored in Rx caps
libfabric:2492:tcp:core:ofi_check_tx_attr():880<info> Rx only caps ignored in Tx caps
libfabric:2492:tcp:core:ofi_check_rx_attr():782<info> Tx only caps ignored in Rx caps
libfabric:2492:tcp:core:ofi_check_tx_attr():880<info> Rx only caps ignored in Tx caps
libfabric:2492:tcp:core:ofi_check_rx_attr():782<info> Tx only caps ignored in Rx caps
libfabric:2492:tcp:core:ofi_check_tx_attr():880<info> Rx only caps ignored in Tx caps
libfabric:2492:ofi_rxm:ep_ctrl:rxm_cmap_free():684<info> Closing cmap
libfabric:2492:ofi_rxm:ep_ctrl:rxm_cmap_cm_thread_close():658<info> stopping CM thread
libfabric:2492:tcp:fabric:ofi_wait_del_fd():220<info> Given fd (568) not found in wait list - 00000000000C8210
libfabric:2492:tcp:fabric:ofi_wait_del_fd():220<info> Given fd (560) not found in wait list - 00000000000C8210
libfabric:2492:tcp:fabric:ofi_wait_del_fd():220<info> Given fd (564) not found in wait list - 00000000000C8210
fi_sockaddr_in6://[::1]:0
libfabric:7532:tcp:core:util_getinfo_ifs():312<info> Chosen addr for using: 10.0.2.15, speed 1000000000
libfabric:7532:core:core:fi_fabric():1346<info> Opened fabric: 10.0.2.0/24
libfabric:7532:core:core:fi_fabric():1346<info> Opened fabric: 10.0.2.0/24
libfabric:7532:core:core:fi_getinfo():1066<info> Found provider with the highest priority tcp, must_use_util_prov = 1
libfabric:7532:tcp:core:ofi_get_list_of_addr():1255<info> Available addr: 10.0.2.15, iface name: eth1, speed: 1000000000
libfabric:7532:tcp:core:ofi_get_list_of_addr():1255<info> Available addr: fe80::6166:bdf8:dbc8:9a1, iface name: eth0, speed: 1000000000
libfabric:7532:tcp:core:ofi_insert_loopback_addr():1100<info> available addr: : fi_sockaddr_in://127.0.0.1:0
libfabric:7532:tcp:core:ofi_insert_loopback_addr():1114<info> available addr: : fi_sockaddr_in6://[::1]:0
libfabric:7532:tcp:core:ofi_check_rx_attr():782<info> Tx only caps ignored in Rx caps
libfabric:7532:tcp:core:ofi_check_tx_attr():880<info> Rx only caps ignored in Tx caps
libfabric:7532:ofi_rxm:av:util_verify_av_attr():474<warn> Shared AV is unsupported
libfabric:7532:ofi_rxm:av:util_av_init():446<info> AV size 1024
libfabric:7532:ofi_rxm:core:ofi_check_fabric_attr():403<info> Requesting provider verbs, skipping tcp;ofi_rxm
libfabric:7532:ofi_rxm:core:ofi_check_rx_attr():782<info> Tx only caps ignored in Rx caps
libfabric:7532:ofi_rxm:core:ofi_check_tx_attr():880<info> Rx only caps ignored in Tx caps
libfabric:7532:ofi_rxm:core:ofi_check_rx_attr():782<info> Tx only caps ignored in Rx caps
libfabric:7532:ofi_rxm:core:ofi_check_tx_attr():880<info> Rx only caps ignored in Tx caps
libfabric:7532:ofi_rxm:core:ofi_check_ep_attr():766<info> Tag size exceeds supported size
libfabric:7532:ofi_rxm:core:ofi_check_ep_attr():767<info> Supported: 6148914691236517205
libfabric:7532:ofi_rxm:core:ofi_check_ep_attr():767<info> Requested: -6148914691236517206
libfabric:7532:core:core:fi_getinfo():1066<info> Found provider with the highest priority tcp, must_use_util_prov = 1
libfabric:7532:tcp:core:ofi_get_list_of_addr():1255<info> Available addr: 10.0.2.15, iface name: eth1, speed: 1000000000
libfabric:7532:tcp:core:ofi_get_list_of_addr():1255<info> Available addr: fe80::6166:bdf8:dbc8:9a1, iface name: eth0, speed: 1000000000
libfabric:7532:tcp:core:ofi_insert_loopback_addr():1100<info> available addr: : fi_sockaddr_in://127.0.0.1:0
libfabric:7532:tcp:core:ofi_insert_loopback_addr():1114<info> available addr: : fi_sockaddr_in6://[::1]:0
libfabric:7532:tcp:core:ofi_check_rx_attr():782<info> Tx only caps ignored in Rx caps
libfabric:7532:tcp:core:ofi_check_tx_attr():880<info> Rx only caps ignored in Tx caps
libfabric:7532:ofi_rxm:core:rxm_ep_settings_init():2440<info> Settings:
MR local: MSG - 0, RxM - 0
Completions per progress: MSG - 1
Buffered min: 0
Min multi recv size: 16320
FI_EP_MSG provider inject size: 64
rxm inject size: 16320
Protocol limits: Eager: 16320, SAR: 131072
libfabric:7532:ofi_rxm:core:rxm_ep_setopt():587<info> FI_OPT_MIN_MULTI_RECV set to 16384
libfabric:7532:tcp:core:ofi_check_rx_attr():782<info> Tx only caps ignored in Rx caps
libfabric:7532:tcp:core:ofi_check_tx_attr():880<info> Rx only caps ignored in Tx caps
libfabric:7532:tcp:core:ofi_check_rx_attr():782<info> Tx only caps ignored in Rx caps
libfabric:7532:tcp:core:ofi_check_tx_attr():880<info> Rx only caps ignored in Tx caps
libfabric:7532:tcp:core:ofi_check_rx_attr():782<info> Tx only caps ignored in Rx caps
libfabric:7532:tcp:core:ofi_check_tx_attr():880<info> Rx only caps ignored in Tx caps
libfabric:7532:ofi_rxm:ep_ctrl:rxm_cmap_free():684<info> Closing cmap
libfabric:7532:ofi_rxm:ep_ctrl:rxm_cmap_cm_thread_close():658<info> stopping CM thread
libfabric:7532:tcp:fabric:ofi_wait_del_fd():220<info> Given fd (564) not found in wait list - 00000000001779C0
libfabric:7532:tcp:fabric:ofi_wait_del_fd():220<info> Given fd (580) not found in wait list - 00000000001779C0
libfabric:7532:tcp:fabric:ofi_wait_del_fd():220<info> Given fd (576) not found in wait list - 00000000001779C0
Thank you for your help.
Best Regards,
Jason
Hi Jason,
Thanks for reporting your findings. Which NIC do you have on your system? If you are using IB cards, how is IPoIB configured (v4/v6/both)?
Many thanks,
Amar
Dear Amar,
I have the following NIC:
description: Ethernet interface
product: RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller
vendor: Realtek Semiconductor Co., Ltd.
physical id: 0
bus info: pci@0000:03:00.0
logical name: enp3s0
version: 0c
serial: 1c:1b:0d:7c:44:9e
size: 1Gbit/s
capacity: 1Gbit/s
width: 64 bits
clock: 33MHz
capabilities: pm msi pciexpress msix vpd bus_master cap_list ethernet physical tp mii 10bt 10bt-fd 100bt 100bt-fd 1000bt 1000bt-fd autonegotiation
configuration: autonegotiation=on broadcast=yes driver=r8169 driverversion=2.3LK-NAPI duplex=full firmware=rtl8168g-2_0.0.1 02/06/13 ip=10.0.0.6 latency=0 link=yes multicast=yes port=MII speed=1Gbit/s
It has a static IPv4 address.
We do not have IB cards, since we are not running on multiple nodes yet.
Best Regards,
Jason