Intel® MPI Library

InfiniBand - Intel MPI Performance

jriocaton_es
Beginner
Dear colleagues,

I have new doubts while testing InfiniBand performance: the results are the same over both Gigabit and InfiniBand. Could you please help me? Thanks.

--> Infiniband

[root@cn035 ~]# time -p /soft/intel/impi/3.2.1.009/bin64/mpiexec -np 16 -env I_MPI_DEBUG 2 -env I_MPI_DEVICE rdma /home/c/OMB-3.1.1/osu_mbw_mr
[0] MPI startup(): DAPL provider ofa-v2-ib0 specified in DAPL configuration file /etc/dat.conf
[8] MPI startup(): DAPL provider ofa-v2-ib0 specified in DAPL configuration file /etc/dat.conf
[1] MPI startup(): DAPL provider ofa-v2-ib0 specified in DAPL configuration file /etc/dat.conf
[7] MPI startup(): DAPL provider ofa-v2-ib0 specified in DAPL configuration file /etc/dat.conf
[2] MPI startup(): DAPL provider ofa-v2-ib0 specified in DAPL configuration file /etc/dat.conf
[5] MPI startup(): DAPL provider ofa-v2-ib0 specified in DAPL configuration file /etc/dat.conf
[11] MPI startup(): DAPL provider ofa-v2-ib0 specified in DAPL configuration file /etc/dat.conf
[3] MPI startup(): DAPL provider ofa-v2-ib0 specified in DAPL configuration file /etc/dat.conf
[15] MPI startup(): DAPL provider ofa-v2-ib0 specified in DAPL configuration file /etc/dat.conf
[13] MPI startup(): DAPL provider ofa-v2-ib0 specified in DAPL configuration file /etc/dat.conf
[4] MPI startup(): DAPL provider ofa-v2-ib0 specified in DAPL configuration file /etc/dat.conf
[6] MPI startup(): DAPL provider ofa-v2-ib0 specified in DAPL configuration file /etc/dat.conf
[14] MPI startup(): DAPL provider ofa-v2-ib0 specified in DAPL configuration file /etc/dat.conf
[9] MPI startup(): DAPL provider ofa-v2-ib0 specified in DAPL configuration file /etc/dat.conf
[12] MPI startup(): DAPL provider ofa-v2-ib0 specified in DAPL configuration file /etc/dat.conf
[10] MPI startup(): DAPL provider ofa-v2-ib0 specified in DAPL configuration file /etc/dat.conf
[0] MPI startup(): RDMA data transfer mode
[2] MPI startup(): RDMA data transfer mode
[5] MPI startup(): RDMA data transfer mode
[1] MPI startup(): RDMA data transfer mode
[4] MPI startup(): RDMA data transfer mode
[6] MPI startup(): RDMA data transfer mode
[10] MPI startup(): RDMA data transfer mode
[11] MPI startup(): RDMA data transfer mode
[12] MPI startup(): RDMA data transfer mode
[9] MPI startup(): RDMA data transfer mode
[13] MPI startup(): RDMA data transfer mode
[7] MPI startup(): RDMA data transfer mode
[3] MPI startup(): RDMA data transfer mode
[8] MPI startup(): RDMA data transfer mode
[15] MPI startup(): RDMA data transfer mode
[14] MPI startup(): RDMA data transfer mode
[8] MPI Startup(): process is pinned to CPU00 on node cn038
[6] MPI Startup(): process is pinned to CPU03 on node cn035
[9] MPI Startup(): process is pinned to CPU04 on node cn038
[10] MPI Startup(): process is pinned to CPU02 on node cn038
[7] MPI Startup(): process is pinned to CPU07 on node cn035
[11] MPI Startup(): process is pinned to CPU06 on node cn038
[0] MPI Startup(): process is pinned to CPU00 on node cn035
# OSU MPI Multiple Bandwidth / Message Rate Test v3.1.1
# [ pairs: 8 ] [ window size: 64 ]
# Size MB/s Messages/s
[12] MPI Startup(): process is pinned to CPU01 on node cn038
[5] MPI Startup(): process is pinned to CPU05 on node cn035
[1] MPI Startup(): process is pinned to CPU04 on node cn035
[13] MPI Startup(): process is pinned to CPU05 on node cn038
[14] MPI Startup(): process is pinned to CPU03 on node cn038
[4] MPI Startup(): process is pinned to CPU01 on node cn035
[3] MPI Startup(): process is pinned to CPU06 on node cn035
[2] MPI Startup(): process is pinned to CPU02 on node cn035
[15] MPI Startup(): process is pinned to CPU07 on node cn038
1 8.60 8600599.34
2 16.82 8409961.42
4 31.48 7870853.42
8 33.62 4202594.27
16 102.57 6410590.31
32 236.60 7393643.13
64 473.13 7392625.04
128 712.43 5565880.43
256 556.57 2174094.57
512 779.38 1522217.56
1024 1011.18 987485.01
2048 1129.40 551462.61
4096 1241.51 303104.13
8192 1332.10 162610.42
16384 1323.28 80766.63
32768 1416.24 43220.23
65536 1419.74 21663.56
131072 1428.36 10897.51
262144 1434.37 5471.68
524288 1434.59 2736.27
1048576 1433.45 1367.05
2097152 1404.73 669.83
4194304 1404.14 334.77

--> Gigabit

[root@cn035 ~]# time -p /soft/intel/impi/3.2.1.009/bin64/mpiexec -np 16 -env I_MPI_DEBUG 2 /home/c/OMB-3.1.1/osu_mbw_mr
[6] MPI startup(): DAPL provider ofa-v2-ib0 specified in DAPL configuration file /etc/dat.conf
[5] MPI startup(): DAPL provider ofa-v2-ib0 specified in DAPL configuration file /etc/dat.conf
[0] MPI startup(): DAPL provider ofa-v2-ib0 specified in DAPL configuration file /etc/dat.conf
[1] MPI startup(): DAPL provider ofa-v2-ib0 specified in DAPL configuration file /etc/dat.conf
[10] MPI startup(): DAPL provider ofa-v2-ib0 specified in DAPL configuration file /etc/dat.conf
[7] MPI startup(): DAPL provider ofa-v2-ib0 specified in DAPL configuration file /etc/dat.conf
[4] MPI startup(): DAPL provider ofa-v2-ib0 specified in DAPL configuration file /etc/dat.conf
[2] MPI startup(): DAPL provider ofa-v2-ib0 specified in DAPL configuration file /etc/dat.conf
[3] MPI startup(): DAPL provider ofa-v2-ib0 specified in DAPL configuration file /etc/dat.conf
[13] MPI startup(): DAPL provider ofa-v2-ib0 specified in DAPL configuration file /etc/dat.conf
[8] MPI startup(): DAPL provider ofa-v2-ib0 specified in DAPL configuration file /etc/dat.conf
[9] MPI startup(): DAPL provider ofa-v2-ib0 specified in DAPL configuration file /etc/dat.conf
[11] MPI startup(): DAPL provider ofa-v2-ib0 specified in DAPL configuration file /etc/dat.conf
[12] MPI startup(): DAPL provider ofa-v2-ib0 specified in DAPL configuration file /etc/dat.conf
[14] MPI startup(): DAPL provider ofa-v2-ib0 specified in DAPL configuration file /etc/dat.conf
[15] MPI startup(): DAPL provider ofa-v2-ib0 specified in DAPL configuration file /etc/dat.conf
[0] MPI startup(): RDMA, shared memory, and socket data transfer modes
[1] MPI startup(): RDMA, shared memory, and socket data transfer modes
[2] MPI startup(): RDMA, shared memory, and socket data transfer modes
[3] MPI startup(): RDMA, shared memory, and socket data transfer modes
[4] MPI startup(): RDMA, shared memory, and socket data transfer modes
[5] MPI startup(): RDMA, shared memory, and socket data transfer modes
[7] MPI startup(): RDMA, shared memory, and socket data transfer modes
[8] MPI startup(): RDMA, shared memory, and socket data transfer modes
[9] MPI startup(): RDMA, shared memory, and socket data transfer modes
[10] MPI startup(): RDMA, shared memory, and socket data transfer modes
[11] MPI startup(): RDMA, shared memory, and socket data transfer modes
[12] MPI startup(): RDMA, shared memory, and socket data transfer modes
[13] MPI startup(): RDMA, shared memory, and socket data transfer modes
[6] MPI startup(): RDMA, shared memory, and socket data transfer modes
[14] MPI startup(): RDMA, shared memory, and socket data transfer modes
[15] MPI startup(): RDMA, shared memory, and socket data transfer modes
[1] MPI Startup(): process is pinned to CPU04 on node cn035
[0] MPI Startup(): process is pinned to CPU00 on node cn035
# OSU MPI Multiple Bandwidth / Message Rate Test v3.1.1
# [ pairs: 8 ] [ window size: 64 ]
# Size MB/s Messages/s
[2] MPI Startup(): process is pinned to CPU02 on node cn035
[7] MPI Startup(): process is pinned to CPU07 on node cn035
[11] MPI Startup(): process is pinned to CPU06 on node cn038
[3] MPI Startup(): process is pinned to CPU06 on node cn035
[6] MPI Startup(): process is pinned to CPU03 on node cn035
[10] MPI Startup(): process is pinned to CPU02 on node cn038
[13] MPI Startup(): process is pinned to CPU05 on node cn038
[5] MPI Startup(): process is pinned to CPU05 on node cn035
[14] MPI Startup(): process is pinned to CPU03 on node cn038
[4] MPI Startup(): process is pinned to CPU01 on node cn035
[8] MPI Startup(): process is pinned to CPU00 on node cn038
[12] MPI Startup(): process is pinned to CPU01 on node cn038
[9] MPI Startup(): process is pinned to CPU04 on node cn038
[15] MPI Startup(): process is pinned to CPU07 on node cn038
1 8.44 8436060.84
2 15.39 7695694.85
4 29.90 7474447.98
8 66.10 8262094.68
16 102.29 6392842.49
32 236.52 7391352.82
64 473.65 7400777.64
128 712.73 5568189.51
256 555.32 2169219.22
512 777.95 1519428.06
1024 1005.73 982160.29
2048 1131.15 552319.28
4096 1240.71 302906.61
8192 1328.67 162191.63
16384 1372.56 83774.33
32768 1407.69 42959.17
65536 1421.51 21690.46
131072 1427.63 10891.95
262144 1433.66 5468.99
524288 1434.69 2736.46
1048576 1433.19 1366.80
2097152 1404.70 669.81
4194304 1404.09 334.76

--> /etc/dat.conf
[root@cn035 ~]# more /etc/dat.conf
ofa-v2-ib0 u2.0 nonthreadsafe default libdaplofa.so.2 dapl.2.0 "ib0 0" ""
ofa-v2-ib1 u2.0 nonthreadsafe default libdaplofa.so.2 dapl.2.0 "ib1 0" ""
ofa-v2-mthca0-1 u2.0 nonthreadsafe default libdaploscm.so.2 dapl.2.0 "mthca0 1" ""
ofa-v2-mthca0-2 u2.0 nonthreadsafe default libdaploscm.so.2 dapl.2.0 "mthca0 2" ""
ofa-v2-mlx4_0-1 u2.0 nonthreadsafe default libdaploscm.so.2 dapl.2.0 "mlx4_0 1" ""
ofa-v2-mlx4_0-2 u2.0 nonthreadsafe default libdaploscm.so.2 dapl.2.0 "mlx4_0 2" ""
ofa-v2-ipath0-1 u2.0 nonthreadsafe default libdaploscm.so.2 dapl.2.0 "ipath0 1" ""
ofa-v2-ipath0-2 u2.0 nonthreadsafe default libdaploscm.so.2 dapl.2.0 "ipath0 2" ""
ofa-v2-ehca0-1 u2.0 nonthreadsafe default libdaploscm.so.2 dapl.2.0 "ehca0 1" ""
ofa-v2-iwarp u2.0 nonthreadsafe default libdaplofa.so.2 dapl.2.0 "eth2 0" ""
OpenIB-cma u1.2 nonthreadsafe default libdaplcma.so.1 dapl.1.2 "ib0 0" ""
OpenIB-cma-1 u1.2 nonthreadsafe default libdaplcma.so.1 dapl.1.2 "ib1 0" ""
OpenIB-mthca0-1 u1.2 nonthreadsafe default libdaplscm.so.1 dapl.1.2 "mthca0 1" ""
OpenIB-mthca0-2 u1.2 nonthreadsafe default libdaplscm.so.1 dapl.1.2 "mthca0 2" ""
OpenIB-mlx4_0-1 u1.2 nonthreadsafe default libdaplscm.so.1 dapl.1.2 "mlx4_0 1" ""
OpenIB-mlx4_0-2 u1.2 nonthreadsafe default libdaplscm.so.1 dapl.1.2 "mlx4_0 2" ""
OpenIB-ipath0-1 u1.2 nonthreadsafe default libdaplscm.so.1 dapl.1.2 "ipath0 1" ""
OpenIB-ipath0-2 u1.2 nonthreadsafe default libdaplscm.so.1 dapl.1.2 "ipath0 2" ""
OpenIB-ehca0-1 u1.2 nonthreadsafe default libdaplscm.so.1 dapl.1.2 "ehca0 1" ""
OpenIB-iwarp u1.2 nonthreadsafe default libdaplcma.so.1 dapl.1.2 "eth2 0" ""
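For reference, each line in dat.conf follows the uDAPL static registry format. A breakdown of the first entry (the annotations are mine, not part of the original file):

ofa-v2-ib0        # interface adapter (IA) name; the value Intel MPI reports at startup
u2.0              # uDAPL API version (2.0)
nonthreadsafe     # thread-safety level of the provider
default           # default flag
libdaplofa.so.2   # provider shared library to load
dapl.2.0          # provider version
"ib0 0"           # IA-specific parameters: device name and port
""                # platform-specific parameters (unused here)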




TimP
Honored Contributor III
According to what you have posted, your installation defaults to rdssm, as expected for the version you have chosen. If you wish to test a non-RDMA device path, you must specify it explicitly.
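For example, the DAPL-based devices also accept an explicit provider from /etc/dat.conf via the documented <device>[:<provider>] form. A sketch reusing the paths from your post:

# InfiniBand via a specific DAPL provider from /etc/dat.conf
time -p /soft/intel/impi/3.2.1.009/bin64/mpiexec -np 16 -env I_MPI_DEBUG 2 -env I_MPI_DEVICE rdma:ofa-v2-ib0 /home/c/OMB-3.1.1/osu_mbw_mr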
Gergana_S_Intel
Employee

Hello Julio,

As Tim mentions, you're actually running the application over InfiniBand* in both instances. A few releases ago, the Intel MPI Library changed its defaults to use the fastest available network on the cluster at startup (which, in your case, is InfiniBand).

Below, it seems like you specify the RDMA device in your IB run, but don't specify a device in your GigE run (which would default to IB again):

Quoting - jriocaton.es
--> Infiniband

[root@cn035 ~]# time -p /soft/intel/impi/3.2.1.009/bin64/mpiexec -np 16 -env I_MPI_DEBUG 2 -env I_MPI_DEVICE rdma /home/c/OMB-3.1.1/osu_mbw_mr
[0] MPI startup(): DAPL provider ofa-v2-ib0 specified in DAPL configuration file /etc/dat.conf
...
[0] MPI startup(): RDMA data transfer mode

--> Gigabit

[root@cn035 ~]# time -p /soft/intel/impi/3.2.1.009/bin64/mpiexec -np 16 -env I_MPI_DEBUG 2 /home/c/OMB-3.1.1/osu_mbw_mr
[0] MPI startup(): DAPL provider ofa-v2-ib0 specified in DAPL configuration file /etc/dat.conf
...
[0] MPI startup(): RDMA, shared memory, and socket data transfer modes

Since the default is the RDSSM device, your two runs are identical. If you'd like to run over GigE, you need to specify the "sock" device. Alternatively, you can use "ssm" for GigE communication between the nodes, and shared memory within a node. That means your command line would look like this:

# time -p /soft/intel/impi/3.2.1.009/bin64/mpiexec -np 16 -env I_MPI_DEBUG 2 -env I_MPI_DEVICE ssm /home/c/OMB-3.1.1/osu_mbw_mr
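If you want TCP sockets even for intranode traffic, the "sock" device works the same way (a sketch, swapping only the device name):

# time -p /soft/intel/impi/3.2.1.009/bin64/mpiexec -np 16 -env I_MPI_DEBUG 2 -env I_MPI_DEVICE sock /home/c/OMB-3.1.1/osu_mbw_mr

Either way, the I_MPI_DEBUG output should no longer show the "RDMA data transfer mode" lines from your original runs, which is a quick way to confirm the benchmark is really using GigE.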
I hope this helps.

Regards,
~Gergana

jriocaton_es
Beginner
I am sorry for the delay; I've been out of the office.

Thanks a lot for your help, I'll try your recommendations.