Intel® MPI Library

Intel MPI and InfiniBand uDAPL

Rene_S_1
Beginner
hi,

I am trying to use the Intel compilers and MPI libraries to run over
InfiniBand. From the documentation, and from all the searches I did on the
Intel forums, I could not figure out what the problem might be. We are
running a small test with 8 nodes connected via InfiniBand. I can ping all
the nodes and start up mpd on all of them via IP over IB (the mpdboot step
is sketched after the trace below):

hpcp5551(salmr0)192:mpdtrace
192.168.0.1
192.168.0.5
192.168.0.4
192.168.0.3
192.168.0.2
192.168.0.8
192.168.0.7
192.168.0.6
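
For reference, the mpd ring is brought up with something along these lines;
the mpd.hosts file name is just an example, listing the IPoIB hostnames of
the 8 nodes:

mpdboot -n 8 -f mpd.hosts -r ssh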

I can run fine using the "sock" network fabric or IP over IB:
hpcp5551(salmr0)193:mpiexec -genv I_MPI_DEVICE sock -n 8 ./cpi
Process 0 on 192.168.0.1
Process 2 on 192.168.0.4
Process 1 on 192.168.0.5
Process 3 on 192.168.0.3
Process 4 on 192.168.0.2
Process 5 on 192.168.0.8
Process 6 on 192.168.0.7
Process 7 on 192.168.0.6
pi is approximately 3.1416009869231245, Error is 0.0000083333333314
wall clock time = 0.007859

The problem comes when I try to run over native IB using the "rdma"
network fabric:

hpcp5551(salmr0)194:mpiexec -genv I_MPI_DEVICE rdma:OpenIB-cma -n 8 -env I_MPI_DEBUG 2 ./cpi
rank 4 in job 9 192.168.0.1_35933 caused collective abort of all ranks
exit status of rank 4: killed by signal 11
rank 1 in job 9 192.168.0.1_35933 caused collective abort of all ranks
exit status of rank 1: killed by signal 11
rank 0 in job 9 192.168.0.1_35933 caused collective abort of all ranks
exit status of rank 0: killed by signal 11

I have the correct entries in /etc/dat.conf:
hpcp5551:~ # tail /etc/dat.conf
# Simple (OpenIB-cma) default with netdev name provided first on list
# to enable use of same dat.conf version on all nodes
#
# Add examples for multiple interfaces and IPoIB HA fail over, and bonding
#
OpenIB-cma u1.2 nonthreadsafe default /usr/local/ofed/lib64/libdaplcma.so dapl.1.2 "ib0 0" ""
OpenIB-cma-1 u1.2 nonthreadsafe default /usr/local/ofed/lib64/libdaplcma.so dapl.1.2 "ib1 0" ""
OpenIB-cma-2 u1.2 nonthreadsafe default /usr/local/ofed/lib64/libdaplcma.so dapl.1.2 "ib2 0" ""
OpenIB-cma-3 u1.2 nonthreadsafe default /usr/local/ofed/lib64/libdaplcma.so dapl.1.2 "ib3 0" ""
OpenIB-bond u1.2 nonthreadsafe default /usr/local/ofed/lib64/libdaplcma.so dapl.1.2 "bond0 0" ""

hpcp5551:~ # ls -l /usr/local/ofed/lib64/libdaplcma.so
lrwxrwxrwx 1 root root 19 Jan 18 17:20 /usr/local/ofed/lib64/libdaplcma.so -> libdaplcma.so.1.0.2


hpcp5551:~ # ifconfig ib0
ib0       Link encap:UNSPEC  HWaddr 80-00-04-04-FE-80-00-00-00-00-00-00-00-00-00-00
          inet addr:192.168.0.1  Bcast:192.168.0.255  Mask:255.255.255.0
          inet6 addr: fe80::208:f104:398:2999/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:65520  Metric:1
          RX packets:851583 errors:0 dropped:0 overruns:0 frame:0
          TX packets:824427 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:128
          RX bytes:11834748000 (11286.4 Mb)  TX bytes:11786736324 (11240.7 Mb)

Is there any way to get more debug or verbose messages out of mpiexec or
mpirun so that it can maybe provide me with a hint as to what the problem
might be?

This is with OFED 1.2.5.4.

Thanks
Rene
TimP
Honored Contributor III
export I_MPI_DEBUG=2 (or whatever level of verbosity you want)
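
For example (the -genv form guarantees the setting is pushed to every rank):

export I_MPI_DEBUG=2
mpiexec -n 8 ./cpi

or equivalently:

mpiexec -genv I_MPI_DEBUG 2 -n 8 ./cpi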
Rene_S_1
Beginner
Thanks for the reply. I guess I should have mentioned that in my post.
I did try the I_MPI_DEBUG option at various levels but don't seem to get any more info than what I originally posted.

hpcp5551(salmr0)196:setenv I_MPI_DEBUG 2
hpcp5551(salmr0)197:mpiexec -genv I_MPI_DEVICE rdma:OpenIB-cma -n 8 ./cpi
rank 3 in job 11 192.168.0.1_35933 caused collective abort of all ranks
exit status of rank 3: killed by signal 11

hpcp5551(salmr0)198:setenv I_MPI_DEBUG 4
hpcp5551(salmr0)199:mpiexec -genv I_MPI_DEVICE rdma:OpenIB-cma -n 8 ./cpi
rank 3 in job 12 192.168.0.1_35933 caused collective abort of all ranks
exit status of rank 3: killed by signal 11


hpcp5551(salmr0)200:mpiexec -genv I_MPI_DEVICE rdma:OpenIB-cma -n 8 -env I_MPI_DEBUG 3 ./cpi
rank 0 in job 13 192.168.0.1_35933 caused collective abort of all ranks
exit status of rank 0: killed by signal 11


Any other ideas? Is there a way to check whether I have the right uDAPL libs installed, other than looking for /usr/local/ofed/lib64/libdaplcma.so?


Thanks
Rene
Andrey_D_Intel
Employee

Hi Rene,

Were you able to run the dapltest program on your cluster? Do I understand correctly that you did not get additional debug information even when cpi was linked against the debug version of the MPI library?
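
A quick two-node transaction check might look something like this (a sketch only; the exact option syntax varies a bit between OFED releases, and <server> stands for the IPoIB address of the server node):

on the server node:  dapltest -T S -D OpenIB-cma
on a client node:    dapltest -T T -s <server> -D OpenIB-cma -i 100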

Best regards,

Andrey

Rene_S_1
Beginner
Hi,

I guess I was not asking for enough debug info. I tried debug levels of 2, 3, and 4 and was getting nowhere. Once I increased to level 10 or above I got a bit more useful info.

I think we found the problem. We like to compile things statically here,
so we would typically do something like this:

hpcp5551(salmr0)77:mpicc -static cpi.c
hpcp5551(salmr0)108:ldd a.out
not a dynamic executable

and this works fine; we can run it anywhere over gigabit Ethernet or
using the sock interface over IB.

If we do the same and try to run over IB we get nowhere, as you can see
from the previous post.

But for some reason, if we compile with the "-static_mpi" flag, things
seem to work:

hpcp5551(salmr0)109:mpicc -static_mpi cpi.c
hpcp5551(salmr0)110:ldd a.out
librt.so.1 => /lib64/librt.so.1 (0x00002b666073b000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00002b6660844000)
libdl.so.2 => /lib64/libdl.so.2 (0x00002b666095a000)
libc.so.6 => /lib64/libc.so.6 (0x00002b6660a5f000)
/lib64/ld-linux-x86-64.so.2 (0x00002b666061e000)


hpcp5551(salmr0)111:mpiexec -genv I_MPI_DEVICE rdma:OpenIB-cma -np 2 -env I_MPI_DEBUG 10 a.out
[0] MPI startup(): DAPL provider OpenIB-cma
[1] MPI startup(): DAPL provider OpenIB-cma
[0] MPI startup(): RDMA data transfer mode
[0] MPI Startup(): process is pinned to CPU00 on node hpcp5551
[1] MPI startup(): RDMA data transfer mode
[1] MPI Startup(): process is pinned to CPU00 on node hpcp5555
Process 1 on 192.168.0.5
Process 0 on 192.168.0.1
[0] Rank Pid Pin cpu Node name
[0] 0 7515 0 hpcp5551
[0] 1 5192 0 hpcp5555
[0] Init(): I_MPI_DEBUG=10
[0] Init(): I_MPI_DEVICE=rdma
pi is approximately 3.1416009869231241, Error is 0.0000083333333309
wall clock time = 0.000111



The only problem is that the a.out executable is not really static: it
still needs some libs to be loaded dynamically. What are the flags or
options we need to generate a truly static executable that will run
over IB?

thanks
Rene
TimP
Honored Contributor III
mpicc -static should have the same effect as gcc -static in choosing static versions of libraries known to gcc. As you figured out, -static_mpi controls the choice of the Intel MPI libraries. Given your stated requirement, you would want to use both options.
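
In other words (a sketch, reusing the cpi.c example from this thread):

mpicc -static cpi.c               # static system/glibc libraries only
mpicc -static_mpi cpi.c           # static Intel MPI library only
mpicc -static -static_mpi cpi.c   # both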
Rene_S_1
Beginner
Hi,

thanks for the reply. Yes, I can compile using both flags just fine, but if I do that I can no longer run the executable over IB. Here is an example.

Compiling semi-statically using just -static_mpi works fine:
----------------------------------------------------------
hpcp5551(salmr0)140:mpicc -static_mpi cpi.c
hpcp5551(salmr0)141:ldd a.out
librt.so.1 => /lib64/librt.so.1 (0x00002b3805bbe000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00002b3805cc7000)
libdl.so.2 => /lib64/libdl.so.2 (0x00002b3805ddd000)
libc.so.6 => /lib64/libc.so.6 (0x00002b3805ee2000)
/lib64/ld-linux-x86-64.so.2 (0x00002b3805aa1000)
hpcp5551(salmr0)142:mpiexec -genv I_MPI_DEVICE rdma:OpenIB-cma -np 2 -env I_MPI_DEBUG 10 a.out
[0] MPI startup(): DAPL provider OpenIB-cma
[1] MPI startup(): DAPL provider OpenIB-cma
[0] MPI startup(): RDMA data transfer mode
[0] MPI Startup(): process is pinned to CPU00 on node hpcp5551
[1] MPI startup(): RDMA data transfer mode
[1] MPI Startup(): process is pinned to CPU00 on node hpcp5555
Process 1 on 192.168.0.5
[0] Rank Pid Pin cpu Node name
[0] 0 23443 0 hpcp5551
[0] 1 19241 0 hpcp5555
[0] Init(): I_MPI_DEBUG=10
[0] Init(): I_MPI_DEVICE=rdma
Process 0 on 192.168.0.1
pi is approximately 3.1416009869231241, Error is 0.0000083333333309
wall clock time = 0.000159


Now, compiling with both flags, -static_mpi and -static, the executable does not run:
--------------------------------------------------------------------------------------
hpcp5551(salmr0)144:mpicc -static_mpi -static cpi.c
/opt/intel/impi/3.1/lib64/libmpi.a(I_MPI_wrap_dat.o): In function `I_MPI_dlopen_dat':
I_MPI_wrap_dat.c:(.text+0x30f): warning: Using 'dlopen' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking
/opt/intel/impi/3.1/lib64/libmpi.a(rdma_iba_util.o): In function `get_addr_by_host_name':
rdma_iba_util.c:(.text+0x21a): warning: Using 'getaddrinfo' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking
/opt/intel/impi/3.1/lib64/libmpi.a(sock.o): In function `MPIDU_Sock_get_host_description':
sock.c:(.text+0x5956): warning: Using 'gethostbyaddr' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking
/opt/intel/impi/3.1/lib64/libmpi.a(simple_pmi.o): In function `PMII_Connect_to_pm':
simple_pmi.c:(.text+0x29a8): warning: Using 'gethostbyname' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking
hpcp5551(salmr0)145:
hpcp5551(salmr0)145:ldd a.out
not a dynamic executable
hpcp5551(salmr0)146:mpiexec -genv I_MPI_DEVICE rdma:OpenIB-cma -np 2 -env I_MPI_DEBUG 10 a.out
rank 1 in job 18 192.168.0.1_54412 caused collective abort of all ranks
exit status of rank 1: killed by signal 11


As you can see, the executable does not run when compiled statically. Here is more verbose output from debug=100:

hpcp5551(salmr0)147:mpiexec -genv I_MPI_DEVICE rdma:OpenIB-cma -np 2 -env I_MPI_DEBUG 100 a.out
[0] MPI startup(): attributes for device:
[0] MPI startup(): NEEDS_LDAT MAYBE
[0] MPI startup(): HAS_COLLECTIVES (null)
[0] MPI startup(): I_MPI_LIBRARY_VERSION 3.1
[0] MPI startup(): I_MPI_VERSION_DATE_OF_BUILD Fri Oct 5 15:41:02 MSD 2007
[0] MPI startup(): I_MPI_VERSION_PKGNAME_UNTARRED mpi_src.32.svsmpi004.20071005
[0] MPI startup(): I_MPI_VERSION_MY_CMD_NAME_CVS_ID ./BUILD_MPI.sh version: BUILD_MPI.sh,v 1.102 2007/09/13 07:41:42 Exp $
[0] MPI startup(): I_MPI_VERSION_MY_CMD_LINE ./BUILD_MPI.sh -pkg_name mpi_src.32.svsmpi004.20071005.tar.gz -explode -explode_dirname mpi2.32e.svsmpi020.20071005 -all -copyout -noinstall
[0] MPI startup(): I_MPI_VERSION_MACHINENAME svsmpi020
[0] MPI startup(): I_MPI_DEVICE_VERSION 3.1.20071005
[0] MPI startup(): I_MPI_GCC_VERSION 3.4.4 20050721 (Red Hat 3.4.4-2)
[0] MPI startup(): I_MPI_ICC_VERSION Version 9.1 Beta Build 20060131 Package ID: l_cc_bc_9.1.023
[0] MPI startup(): I_MPI_IFORT_VERSION Version 9.1 Beta Build 20060131 Package ID: l_fc_bc_9.1.020
[0] MPI startup(): attributes for device:
[0] MPI startup(): NEEDS_LDAT MAYBE
[0] MPI startup(): HAS_COLLECTIVES (null)
[0] MPI startup(): I_MPI_LIBRARY_VERSION 3.1
[0] MPI startup(): I_MPI_VERSION_DATE_OF_BUILD Fri Oct 5 15:41:02 MSD 2007
[0] MPI startup(): I_MPI_VERSION_PKGNAME_UNTARRED mpi_src.32.svsmpi004.20071005
[0] MPI startup(): I_MPI_VERSION_MY_CMD_NAME_CVS_ID ./BUILD_MPI.sh version: BUILD_MPI.sh,v 1.102 2007/09/13 07:41:42 Exp $
[0] MPI startup(): I_MPI_VERSION_MY_CMD_LINE ./BUILD_MPI.sh -pkg_name mpi_src.32.svsmpi004.20071005.tar.gz -explode -explode_dirname mpi2.32e.svsmpi020.20071005 -all -copyout -noinstall
[0] MPI startup(): I_MPI_VERSION_MACHINENAME svsmpi020
[0] MPI startup(): I_MPI_DEVICE_VERSION 3.1.20071005
[0] MPI startup(): I_MPI_GCC_VERSION 3.4.4 20050721 (Red Hat 3.4.4-2)
[0] MPI startup(): I_MPI_ICC_VERSION Version 9.1 Beta Build 20060131 Package ID: l_cc_bc_9.1.023
[0] MPI startup(): I_MPI_IFORT_VERSION Version 9.1 Beta Build 20060131 Package ID: l_fc_bc_9.1.020
[0] I_MPI_dlopen_dat(): trying to dlopen default -ldat: libdat.so
[0] my_dlopen(): trying to dlopen: libdat.so
rank 0 in job 19 192.168.0.1_54412 caused collective abort of all ranks
exit status of rank 0: killed by signal 11


Thanks
Rene
Andrey_D_Intel
Employee

Rene,

You cannot build a truly static executable that is guaranteed to run over IB. This is due to libc runtime limitations: there is a dlopen() call inside the MPI library (used to load the DAT library, which in turn loads the provider from /etc/dat.conf), and dlopen() requires that the shared glibc runtime used for linking also be present on the other cluster nodes. You probably saw the warning messages about this when you tried the mpicc -static option.
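
As a minimal illustration of the limitation (this is not Intel MPI code, just the same pattern the library uses to load the DAT library at run time):

/* dlopen_demo.c
 * Build with: gcc -static dlopen_demo.c -ldl
 * The link step emits the same "Using 'dlopen' in statically linked
 * applications..." glibc warning seen above, and the dlopen() call
 * still needs a compatible shared glibc present on every node. */
#include <dlfcn.h>
#include <stdio.h>

int main(void)
{
    /* The DAT library is loaded dynamically, even from a
     * statically linked executable. */
    void *handle = dlopen("libdat.so", RTLD_NOW);
    if (!handle) {
        fprintf(stderr, "dlopen failed: %s\n", dlerror());
        return 1;
    }
    dlclose(handle);
    return 0;
}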

Best regards,

Andrey
