Intel® MPI Library
MPI/IB Jobstart hanging

bwiegers
Beginner
Hello to All,

I'm trying to start an MPI job on the following setup:
SLES 10 SP2, Intel MPI 3.0.1, OFED 1.2.5.4

When I run a job with these environment variables:

export I_MPI_DEVICE=rdssm:OpenIB-cma
export I_MPI_DEBUG=256

My job tries to start, but then hangs while setting up the communication (see the output below).

dapltest, on the other hand, works.

inodeXYZ is the IPoIB address of a node.

What's wrong?

Bert


running mpdallexit on inode052
LAUNCHED mpd on inode052 via
RUNNING: mpd on inode052
LAUNCHED mpd on inode022 via inode052
RUNNING: mpd on inode022
I_MPI: [0] check_one_device(): attributes for device:
I_MPI: [0] check_one_device(): NEEDS_LDAT MAYBE
I_MPI: [0] check_one_device(): HAS_COLLECTIVES (null)
I_MPI: [0] check_one_device(): I_MPI_LIBRARY_VERSION 3.0
I_MPI: [0] check_one_device(): I_MPI_VERSION_DATE_OF_BUILD Wed Apr 11 18:33:38 MSD 2007
I_MPI: [0] check_one_device(): I_MPI_VERSION_PKGNAME_UNTARRED mpi_src.32.svsmpi004.20070411
I_MPI: [0] check_one_device(): I_MPI_VERSION_MY_CMD_NAME_CVS_ID ./BUILD_MPI.sh version: BUILD_MPI.sh,v 1.77 2007/03/30 09:59:39 Exp $
I_MPI: [0] check_one_device(): I_MPI_VERSION_MY_CMD_LINE ./BUILD_MPI.sh -pkg_name mpi_src.32.svsmpi004.20070411.tar.gz -explode -explode_dirname mpi2.32e.svsmpi020.20070411 -all -copyout -noinstall
I_MPI: [0] check_one_device(): I_MPI_VERSION_MACHINENAME svsmpi020
I_MPI: [0] check_one_device(): I_MPI_DEVICE_VERSION 3.0.20070411
I_MPI: [0] check_one_device(): I_MPI_GCC_VERSION 3.4.4 20050721 (Red Hat 3.4.4-2)
I_MPI: [0] set_up_devices(): I_MPI_DAPL_PROVIDER = NULL
I_MPI: [0] set_up_devices(): I_MPI_DAPL_HOST_SUFFIX = NULL
I_MPI: [0] set_up_devices(): I_MPI_DAPL_HOST = NULL
I_MPI: [0] set_up_devices(): I_MPI_DAPL_IP_ADDR = NULL
I_MPI: [0] set_up_devices(): I_MPI_DAPL_PORT = NULL
[... the block above is repeated, partly interleaved, by each of the other 15 processes ...]
I_MPI: [7] I_MPI_dlopen_dat(): trying to dlopen default -ldat: libdat.so
I_MPI: [7] my_dlopen(): trying to dlopen: libdat.so
I_MPI: [3] I_MPI_dlopen_dat(): trying to dlopen default -ldat: libdat.so
I_MPI: [3] my_dlopen(): trying to dlopen: libdat.so
I_MPI: [1] I_MPI_dlopen_dat(): trying to dlopen default -ldat: libdat.so
I_MPI: [1] my_dlopen(): trying to dlopen: libdat.so
I_MPI: [11] I_MPI_dlopen_dat(): trying to dlopen default -ldat: libdat.so
I_MPI: [11] my_dlopen(): trying to dlopen: libdat.so
I_MPI: [5] I_MPI_dlopen_dat(): trying to dlopen default -ldat: libdat.so
I_MPI: [5] my_dlopen(): trying to dlopen: libdat.so
I_MPI: [0] I_MPI_dlopen_dat(): trying to dlopen default -ldat: libdat.so
I_MPI: [0] my_dlopen(): trying to dlopen: libdat.so
I_MPI: [13] I_MPI_dlopen_dat(): trying to dlopen default -ldat: libdat.so
I_MPI: [13] my_dlopen(): trying to dlopen: libdat.so
I_MPI: [2] I_MPI_dlopen_dat(): trying to dlopen default -ldat: libdat.so
I_MPI: [2] my_dlopen(): trying to dlopen: libdat.so
I_MPI: [10] I_MPI_dlopen_dat(): trying to dlopen default -ldat: libdat.so
I_MPI: [10] my_dlopen(): trying to dlopen: libdat.so
I_MPI: [4] I_MPI_dlopen_dat(): trying to dlopen default -ldat: libdat.so
I_MPI: [4] my_dlopen(): trying to dlopen: libdat.so
I_MPI: [15] I_MPI_dlopen_dat(): trying to dlopen default -ldat: libdat.so
I_MPI: [15] my_dlopen(): trying to dlopen: libdat.so
I_MPI: [12] I_MPI_dlopen_dat(): trying to dlopen default -ldat: libdat.so
I_MPI: [12] my_dlopen(): trying to dlopen: libdat.so
I_MPI: [8] I_MPI_dlopen_dat(): trying to dlopen default -ldat: libdat.so
I_MPI: [8] my_dlopen(): trying to dlopen: libdat.so
I_MPI: [14] I_MPI_dlopen_dat(): trying to dlopen default -ldat: libdat.so
I_MPI: [14] my_dlopen(): trying to dlopen: libdat.so
I_MPI: [9] I_MPI_dlopen_dat(): trying to dlopen default -ldat: libdat.so
I_MPI: [9] my_dlopen(): trying to dlopen: libdat.so
I_MPI: [6] I_MPI_dlopen_dat(): trying to dlopen default -ldat: libdat.so
I_MPI: [6] my_dlopen(): trying to dlopen: libdat.so
I_MPI: [10] MPIDI_CH3I_RDMA_init(): will use DAPL provider : OpenIB-cma
I_MPI: [5] MPIDI_CH3I_RDMA_init(): will use DAPL provider : OpenIB-cma
I_MPI: [7] MPIDI_CH3I_RDMA_init(): will use DAPL provider : OpenIB-cma
I_MPI: [11] MPIDI_CH3I_RDMA_init(): will use DAPL provider : OpenIB-cma
I_MPI: [3] MPIDI_CH3I_RDMA_init(): will use DAPL provider : OpenIB-cma
I_MPI: [1] MPIDI_CH3I_RDMA_init(): will use DAPL provider : OpenIB-cma
I_MPI: [13] MPIDI_CH3I_RDMA_init(): will use DAPL provider : OpenIB-cma
I_MPI: [14] MPIDI_CH3I_RDMA_init(): will use DAPL provider : OpenIB-cma
I_MPI: [2] MPIDI_CH3I_RDMA_init(): will use DAPL provider : OpenIB-cma
I_MPI: [15] MPIDI_CH3I_RDMA_init(): will use DAPL provider : OpenIB-cma
I_MPI: [4] MPIDI_CH3I_RDMA_init(): will use DAPL provider : OpenIB-cma
I_MPI: [9] MPIDI_CH3I_RDMA_init(): will use DAPL provider : OpenIB-cma
I_MPI: [12] MPIDI_CH3I_RDMA_init(): will use DAPL provider : OpenIB-cma
I_MPI: [0] MPIDI_CH3I_RDMA_init(): will use DAPL provider : OpenIB-cma
I_MPI: [6] MPIDI_CH3I_RDMA_init(): will use DAPL provider : OpenIB-cma
I_MPI: [8] MPIDI_CH3I_RDMA_init(): will use DAPL provider : OpenIB-cma
I_MPI: [0] MPIDI_CH3_Init(): will use rdssm configuration
I_MPI: [0] LIBRARY pinning(): The process is pinned on node052:CPU00
I_MPI: [0] MPI_Init: The process (pid=1070) started on node052
I_MPI: [1] MPIDI_CH3_Init(): will use rdssm configuration
I_MPI: [4] MPIDI_CH3_Init(): will use rdssm configuration
I_MPI: [4] LIBRARY pinning(): The process is pinned on node052:CPU02
I_MPI: [4] MPI_Init: The process (pid=1064) started on node052
I_MPI: [1] LIBRARY pinning(): The process is pinned on node022:CPU00
I_MPI: [1] MPI_Init: The process (pid=30385) started on node022
I_MPI: [2] MPIDI_CH3_Init(): will use rdssm configuration
I_MPI: [2] LIBRARY pinning(): The process is pinned on node052:CPU01
I_MPI: [2] MPI_Init: The process (pid=1063) started on node052
I_MPI: [5] MPIDI_CH3_Init(): will use rdssm configuration
I_MPI: [6] MPIDI_CH3_Init(): will use rdssm configuration
I_MPI: [6] LIBRARY pinning(): The process is pinned on node052:CPU03
I_MPI: [6] MPI_Init: The process (pid=1065) started on node052
I_MPI: [5] LIBRARY pinning(): The process is pinned on node022:CPU02
I_MPI: [5] MPI_Init: The process (pid=30387) started on node022
I_MPI: [3] MPIDI_CH3_Init(): will use rdssm configuration
I_MPI: [3] LIBRARY pinning(): The process is pinned on node022:CPU01
I_MPI: [3] MPI_Init: The process (pid=30386) started on node022
I_MPI: [7] MPIDI_CH3_Init(): will use rdssm configuration
I_MPI: [7] LIBRARY pinning(): The process is pinned on node022:CPU03
I_MPI: [7] MPI_Init: The process (pid=30388) started on node022
I_MPI: [8] MPIDI_CH3_Init(): will use rdssm configuration
I_MPI: [8] LIBRARY pinning(): The process is pinned on node052:CPU04
I_MPI: [8] MPI_Init: The process (pid=1066) started on node052
I_MPI: [9] MPIDI_CH3_Init(): will use rdssm configuration
I_MPI: [9] LIBRARY pinning(): The process is pinned on node022:CPU04
I_MPI: [9] MPI_Init: The process (pid=30389) started on node022
I_MPI: [10] MPIDI_CH3_Init(): will use rdssm configuration
I_MPI: [13] MPIDI_CH3_Init(): will use rdssm configuration
I_MPI: [10] LIBRARY pinning(): The process is pinned on node052:CPU05
I_MPI: [10] MPI_Init: The process (pid=1067) started on node052
I_MPI: [12] MPIDI_CH3_Init(): will use rdssm configuration
I_MPI: [12] LIBRARY pinning(): The process is pinned on node052:CPU06
I_MPI: [12] MPI_Init: The process (pid=1068) started on node052
I_MPI: [13] LIBRARY pinning(): The process is pinned on node022:CPU06
I_MPI: [13] MPI_Init: The process (pid=30391) started on node022
I_MPI: [11] MPIDI_CH3_Init(): will use rdssm configuration
I_MPI: [11] LIBRARY pinning(): The process is pinned on node022:CPU05
I_MPI: [11] MPI_Init: The process (pid=30390) started on node022
I_MPI: [14] MPIDI_CH3_Init(): will use rdssm configuration
I_MPI: [14] LIBRARY pinning(): The process is pinned on node052:CPU07
I_MPI: [14] MPI_Init: The process (pid=1069) started on node052
I_MPI: [15] MPIDI_CH3_Init(): will use rdssm configuration
I_MPI: [15] LIBRARY pinning(): The process is pinned on node022:CPU07
I_MPI: [15] MPI_Init: The process (pid=30392) started on node022
I_MPI: [1] MPIDI_CH3I_RDMA_wait_connect(): [inode022] rejecting CR from 0:inode052 because there is coupling of protocols
I_MPI: [5] MPIDI_CH3I_RDMA_wait_connect(): [inode022] rejecting CR from 4:inode052 because there is coupling of protocols
I_MPI: [7] MPIDI_CH3I_RDMA_wait_connect(): [inode022] rejecting CR from 6:inode052 because there is coupling of protocols
I_MPI: [13] MPIDI_CH3I_RDMA_wait_connect(): [inode022] rejecting CR from 12:inode052 because there is coupling of protocols
I_MPI: [3] MPIDI_CH3I_RDMA_wait_connect(): [inode022] rejecting CR from 2:inode052 because there is coupling of protocols
I_MPI: [15] MPIDI_CH3I_RDMA_wait_connect(): [inode022] rejecting CR from 14:inode052 because there is coupling of protocols
I_MPI: [11] MPIDI_CH3I_RDMA_wait_connect(): [inode022] rejecting CR from 10:inode052 because there is coupling of protocols
I_MPI: [9] MPIDI_CH3I_RDMA_wait_connect(): [inode022] rejecting CR from 8:inode052 because there is coupling of protocols
I_MPI: [6] MPIDI_CH3I_RDMA_wait_connect(): [inode052] rejecting CR from 5:inode022 because there is coupling of protocols
I_MPI: [14] MPIDI_CH3I_RDMA_wait_connect(): [inode052] rejecting CR from 13:inode022 because there is coupling of protocols
I_MPI: [2] MPIDI_CH3I_RDMA_wait_connect(): [inode052] rejecting CR from 1:inode022 because there is coupling of protocols
I_MPI: [4] MPIDI_CH3I_RDMA_wait_connect(): [inode052] rejecting CR from 3:inode022 because there is coupling of protocols
I_MPI: [8] MPIDI_CH3I_RDMA_wait_connect(): [inode052] rejecting CR from 7:inode022 because there is coupling of protocols
I_MPI: [10] MPIDI_CH3I_RDMA_wait_connect(): [inode052] rejecting CR from 9:inode022 because there is coupling of protocols
I_MPI: [12] MPIDI_CH3I_RDMA_wait_connect(): [inode052] rejecting CR from 11:inode022 because there is coupling of protocols
I_MPI: [15] MPIDI_CH3I_RDMA_wait_connect(): [inode022] rejecting CR from 0:inode052 because there is coupling of protocols
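
The rejection messages at the end of the output can be tallied per host pair to see whether connection requests are refused in both directions. The following is only a minimal sketch, assuming the debug output is fed in on standard input; the function name `summarize_rejects` and the file name `job.log` are made up for illustration.

```shell
# Hypothetical helper: reads I_MPI_DEBUG output on stdin and prints
# "<rejecting host> <- <requesting host> <count>" for each host pair.
summarize_rejects() {
  # Pull the two host names out of every "rejecting CR" message,
  # then count identical pairs and move the count to the end of the line.
  sed -n 's/.*\[\([^]]*\)\] rejecting CR from [0-9]*:\([^ ]*\) because.*/\1 <- \2/p' \
    | sort | uniq -c | awk '{ print $2, $3, $4, $1 }'
}

# Example call, assuming the output was saved to job.log:
# summarize_rejects < job.log
```

Lines that do not contain a rejection message are ignored, so the whole captured output can be piped in unchanged.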

4 Replies
Dmitry_K_Intel2
Employee
Hi Bert,

Could you try to disable dynamic connections (-genv I_MPI_USE_DYNAMIC_CONNECTIONS off)?

It's impossible to identify the reason for the hang from the output. I'd recommend updating OFED to version 1.4.2 (from openfabrics.org) and the Intel MPI Library to version 3.2.1.009.
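
As a rough sketch of how the switch for dynamic connections could be passed on the command line, assuming the job is started with mpiexec on an already running MPD ring: the process count of 16 matches the ranks seen in the debug output, while `./my_app` is a placeholder for the real executable.

```shell
# Placeholders (assumptions, not from the original post): adjust to your job.
NP=16
APP=./my_app

# Device and debug settings as given in the original post.
export I_MPI_DEVICE=rdssm:OpenIB-cma
export I_MPI_DEBUG=256

# -genv sets the variable for every rank; here it switches off
# dynamic connection establishment.
CMD="mpiexec -genv I_MPI_USE_DYNAMIC_CONNECTIONS off -n $NP $APP"

# Print the command for inspection; run it with: $CMD
echo "$CMD"
```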

Best wishes,
Dmitry

bwiegers
Beginner
Hi Dmitry,

Thanks for this.
I'd like to update the stacks, but that will take time...
With your switch the job starts with no error message, but it still hangs.

So I guess I'll have to wait for the update.

Bert

Dmitry_K_Intel2
Employee
Hi Bert,

Are you sure that there is no error in your application? You can try one of the test programs shipped in the test directory of the Intel MPI Library.
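
A hedged sketch of building and running such a test follows. It assumes that `I_MPI_ROOT` points at the installation, that the test directory contains a C source named `test.c`, and that an MPD ring is already running; the fallback path and the output name `mpi_test` are placeholders.

```shell
# Assumption: I_MPI_ROOT points at the Intel MPI installation;
# the fallback path below is only a placeholder.
MPI_ROOT=${I_MPI_ROOT:-/opt/intel/mpi/3.0}
# Assumption: the bundled C test is called test.c.
TEST_SRC="$MPI_ROOT/test/test.c"

# Build the test with the MPI compiler wrapper, then run it over the
# same device that hangs with the real application.
BUILD_CMD="mpicc -o ./mpi_test $TEST_SRC"
RUN_CMD="mpiexec -genv I_MPI_DEVICE rdssm:OpenIB-cma -n 16 ./mpi_test"

# Print both commands for inspection before running them by hand.
echo "$BUILD_CMD"
echo "$RUN_CMD"
```

If the test also hangs over InfiniBand, an error in the application becomes a less likely explanation.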

Best wishes,
Dmitry

bwiegers
Beginner
Not 100%.
But it's running fine with GbE.
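
To narrow this down, the same binary can be launched once per device and the results compared. This is only a sketch, assuming the GbE run used the `sock` device; the helper name `print_device_runs`, the process count and `./my_app` are placeholders.

```shell
# Hypothetical helper: prints one launch command per device so the two
# runs can be compared. The process count and executable are placeholders.
print_device_runs() {
  NP=${1:-16}
  APP=${2:-./my_app}
  # sock = TCP sockets (for example over GbE); rdssm:OpenIB-cma is the
  # device setting from the original post.
  for dev in sock rdssm:OpenIB-cma; do
    echo "mpiexec -genv I_MPI_DEVICE $dev -n $NP $APP"
  done
}

print_device_runs 16 ./my_app
```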

Bert