To whom it may concern,
Hello. We are using Slurm to manage our cluster, and we have run into a new issue with Intel MPI under Slurm. After a node reboots, Intel MPI jobs fail on that node, but manually restarting the Slurm daemon fixes it. I also tried adding "service slurm restart" to /etc/rc.local, which runs at the end of the boot sequence, but the issue is still there.
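For reference, what I appended to /etc/rc.local is essentially just:

# /etc/rc.local runs at the end of the boot sequence;
# restart slurmd here so that it picks up the freshly loaded OFED stack
service slurm restart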
Moreover, I submitted this issue to slurm-dev, but they believed it was due to the InfiniBand + Intel MPI configuration. They suggested configuring dat.conf and setting some Intel MPI variables, but I don't know how to set them.
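If I understand their suggestion correctly, it would be something along these lines; the provider name here is only my guess, taken from one of the entries in our dat.conf quoted below:

$ export DAT_OVERRIDE=/etc/dat.conf              # uDAPL variable for a non-default dat.conf path
$ export I_MPI_FABRICS=shm:dapl                  # use the DAPL fabric instead of OFA
$ export I_MPI_DAPL_PROVIDER=ofa-v2-mlx4_0-1     # an <ia_name> entry from dat.conf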
Here is an example:
$ salloc -N1 -n12 -w cn117   # cn117 is the node just rebooted
salloc: Granted job allocation 1201
$ module list
Currently Loaded Modulefiles:
  1) modules   2) null   3) intelics/2013.1.039
$ export I_MPI_PMI_LIBRARY=/gpfs/slurm/lib/libpmi.so
$ export I_MPI_FABRICS=shm:ofa
$ srun ./hello
[3] MPI startup(): ofa fabric is not available and fallback fabric is not enabled
[4] MPI startup(): ofa fabric is not available and fallback fabric is not enabled
[5] MPI startup(): ofa fabric is not available and fallback fabric is not enabled
[6] MPI startup(): ofa fabric is not available and fallback fabric is not enabled
[7] MPI startup(): ofa fabric is not available and fallback fabric is not enabled
[8] MPI startup(): ofa fabric is not available and fallback fabric is not enabled
[10] MPI startup(): ofa fabric is not available and fallback fabric is not enabled
[11] MPI startup(): ofa fabric is not available and fallback fabric is not enabled
[0] MPI startup(): ofa fabric is not available and fallback fabric is not enabled
[9] MPI startup(): ofa fabric is not available and fallback fabric is not enabled
[1] MPI startup(): ofa fabric is not available and fallback fabric is not enabled
[2] MPI startup(): ofa fabric is not available and fallback fabric is not enabled
srun: error: cn117: tasks 0-11: Exited with exit code 254
srun: Terminating job step 1201.0
After restarting the slurm daemon:
$ ssh root@cn117
cn117$ service slurm restart
stopping slurmd: [ OK ]
slurmd is stopped
starting slurmd: [ OK ]
$ exit
$ salloc -N1 -n12 -w cn117
salloc: Granted job allocation 1203
$ export I_MPI_PMI_LIBRARY=/gpfs/slurm/lib/libpmi.so
$ export I_MPI_FABRICS=shm:ofa
$ srun ./hello
This is Process 9 out of 12 running on host cn117
This is Process 3 out of 12 running on host cn117
This is Process 2 out of 12 running on host cn117
This is Process 7 out of 12 running on host cn117
This is Process 6 out of 12 running on host cn117
This is Process 0 out of 12 running on host cn117
This is Process 5 out of 12 running on host cn117
This is Process 1 out of 12 running on host cn117
This is Process 4 out of 12 running on host cn117
This is Process 10 out of 12 running on host cn117
This is Process 8 out of 12 running on host cn117
This is Process 11 out of 12 running on host cn117
Here is the default dat.conf we have:
# DAT v2.0, v1.2 configuration file
#
# Each entry should have the following fields:
#
# <ia_name> <api_version> <threadsafety> <default> <lib_path> \
#           <provider_version> <ia_params> <platform_params>
#
# For uDAPL cma provider, <ia_params> is one of the following:
#           network address, network hostname, or netdev name and 0 for port
#
# For uDAPL scm provider, <ia_params> is device name and port
# For uDAPL ucm provider, <ia_params> is device name and port
# For uDAPL iWARP provider, <ia_params> is netdev device name and 0
# For uDAPL RoCE provider, <ia_params> is device name and 0
#
ofa-v2-mlx4_0-1 u2.0 nonthreadsafe default libdaploscm.so.2 dapl.2.0 "mlx4_0 1" ""
ofa-v2-mlx4_0-2 u2.0 nonthreadsafe default libdaploscm.so.2 dapl.2.0 "mlx4_0 2" ""
ofa-v2-ib0 u2.0 nonthreadsafe default libdaplofa.so.2 dapl.2.0 "ib0 0" ""
ofa-v2-ib1 u2.0 nonthreadsafe default libdaplofa.so.2 dapl.2.0 "ib1 0" ""
ofa-v2-mthca0-1 u2.0 nonthreadsafe default libdaploscm.so.2 dapl.2.0 "mthca0 1" ""
ofa-v2-mthca0-2 u2.0 nonthreadsafe default libdaploscm.so.2 dapl.2.0 "mthca0 2" ""
ofa-v2-ipath0-1 u2.0 nonthreadsafe default libdaploscm.so.2 dapl.2.0 "ipath0 1" ""
ofa-v2-ipath0-2 u2.0 nonthreadsafe default libdaploscm.so.2 dapl.2.0 "ipath0 2" ""
ofa-v2-ehca0-2 u2.0 nonthreadsafe default libdaploscm.so.2 dapl.2.0 "ehca0 1" ""
ofa-v2-iwarp u2.0 nonthreadsafe default libdaplofa.so.2 dapl.2.0 "eth2 0" ""
ofa-v2-mlx4_0-1u u2.0 nonthreadsafe default libdaploucm.so.2 dapl.2.0 "mlx4_0 1" ""
ofa-v2-mlx4_0-2u u2.0 nonthreadsafe default libdaploucm.so.2 dapl.2.0 "mlx4_0 2" ""
ofa-v2-mthca0-1u u2.0 nonthreadsafe default libdaploucm.so.2 dapl.2.0 "mthca0 1" ""
ofa-v2-mthca0-2u u2.0 nonthreadsafe default libdaploucm.so.2 dapl.2.0 "mthca0 2" ""
ofa-v2-cma-roe-eth2 u2.0 nonthreadsafe default libdaplofa.so.2 dapl.2.0 "eth2 0" ""
ofa-v2-cma-roe-eth3 u2.0 nonthreadsafe default libdaplofa.so.2 dapl.2.0 "eth3 0" ""
ofa-v2-scm-roe-mlx4_0-1 u2.0 nonthreadsafe default libdaploscm.so.2 dapl.2.0 "mlx4_0 1" ""
ofa-v2-scm-roe-mlx4_0-2 u2.0 nonthreadsafe default libdaploscm.so.2 dapl.2.0 "mlx4_0 2" ""
ofa-v2-mcm-1 u2.0 nonthreadsafe default libdaplomcm.so.2 dapl.2.0 "mlx4_0 1" ""
ofa-v2-mcm-2 u2.0 nonthreadsafe default libdaplomcm.so.2 dapl.2.0 "mlx4_0 2" ""
ofa-v2-scif0 u2.0 nonthreadsafe default libdaploscm.so.2 dapl.2.0 "scif0 1" ""
ofa-v2-scif0-u u2.0 nonthreadsafe default libdaploucm.so.2 dapl.2.0 "scif0 1" ""
ofa-v2-mic0 u2.0 nonthreadsafe default libdaplofa.so.2 dapl.2.0 "mic0:ib 1" ""
ofa-v2-mlx4_0-1s u2.0 nonthreadsafe default libdaploscm.so.2 dapl.2.0 "mlx4_0 1" ""
ofa-v2-mlx4_0-2s u2.0 nonthreadsafe default libdaploscm.so.2 dapl.2.0 "mlx4_0 2" ""
ofa-v2-mlx4_1-1s u2.0 nonthreadsafe default libdaploscm.so.2 dapl.2.0 "mlx4_1 1" ""
ofa-v2-mlx4_1-2s u2.0 nonthreadsafe default libdaploscm.so.2 dapl.2.0 "mlx4_1 2" ""
ofa-v2-mlx4_1-1u u2.0 nonthreadsafe default libdaploucm.so.2 dapl.2.0 "mlx4_1 1" ""
ofa-v2-mlx4_1-2u u2.0 nonthreadsafe default libdaploucm.so.2 dapl.2.0 "mlx4_1 2" ""
ofa-v2-mlx4_0-1m u2.0 nonthreadsafe default libdaplomcm.so.2 dapl.2.0 "mlx4_0 1" ""
ofa-v2-mlx4_0-2m u2.0 nonthreadsafe default libdaplomcm.so.2 dapl.2.0 "mlx4_0 2" ""
ofa-v2-mlx4_1-1m u2.0 nonthreadsafe default libdaplomcm.so.2 dapl.2.0 "mlx4_1 1" ""
ofa-v2-mlx4_1-2m u2.0 nonthreadsafe default libdaplomcm.so.2 dapl.2.0 "mlx4_1 2" ""
ofa-v2-mlx5_0-1s u2.0 nonthreadsafe default libdaploscm.so.2 dapl.2.0 "mlx5_0 1" ""
ofa-v2-mlx5_0-2s u2.0 nonthreadsafe default libdaploscm.so.2 dapl.2.0 "mlx5_0 2" ""
ofa-v2-mlx5_1-1s u2.0 nonthreadsafe default libdaploscm.so.2 dapl.2.0 "mlx5_1 1" ""
ofa-v2-mlx5_1-2s u2.0 nonthreadsafe default libdaploscm.so.2 dapl.2.0 "mlx5_1 2" ""
ofa-v2-mlx5_0-1u u2.0 nonthreadsafe default libdaploucm.so.2 dapl.2.0 "mlx5_0 1" ""
ofa-v2-mlx5_0-2u u2.0 nonthreadsafe default libdaploucm.so.2 dapl.2.0 "mlx5_0 2" ""
ofa-v2-mlx5_1-1u u2.0 nonthreadsafe default libdaploucm.so.2 dapl.2.0 "mlx5_1 1" ""
ofa-v2-mlx5_1-2u u2.0 nonthreadsafe default libdaploucm.so.2 dapl.2.0 "mlx5_1 2" ""
ofa-v2-mlx5_0-1m u2.0 nonthreadsafe default libdaplomcm.so.2 dapl.2.0 "mlx5_0 1" ""
ofa-v2-mlx5_0-2m u2.0 nonthreadsafe default libdaplomcm.so.2 dapl.2.0 "mlx5_0 2" ""
ofa-v2-mlx5_1-1m u2.0 nonthreadsafe default libdaplomcm.so.2 dapl.2.0 "mlx5_1 1" ""
ofa-v2-mlx5_1-2m u2.0 nonthreadsafe default libdaplomcm.so.2 dapl.2.0 "mlx5_1 2" ""
Here is some system information:
$ slurmd -V
slurm 14.03.0
$ mpirun -V
Intel(R) MPI Library for Linux* OS, Version 4.1 Update 1 Build 20130522
Copyright (C) 2003-2013, Intel Corporation. All rights reserved.
cn117$ ofed_info | head -n1
MLNX_OFED_LINUX-2.2-1.0.1 (OFED-2.2-1.0.0):
cn117$ ibv_devinfo
hca_id: mlx4_0
    transport:       InfiniBand (0)
    fw_ver:          2.11.550
    node_guid:
    sys_image_guid:  ##########
    vendor_id:       ##########
    vendor_part_id:  ########
    hw_ver:          0x0
    board_id:        ########
    phys_port_cnt:   2
        port:   1
            state:        PORT_ACTIVE (4)
            max_mtu:      4096 (5)
            active_mtu:   4096 (5)
            sm_lid:       1
            port_lid:     131
            port_lmc:     0x00
            link_layer:   InfiniBand
        port:   2
            state:        PORT_DOWN (1)
            max_mtu:      4096 (5)
            active_mtu:   4096 (5)
            sm_lid:       0
            port_lid:     0
            port_lmc:     0x00
            link_layer:   InfiniBand
cn117$ cat /etc/redhat-release
Red Hat Enterprise Linux Workstation release 6.5 (Santiago)
cn117$ uname -r
2.6.32-431.23.3.el6.x86_64
I wonder if anyone has faced a similar issue before and could help us figure out a solution.
Thanks,
Tingyang Xu
The error you are getting indicates that the OFA fabric is unavailable on the node before SLURM* is restarted. I would check the SLURM* restart process and see whether it is restarting the OFED* driver. Also, see if you can run a job with I_MPI_FABRICS=shm:tcp before restarting SLURM*.
When using OFA, dat.conf is not used, so that is not where you need to be looking.
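For example, on the rebooted node you could check whether the verbs stack is actually up before touching SLURM* at all (this assumes the Mellanox OFED* init script is named openibd, as it is in MLNX_OFED):

cn117$ service openibd status        # is the OFED driver loaded?
cn117$ ibv_devinfo | grep -w state   # the port in use should say PORT_ACTIVE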
I do not think Slurm restarted OFA. I did put the restart in rc.local, but the issue is still there. Here is the information from boot.log:
Welcome to Red Hat Enterprise Linux Workstation
Starting udev: [ OK ]
Setting hostname cn133: [ OK ]
Setting up Logical Volume Management: 3 logical volume(s) in volume group "vg_cn133" now active [ OK ]
Checking filesystems
/dev/mapper/vg_cn133-lv_root: clean, 83124/3276800 files, 897418/13107200 blocks
/dev/sda1: clean, 45/128016 files, 79861/512000 blocks
/dev/mapper/vg_cn133-lv_home: clean, 11/14712832 files, 971293/58820608 blocks [ OK ]
Remounting root filesystem in read-write mode: [ OK ]
Mounting local filesystems: [ OK ]
Enabling local filesystem quotas: [ OK ]
Enabling /etc/fstab swaps: [ OK ]
Entering non-interactive startup
Calling the system activity data collector (sadc)...
Starting monitoring for VG vg_cn133: 3 logical volume(s) in volume group "vg_cn133" monitored [ OK ]
Loading HCA driver and Access Layer: [ OK ]
Setting up InfiniBand network interfaces:
Bringing up interface ib0: [ OK ]
Setting up service network . . . [ done ]
Bringing up loopback interface: [ OK ]
Bringing up interface em1:
Determining IP information for em1... done. [ OK ]
Bringing up interface ib0: RTNETLINK answers: File exists [ OK ]
Starting MUNGE: munged [ OK ]
Starting postfix: [ OK ]
Starting abrt daemon: [ OK ]
Loading BLCR: FATAL: Module blcr_imports not found.
FATAL: Module blcr not found. [ OK ]
Starting crond: [ OK ]
starting slurmd: [ OK ]
Starting atd: [ OK ]
Starting Red Hat Network Daemon: [ OK ]
Starting rhsmcertd... [ OK ]
Starting certmonger: [ OK ]
Stopping slurmd: [ OK ]
slurmd is stopped
Starting slurmd: [ OK ]
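To double-check the init ordering, I can also list the relevant services (this is RHEL 6, so chkconfig); as the boot.log above shows, the HCA driver ("Loading HCA driver and Access Layer") comes up well before slurmd does:

cn133$ chkconfig --list | egrep 'slurm|openibd'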
The setting I_MPI_FABRICS=shm:tcp works, but I hope to use InfiniBand instead.
$ srun ./hello
[18] MPI startup(): ofa fabric is not available and fallback fabric is not enabled
[19] MPI startup(): ofa fabric is not available and fallback fabric is not enabled
[0] MPI startup(): ofa fabric is not available and fallback fabric is not enabled
[1] MPI startup(): ofa fabric is not available and fallback fabric is not enabled
[3] MPI startup(): ofa fabric is not available and fallback fabric is not enabled
[4] MPI startup(): ofa fabric is not available and fallback fabric is not enabled
[5] MPI startup(): ofa fabric is not available and fallback fabric is not enabled
[2] MPI startup(): ofa fabric is not available and fallback fabric is not enabled
[6] MPI startup(): ofa fabric is not available and fallback fabric is not enabled
[7] MPI startup(): ofa fabric is not available and fallback fabric is not enabled
[8] MPI startup(): ofa fabric is not available and fallback fabric is not enabled
[9] MPI startup(): ofa fabric is not available and fallback fabric is not enabled
[10] MPI startup(): ofa fabric is not available and fallback fabric is not enabled
[11] MPI startup(): ofa fabric is not available and fallback fabric is not enabled
[12] MPI startup(): ofa fabric is not available and fallback fabric is not enabled
[13] MPI startup(): ofa fabric is not available and fallback fabric is not enabled
[14] MPI startup(): ofa fabric is not available and fallback fabric is not enabled
[15] MPI startup(): ofa fabric is not available and fallback fabric is not enabled
[16] MPI startup(): ofa fabric is not available and fallback fabric is not enabled
[17] MPI startup(): ofa fabric is not available and fallback fabric is not enabled
srun: error: cn133: tasks 0-19: Exited with exit code 254
srun: Terminating job step 1675.0
$ export I_MPI_FABRICS=shm:tcp
$ srun ./hello
This is Process 11 out of 20 running on host cn133
This is Process 6 out of 20 running on host cn133
This is Process 12 out of 20 running on host cn133
This is Process 0 out of 20 running on host cn133
This is Process 3 out of 20 running on host cn133
This is Process 2 out of 20 running on host cn133
This is Process 16 out of 20 running on host cn133
This is Process 18 out of 20 running on host cn133
This is Process 5 out of 20 running on host cn133
This is Process 13 out of 20 running on host cn133
This is Process 17 out of 20 running on host cn133
This is Process 1 out of 20 running on host cn133
This is Process 8 out of 20 running on host cn133
This is Process 19 out of 20 running on host cn133
This is Process 10 out of 20 running on host cn133
This is Process 14 out of 20 running on host cn133
This is Process 9 out of 20 running on host cn133
This is Process 15 out of 20 running on host cn133
This is Process 7 out of 20 running on host cn133
This is Process 4 out of 20 running on host cn133
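As an aside, the startup message says the fallback fabric is not enabled. If I am reading the Intel MPI reference correctly, setting I_MPI_FALLBACK should at least let the job run (over a slower fabric, presumably TCP) instead of aborting while we debug this:

$ export I_MPI_FABRICS=shm:ofa
$ export I_MPI_FALLBACK=1   # allow falling back to another fabric when ofa is unavailable
$ srun ./hello              # should run, though presumably not over InfiniBand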
I appreciate your suggestion.
Best,
Tingyang Xu