All,
(Note: I'm also asking this on the slurm-dev list.)
I'm hoping you can help me with a question. Namely, I'm on a cluster that uses SLURM, and let's say I ask for two 28-core Haswell nodes to run interactively and I get them. Great, so my environment now has things like:
SLURM_NTASKS_PER_NODE=28
SLURM_TASKS_PER_NODE=28(x2)
SLURM_JOB_CPUS_PER_NODE=28(x2)
SLURM_CPUS_ON_NODE=28
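For reference, the allocation was requested with something along these lines (a sketch only; the exact flags and the "hasw" constraint name are assumptions, since every site configures this differently):

salloc --nodes=2 --ntasks-per-node=28 --constraint=hasw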
Now, let's run a simple HelloWorld on, say, 48 processors (and pipe through sort to see things a bit better):
(1047) $ mpirun -np 48 -print-rank-map ./helloWorld.exe | sort -k2 -g
srun.slurm: cluster configuration lacks support for cpu binding
(borgj102:0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27)
(borgj105:28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47)
Process 0 of 48 is on borgj102
Process 1 of 48 is on borgj102
...
Process 27 of 48 is on borgj102
Process 28 of 48 is on borgj105
Process 29 of 48 is on borgj105
...
Process 47 of 48 is on borgj105
As you can see, the first 28 processes are on node 1, and the last 20 are on node 2. Okay. Now, I want to do some load balancing, so I want 24 on each. In the past, I always used -perhost and it worked, but now:
(1048) $ mpirun -np 48 -perhost 24 -print-rank-map ./helloWorld.exe | sort -k2 -g
srun.slurm: cluster configuration lacks support for cpu binding
(borgj102:0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27)
(borgj105:28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47)
Process 0 of 48 is on borgj102
Process 1 of 48 is on borgj102
...
Process 27 of 48 is on borgj102
Process 28 of 48 is on borgj105
Process 29 of 48 is on borgj105
...
Process 47 of 48 is on borgj105
Huh. No change: still a 28/20 split. Do you know if there is a way to "override" what appears to be SLURM beating the -perhost flag? I suppose there is that srun.slurm warning being thrown, but that usually indicates "tasks-per-core"-style manipulations rather than placement.
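For what it's worth, one sanity check I can think of is asking srun for the placement directly, bypassing mpirun's generated machinefile (a sketch; this assumes the binary can be launched under srun at all, which depends on the MPI/PMI setup):

srun --ntasks=48 --ntasks-per-node=24 ./helloWorld.exe | sort -k2 -g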
Thanks,
Matt
Oh, and since I forgot to mention it: I'm running Intel MPI 5.0.3.048. Sorry!
Addendum: per an admin here at NASA on the SLURM list:
I'm pretty confident in saying this is entirely in Intel MPI land:

aknister@borgj157:~> I_MPI_JOB_RESPECT_PROCESS_PLACEMENT=enable mpiexec.hydra -np 48 -ppn 24 -print-rank-map /bin/true
(borgj157:0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27)
(borgj164:28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47)

aknister@borgj157:~> I_MPI_JOB_RESPECT_PROCESS_PLACEMENT=disable mpiexec.hydra -np 48 -ppn 24 -print-rank-map /bin/true
(borgj157:0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23)
(borgj164:24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47)

However, if a machinefile argument is passed to mpiexec.hydra (which mpirun does by default), the I_MPI_JOB_RESPECT_PROCESS_PLACEMENT variable isn't respected (see below). Maybe we need an I_MPI_JOB_RESPECT_I_MPI_JOB_RESPECT_PROCESS_PLACEMENT_VARIABLE variable.

aknister@borgj157:~> I_MPI_JOB_RESPECT_PROCESS_PLACEMENT=enable mpiexec.hydra -machinefile $PBS_NODEFILE -np 48 -ppn 24 --print-rank-map true
(borgj157:0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27)
(borgj164:28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47)

aknister@borgj157:~> I_MPI_JOB_RESPECT_PROCESS_PLACEMENT=disable mpiexec.hydra -machinefile $PBS_NODEFILE -np 48 -ppn 24 --print-rank-map true
(borgj157:0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27)
(borgj164:28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47)
Does anyone here at Intel know how to get mpirun to respect this so -ppn can work with SLURM?
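In the meantime, a possible workaround sketch is to hand Hydra an explicit machinefile with 24 slots per node, so the file itself encodes the placement we want (the host:count machinefile syntax and the host names here are taken from the runs above; this is unverified on our setup):

cat > mf.txt <<EOF
borgj157:24
borgj164:24
EOF
mpiexec.hydra -machinefile mf.txt -np 48 -print-rank-map /bin/true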
Overriding works with Intel MPI 5.1.3.181
I just tried this with Intel MPI 5.1.3.181. It seems "I_MPI_JOB_RESPECT_PROCESS_PLACEMENT=disable" is no longer ignored. When this variable is set, SLURM process placement is overridden by "-ppn" or "-perhost".
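For anyone who lands here later, the working combination on 5.1.3.181 would look something like this (a sketch assembled from this thread, reusing the 48-rank helloWorld example from the original post):

export I_MPI_JOB_RESPECT_PROCESS_PLACEMENT=disable
mpirun -np 48 -perhost 24 -print-rank-map ./helloWorld.exe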