Hi!
I'm trying to run LAMMPS across multiple nodes. As recommended on its documentation page, I'm using 2 OpenMP threads per MPI process and enough MPI processes to fill all cores:
https://lammps.sandia.gov/doc/Speed_intel.html
This works fine on a single node. But when I request, say, 4 nodes, the job uses only 2 of them.
How do I distribute it across nodes? This is what I've tried:
mpirun -machinefile $PBS_NODEFILE -n 64 -ppn 16 \
-genv OMP_NUM_THREADS=2 -genv I_MPI_PIN_DOMAIN=omp \
lmp -in in.lammps -suffix hybrid intel omp -package intel 0 omp 2
There are 32 cores per node, so I'm trying to assign 16 MPI processes per node, each of which spawns 2 OpenMP threads. `lmp` is the LAMMPS executable.
What am I doing wrong?
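For reference, the layout arithmetic behind `-n 64 -ppn 16` can be sketched as below. The variable names are illustrative only; the values are the ones quoted in this post.

```shell
#!/bin/bash
# Layout arithmetic for a hybrid MPI + OpenMP run (values taken from the post).
NODES=4
CORES_PER_NODE=32
OMP_THREADS=2

# One MPI rank per group of OMP_THREADS cores.
RANKS_PER_NODE=$((CORES_PER_NODE / OMP_THREADS))
TOTAL_RANKS=$((RANKS_PER_NODE * NODES))

echo "ranks per node: ${RANKS_PER_NODE}, total ranks: ${TOTAL_RANKS}"
```

With these values the script reports 16 ranks per node and 64 ranks in total, matching the `-ppn` and `-n` arguments above.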
2 Replies
Hello Vishnu,
The -machinefile option should not be used together with the -n and -ppn options. Please replace -machinefile with -f in your command line and try again.
There are 2 ways to set process placement across the nodes:
1. Use -machinefile, where each line of the machine file specifies <node_name>:<num_processes>.
or
2. Use -f <hostfile>, where the host file contains a list of nodes, together with -n to specify the total number of MPI processes and -ppn to specify the number of processes per node.
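As a rough illustration of the two file formats, the sketch below builds a machine file from a plain host list. The host names (node01 to node04), the file names, and `./app` are placeholders rather than anything from this thread; the mpirun lines are shown as comments and are not executed.

```shell
#!/bin/bash
# Hypothetical host names; replace with the nodes of your own allocation.
printf '%s\n' node01 node02 node03 node04 > hosts.txt

# Option 1: machine file with an explicit process count on every line.
PPN=16
sed "s/\$/:${PPN}/" hosts.txt > machines.txt
cat machines.txt

# Corresponding launch lines (shown only, not run here):
#   mpirun -machinefile machines.txt -n 64 ./app
# Option 2: plain host list plus -n / -ppn on the command line:
#   mpirun -f hosts.txt -n 64 -ppn 16 ./app
```

Each line of `machines.txt` then has the `<node_name>:<num_processes>` form described in option 1, while `hosts.txt` is the bare node list expected by option 2.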
--
Best regards, Yury
Hey Yury!
Before you replied, I tried this, and it works for me, even with the machinefile:
#!/bin/bash
#PBS -l select=4:ncpus=32:mpiprocs=16
#PBS -N bench
#PBS -q cpuq
NODES=4
cd $PBS_O_WORKDIR
mpirun -machinefile $PBS_NODEFILE -n $((16*${NODES})) -ppn 16 \
-genv OMP_NUM_THREADS=2 -genv I_MPI_PIN_DOMAIN=omp \
lmp -in in.lammps -suffix intel -package intel 0 omp 2
The main difference is that earlier I was using the following PBS line to request nodes:
#PBS -l nodes=4:ppn=32
The application now scales well across nodes.
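A possible explanation, which is an assumption and not something confirmed in this thread: with `nodes=4:ppn=32` the node file may have listed each host 32 times, so 64 ranks placed from that file would fill only two nodes, whereas `mpiprocs=16` would give 16 entries per host. One way to check is to count the entries per host in `$PBS_NODEFILE`. The helper name `count_slots` and the sample file below are made up for illustration.

```shell
#!/bin/bash
# Print how many times each host appears in a node file, as host:count.
count_slots() {
    sort "$1" | uniq -c | awk '{print $2 ":" $1}'
}

# Inside a job you would run: count_slots "$PBS_NODEFILE"
# Stand-in file for illustration (hypothetical hosts, 2 entries each):
printf '%s\n' node01 node01 node02 node02 > nodefile.sample
count_slots nodefile.sample
```

If the per-host count matches the intended number of MPI processes per node, the node file and the `-ppn` value agree.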