Hello everyone,
When I submit a multi-node LAMMPS job on PBS, it launches 4 independent LAMMPS instances instead of one parallel MPI run with 4 processes. Could you share any insights on this issue?
Job:
====
#PBS -N 5-5-08
#PBS -l select=2:ncpus=2:mpiprocs=2
#PBS -l walltime=1:00:00
#PBS -l place=scatter:excl
#PBS -j oe
cd $PBS_O_WORKDIR
module load intelmpi-2026.0.0/mpi/2021.18
module load lammps-2025.0.7-intel
echo "Nodefile:"
cat $PBS_NODEFILE
NPROCS=$(wc -l < $PBS_NODEFILE)
time mpirun lmp_mpi -in in1.lj
Simplified Output:
==============
Nodefile:
xx-cn-001
xx-cn-001
xx-002
xx-002
No PMIx server was reachable, but a PMI1/2 was detected.
If srun is being used to launch application, 4 singletons will be started.
> The output gets replicated four times
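For anyone reproducing this, the job script above computes NPROCS from the nodefile but never passes it to mpirun, so it can help to print what the scheduler actually handed out before launching. The following is a minimal sketch, not part of the original job: `summarize_nodefile` is a hypothetical helper name, and the `ldd` check assumes `lmp_mpi` is a dynamically linked binary on the PATH after the modules are loaded. The PMIx wording in the output resembles a message from an MPI runtime other than Intel MPI, so checking which MPI library the binary links against is an assumption worth testing, not a confirmed diagnosis.

```shell
#!/bin/bash
# Hypothetical helper (not from the original post): summarize a PBS nodefile,
# which lists one host name per requested MPI rank.
summarize_nodefile() {
    local nodefile="$1"
    local ranks hosts
    ranks=$(grep -c . "$nodefile")                  # non-empty lines = requested ranks
    hosts=$(grep . "$nodefile" | sort -u | wc -l)   # distinct host names
    hosts=$((hosts))                                # normalize any padding from wc
    echo "ranks=${ranks} hosts=${hosts}"
}

# Only meaningful inside a PBS job, where the scheduler sets PBS_NODEFILE.
if [ -n "${PBS_NODEFILE:-}" ] && [ -r "$PBS_NODEFILE" ]; then
    summarize_nodefile "$PBS_NODEFILE"

    # Show which launcher is first on the PATH.
    command -v mpirun

    # Show which MPI/PMI libraries the LAMMPS binary links against
    # (assumes a dynamically linked lmp_mpi on the PATH).
    if command -v lmp_mpi >/dev/null 2>&1; then
        ldd "$(command -v lmp_mpi)" | grep -i -E 'libmpi|libpmi'
    fi
fi
```

With the nodefile shown in the output above, the helper would print `ranks=4 hosts=2`. If the library reported by `ldd` does not belong to the same MPI installation as the `mpirun` in use, each process may start as a singleton, which would match the replicated output.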
Yes, I know that. This is the output of the job. As the job script shows, I am not using srun, yet the job still starts 4 separate single processes instead of MPI processes.
OK, another question: why is srun even available on a system that uses PBS?
In any case, please provide the output of:
I_MPI_DEBUG=10 I_MPI_HYDRA_DEBUG=1 mpirun IMB-MPI1 pingpong