Kristian_G_
Beginner

MPI error when submitting a Job on Sun Grid Engine

I currently have a problem when trying to run mpiexec or mpiexec.hydra on a cluster that uses Sun Grid Engine to schedule jobs.
 
The following errors come up after running mpiexec.hydra:
 
Traceback (most recent call last):
  File "<stdin>", line 973, in ?
  File "<stdin>", line 465, in mpdboot
ValueError: need more than 1 value to unpack
error: commlib error: access denied (client IP resolved to host name "localhost.localdomain". This is not identical to clients host name "node038.cm.cluster")
error: executing task of job 1046000 failed: failed sending task to execd@localhost: can't find connection
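The commlib message says the node's IP reverse-resolves to "localhost.localdomain" rather than the cluster name SGE expects. One way to see how the node resolves its own address (a diagnostic sketch, assuming standard glibc tools such as `getent` are available on the node):

```shell
# Compare the node's configured name with what the resolver returns
# for its own address; SGE's commlib rejects the connection when the
# reverse lookup does not match the configured host name.
hostname                                   # e.g. node038.cm.cluster
ip=$(hostname -i 2>/dev/null | awk '{print $1}')
echo "address: ${ip:-unknown}"
getent hosts "${ip:-127.0.0.1}" || echo "no reverse mapping for ${ip:-127.0.0.1}"
```

If the last line prints localhost.localdomain, the node's /etc/hosts or DNS setup is the likely culprit rather than MPI itself.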
 
 
 
And this is the job script that I am using to run the MPI job:
 
#!/bin/sh
#
# Your job name
#$ -N My_Job
#
# Use current working directory
#$ -cwd
#
# pe (Parallel environment) request. Set your number of processors here.
#$ -pe impi 24
#
# Run job through bash shell
#$ -S /bin/bash
 
# If modules are needed, source modules environment:
. /etc/profile.d/modules.sh
 
# Add any modules you might require:
module add shared
module load intel/compiler/64/11.1/046
module load intel/mpi/4.0.0.028
 
# The following output will show in the output file
echo "Got $NSLOTS processors."
 
# Take the first two unique host names from the SGE-provided host file
cat $PE_HOSTFILE | awk '{print $1}' | sort -u | head -n 2 > hostfile.txt
 
# Run your application
 
env
export I_MPI_MPD_RSH=ssh
export I_MPI_HYDRA_DEBUG=on
export I_MPI_HYDRA_BOOTSTRAP=ssh
 
mpdboot  -n 2 --verbose -r /usr/bin/ssh -f $PE_HOSTFILE
 
mpiexec.hydra -np 2 pingpong2  > output_file.txt
 
 
 
Also, I did not set up .mpd.conf or mpd.hosts files in my home directory. I am not sure whether these files are generated automatically when Intel MPI is invoked.
 
When I run ps aux | grep mpd, no mpd process is listed as running.
 
 
Thanks,
Kris
3 Replies
TimP
Black Belt

Questions on Intel MPI are more likely to get answers on the companion Clusters and HPC forum.

Kittur_G_Intel
Employee

I agree, Tim.
@krigri:  I'll transfer this thread to the clusters/HPC forum at: https://software.intel.com/en-us/forums/intel-clusters-and-hpc-technology for faster resolution.

_Kittur

Artem_R_Intel1
Employee

Hi Kris,

As far as I can see, you are using the quite old Intel MPI Library 4.0. Is it possible for you to try the latest (5.x) version?

BTW, MPD-related commands aren't necessary if you use mpiexec.hydra (the Hydra process launcher). MPD and Hydra are two different MPI process launchers.
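In other words, the mpdboot step in the job script above can simply be dropped. A minimal sketch of the launch under Hydra, assuming the same hostfile.txt and pingpong2 binary from the original script:

```shell
# Hydra starts the ranks over ssh itself; no mpdboot/mpd ring is needed,
# so .mpd.conf and mpd.hosts are not required either.
export I_MPI_HYDRA_BOOTSTRAP=ssh
mpiexec.hydra -f hostfile.txt -np 2 ./pingpong2 > output_file.txt
```

Here -f passes the machine file directly to Hydra; the I_MPI_MPD_RSH setting from the original script only affects MPD and can be removed.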
