Intel® MPI Library

mpdboot is not working

Vijay_Amirtharaj_A

Hi,

We are setting up a new cluster. I installed composer_xe_2011_sp1.6.233.

While testing the installation, mpdboot fails to launch.

I am using this command to launch mpdboot: mpdboot -n 2 -r ssh -f ~/mpd.hosts

The error message looks like this:

mpdboot -n 2 -r ssh -f ~/mpd.hosts
mpdboot_taavare.tuecms.com (handle_mpd_output 958): failed to ping mpd on n0.tuecms.com; received output={}
mpdboot_taavare.tuecms.com (handle_mpd_output 960): Please examine the /tmp/mpd2.logfile_bala log file on each node of the ring

[bala@taavare ~]$ cat /tmp/mpd2.logfile_bala
logfile for mpd with pid 12206
taavare.tuecms.com_40561 (handle_rhs_input 2864): connection with the right neighboring mpd daemon was lost; attempting to re-enter the mpd ring
taavare.tuecms.com_40561 (reenter_ring 1072): reenter_ring returned 0 after 1 tries
taavare.tuecms.com_40561 (handle_rhs_input 2871): the daemon successfully reentered the mpd ring

What is the problem? Please guide us.

Regards,

Vijay Amirtharaj A

James_T_Intel
Moderator

Hi Vijay,

Check your /etc/hosts file and make certain that the IP addresses of the hosts in mpd.hosts are correct.
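One way to run that check is sketched below. It assumes the host file uses the usual one-host-per-line layout with an optional ":ncpus" suffix, and that getent is available on the node; the function name check_hosts is only for illustration. Run it on every node of the ring and compare the addresses it prints.

```shell
#!/bin/sh
# Print how each host listed in an mpd.hosts-style file resolves on this node.
# "#" comments, ":ncpus" suffixes and trailing fields are stripped first.
# check_hosts is an illustrative helper name, not part of Intel MPI.
check_hosts() {
    hostfile=$1
    sed -e 's/#.*//' -e 's/:.*//' -e 's/[[:space:]].*//' -e '/^$/d' "$hostfile" |
    while read -r host; do
        if addr=$(getent hosts "$host"); then
            # Keep only the first address field of the getent output.
            printf 'OK    %s -> %s\n' "$host" "${addr%% *}"
        else
            printf 'FAIL  %s does not resolve\n' "$host"
        fi
    done
}

# Only run when a readable host file is present (path taken from the post above).
if [ -r "$HOME/mpd.hosts" ]; then
    check_hosts "$HOME/mpd.hosts"
fi
```

If a host shows a different address on different nodes, or resolves to a loopback address such as 127.0.0.1 on its own node, that mismatch is a likely candidate for the failed ping.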

Also, I would suggest using Hydra instead of MPD.  There are several ways to use Hydra.  As of Version 4.0 Update 3 of the Intel® MPI Library, using mpirun will default to Hydra rather than MPD.  You can set I_MPI_PROCESS_MANAGER=hydra to specify Hydra for versions after Version 4.0 Update 1.  Or you can explicitly call mpiexec.hydra instead of mpiexec.
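As a rough sketch of the Hydra route, assuming mpiexec.hydra is on your PATH and that your existing ~/mpd.hosts can be reused as the Hydra host file (hostname is just a stand-in for your real application):

```shell
#!/bin/sh
# Select Hydra as the process manager for mpirun (variable named above).
I_MPI_PROCESS_MANAGER=hydra
export I_MPI_PROCESS_MANAGER

# Launch 2 ranks of `hostname` across the hosts in ~/mpd.hosts.
# Guarded so the script does nothing where Intel MPI or the host file is missing.
if command -v mpiexec.hydra >/dev/null 2>&1 && [ -r "$HOME/mpd.hosts" ]; then
    mpiexec.hydra -f "$HOME/mpd.hosts" -n 2 hostname
else
    echo "mpiexec.hydra or ~/mpd.hosts not found; nothing launched" >&2
fi
```

No mpdboot step is needed here; Hydra starts what it requires on the remote hosts when the job launches.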

Sincerely,
James Tullos
Technical Consulting Engineer
Intel® Cluster Tools

Mostafa_N_
Beginner

Hi,

I have a corrupted MPD on my machine. Whenever I call any of the commands "mpdtrace" or "mpdallexit" I get the following message:

mpdroot: cannot connect to local mpd at: /tmp/mpd2.console_root
probable cause: no mpd daemon on this machine
possible cause: unix socket /tmp/mpd2.console_root has been removed
mpdtrace (__init__ 1524): forked process failed; status=255

I have my mpd running:

mpd& 
ps -ef | grep mpd

Note: I am running on my own machine, a single node with a quad-core Intel Core i7.

Any help would be appreciated.

James_T_Intel
Moderator

Hi Mostafa,

What version of the Intel® MPI Library are you using?  If you are using version 4.0 or later, I would recommend avoiding MPD and using Hydra instead.  As of Version 4.0 Update 2, mpirun defaults to Hydra.  With Hydra, you can simply launch your job without needing any daemons running beforehand.
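A quick way to answer the version question and try a single-node Hydra run is sketched here. It assumes mpirun and mpiexec.hydra from the Intel MPI Library are on your PATH; mpi_version is an illustrative helper name, and hostname again stands in for your application.

```shell
#!/bin/sh
# Report the first line of the launcher's version output,
# or say that no launcher was found. mpi_version is an illustrative name.
mpi_version() {
    if command -v mpirun >/dev/null 2>&1; then
        mpirun -V 2>&1 | head -n 1
    else
        echo "mpirun not found on PATH"
    fi
}

mpi_version

# Single-node run with Hydra: 4 ranks, no mpd started beforehand.
if command -v mpiexec.hydra >/dev/null 2>&1; then
    mpiexec.hydra -n 4 hostname
fi
```

If the second step prints your machine's name four times, the job ran without any mpd daemon, and mpdtrace or mpdallexit are no longer needed.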

Sincerely,
James Tullos
Technical Consulting Engineer
Intel® Cluster Tools
