Intel® MPI Library

mpd problems

4sissi
Beginner

Hi all,

I'm using Intel MPI 3.2.0.011 on a cluster with 9 nodes (36 CPUs in total) and a master node with 2 CPUs. All nodes are interconnected via Ethernet.

These are the commands I run on the master node:

/opt/intel/impi/3.2.0.011/bin64/mpd --ncpus=2 -e -d &

/opt/intel/impi/3.2.0.011/bin64/mpdboot --rsh=/usr/bin/ssh --totalnum=10 -1 --file=$HOME/machines.LINUX --verbose --ncpus=2 &

This starts the daemon on the nodes:

[root@sissi0 ~]# ps aux | grep mpd
giorgio 3141 0.0 0.1 156704 5632 ? S 11:32 0:00 python /opt/intel/impi/3.2/bin64/mpd.py -h sissi2 -p 40057 --ifhn=10.1.1.10 --ncpus=4 --myhost=sissi0 --myip=10.1.1.10 -e -d -s 10

On the master node I get the following output:

LAUNCHED mpd on sissi.xxxx.xx via
RUNNING: mpd on sissi.xxxx.xx
LAUNCHED mpd on sissi8 via sissi.xxxx.xx
LAUNCHED mpd on sissi1 via sissi.xxxx.xx
LAUNCHED mpd on sissi2 via sissi.inogs.it
LAUNCHED mpd on sissi3 via sissi.inogs.it
RUNNING: mpd on sissi8
RUNNING: mpd on sissi2
LAUNCHED mpd on sissi0 via sissi8
LAUNCHED mpd on sissi4 via sissi8
LAUNCHED mpd on sissi5 via sissi8
RUNNING: mpd on sissi1
LAUNCHED mpd on sissi6 via sissi8
RUNNING: mpd on sissi3
LAUNCHED mpd on sissi7 via sissi3
RUNNING: mpd on sissi5
RUNNING: mpd on sissi0
RUNNING: mpd on sissi4
RUNNING: mpd on sissi7
RUNNING: mpd on sissi6
mpdboot_sissi.inogs.it (handle_mpd_output 752): from mpd on sissi0, invalid port info:
sissi0: Connection refused
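
A rough way to read the output above: each LAUNCHED line names the host that started the next mpd, so the launcher of the failing node can be pulled out with a small shell helper. This is only a sketch. The function names launch_parents and parent_of are made up for illustration, and it assumes the verbose mpdboot output was saved to a file.

```shell
# Read mpdboot --verbose output on stdin and print "child parent" for every
# "LAUNCHED mpd on <child> via <parent>" line. The first mpd has no parent
# in the output, so it is reported with "-".
launch_parents() {
    awk '$1 == "LAUNCHED" && $2 == "mpd" && $3 == "on" {
        parent = ($6 == "" ? "-" : $6)
        print $4, parent
    }'
}

# Print the host that launched the mpd on the given host (hypothetical helper).
parent_of() {
    launch_parents | awk -v host="$1" '$1 == host { print $2 }'
}
```

For example, `parent_of sissi0 < mpdboot.out` (mpdboot.out being a hypothetical file holding the output above) prints sissi8, so connectivity between those two hosts, and to the host given by -h in the ps output, is a reasonable place to start looking.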

Can someone help me resolve this issue?

giorgio

3 Replies
Andrey_D_Intel
Employee

Hi,

It is difficult to determine the cause of this issue without the mpd.log files. I would suggest submitting an issue report at https://premier.intel.com
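
If it helps with preparing such a report, the log files could be gathered from every node with something like the sketch below. Several things here are assumptions rather than facts from this thread: the log location /tmp/mpd2.logfile_<user>* (adjust it to wherever your installation writes the mpd logs), the helper names list_hosts and collect_mpd_logs, and a hosts file with one host per line, optionally followed by ":ncpus".

```shell
# Print the host names from a machines file: drop comments, blank lines,
# anything after the first whitespace, and an optional ":ncpus" suffix.
list_hosts() {
    sed -e 's/#.*//' -e 's/^[[:space:]]*//' -e 's/[[:space:]].*//' \
        -e 's/:.*//' -e '/^$/d' "$1"
}

# collect_mpd_logs <hosts-file> <dest-dir> [remote-glob]
# Copy the mpd log of each host into <dest-dir>/<host>.log over ssh.
collect_mpd_logs() {
    hosts_file=$1
    dest=$2
    # ASSUMPTION: default log location; pass a third argument to override.
    pattern=${3:-"/tmp/mpd2.logfile_${USER}*"}
    mkdir -p "$dest" || return 1
    for host in $(list_hosts "$hosts_file"); do
        # -n keeps ssh from reading stdin; BatchMode avoids password prompts.
        ssh -n -o BatchMode=yes "$host" "cat $pattern" \
            > "$dest/$host.log" 2> "$dest/$host.err" \
            || echo "could not read mpd logs on $host" >&2
    done
}
```

A call such as `collect_mpd_logs $HOME/machines.LINUX ./mpd-logs` would then leave one file per node in ./mpd-logs, plus a .err file with any ssh error messages.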

Best regards,

Andrey

Mostafa_N_
Beginner

Hi,

I have a corrupted MPD on my machine. Whenever I run either "mpdtrace" or "mpdallexit", I get the following message:

mpdroot: cannot connect to local mpd at: /tmp/mpd2.console_root
probable cause: no mpd daemon on this machine
possible cause: unix socket /tmp/mpd2.console_root has been removed
mpdtrace (__init__ 1524): forked process failed; status=255

My mpd is running; I started and checked it with:

mpd &
ps -ef | grep mpd

Note: I am running on my own machine: a single node with a quad-core Intel Core i7.
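
One thing that can be checked quickly is whether the console socket named in the error message actually exists for the user running the commands. The sketch below assumes the socket name follows the /tmp/mpd2.console_<user> pattern seen in the message (here the user is root); the helper names mpd_console_path and check_mpd_console are made up for illustration.

```shell
# Print the console socket path for a user (default: the current user),
# following the /tmp/mpd2.console_<user> pattern from the error message.
mpd_console_path() {
    user=${1:-$(id -un)}
    printf '/tmp/mpd2.console_%s\n' "$user"
}

# Report whether that socket exists. If it does not, list any other mpd
# console sockets in /tmp, which may belong to a different user.
check_mpd_console() {
    path=$(mpd_console_path "$1")
    if [ -S "$path" ]; then
        echo "socket present: $path"
    else
        echo "socket missing: $path"
        ls -l /tmp/mpd2.console_* 2>/dev/null
        return 1
    fi
}
```

Running `check_mpd_console` as the same user that runs mpdtrace shows whether the socket is there. If a socket for a different user shows up instead, mpd was probably started under another account than the one running mpdtrace.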

Any help would be appreciated.

James_T_Intel
Moderator

Hi Mostafa,

Please see my reply to your other post at http://software.intel.com/en-us/forums/topic/380080#comment-1730473.

Sincerely,
James Tullos
Technical Consulting Engineer
Intel® Cluster Tools
