When I try to start mpd on 2 hosts with mpdboot in debug mode:
mpdboot -v -d -r ssh -n 2 -f ./mpd.conf
I get the following error message:
debug: starting
totalnum=2 numhosts=1
there are not enough hosts on which to start all processes
The contents of mpd.conf are:
node001:8
node002:8
The Python version on both nodes is 2.4.3.
How can I solve it?
Hi Xiaoming,
The issue here is that the Intel MPI Library seems to think your ./mpd.conf file contains only a single machine name. Can you verify whether that's true? This sometimes happens if the hosts file has strange EOF or end-of-line symbols (for example, when it was copied from a Microsoft Windows machine to a Unix machine), or if one of the lines was inadvertently commented out. Are you starting the mpdboot command from node001, node002, or some other machine?
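One way to check the file for those problems is a small script along these lines. This is only a sketch, written for Python 3, and it does not reproduce how mpdboot itself parses the file. The function names `usable_hosts` and `check_file` are made up for illustration, and treating `#` as the comment marker and `host:ncpus` as the line format are assumptions based on the file shown above.

```python
def usable_hosts(text):
    """Return (hosts, warnings) for the raw text of a hosts file.

    Assumption: one "host" or "host:ncpus" entry per line, with "#"
    starting a comment line. mpdboot's real parser may differ.
    """
    hosts = []
    warnings = []
    for lineno, raw in enumerate(text.split("\n"), start=1):
        # A carriage return usually means Windows (CRLF) line endings.
        if "\r" in raw:
            warnings.append("line %d: carriage return found" % lineno)
        line = raw.strip()
        if not line:
            continue
        if line.startswith("#"):
            warnings.append("line %d: commented out" % lineno)
            continue
        # Flag control characters or non-ASCII bytes hidden in the entry.
        if any(ord(c) < 32 or ord(c) > 126 for c in line):
            warnings.append("line %d: unexpected character" % lineno)
        # Keep only the host name in front of the optional ":ncpus" part.
        hosts.append(line.split(":")[0])
    return hosts, warnings


def check_file(path):
    """Read a hosts file without newline translation and report on it."""
    # newline="" keeps any "\r" characters so they can be detected.
    with open(path, newline="") as f:
        hosts, warnings = usable_hosts(f.read())
    print("hosts found:", hosts)
    for w in warnings:
        print("warning:", w)
    return hosts, warnings
```

Calling `check_file("./mpd.conf")` on node001 would then list the host names the script can see and any suspicious lines. If it reports fewer than two hosts, that would be consistent with the `numhosts=1` in your debug output.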
Also, what does your /etc/hosts file look like?
Regards,
~Gergana
The mpd.conf file was edited directly on a Linux machine rather than copied over, so it contains no strange symbols. I start the mpdboot command from node001.
There are no entries for node001 or node002 in /etc/hosts, although I can ssh freely to any compute node. I am not the administrator, so I do not know how that works. However, I know that if node001 were listed in /etc/hosts, its entry would look like this:
10.141.0.1 node001.cm.cluster node001
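Since ssh works without /etc/hosts entries, the names are presumably resolved some other way (DNS, for example, though that is only a guess). A minimal Python 3 sketch for checking what each name resolves to from node001 could look like the following. The names `resolve_all` and `lookup` are hypothetical; only `socket.gethostbyname` is a standard library call.

```python
import socket


def resolve_all(names, lookup=socket.gethostbyname):
    """Map each host name to its IPv4 address, or to None if lookup fails.

    `lookup` is replaceable so the function can be tried without a network.
    """
    results = {}
    for name in names:
        try:
            results[name] = lookup(name)
        except OSError:
            # socket.gaierror (name not found) is a subclass of OSError.
            results[name] = None
    return results
```

Running `print(resolve_all(["node001", "node002"]))` on node001 would show whether both names resolve there, and to which addresses.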
