<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: mpdboot gives python error for more than one node in Intel® MPI Library</title>
    <link>https://community.intel.com/t5/Intel-MPI-Library/mpdboot-gives-python-error-for-more-than-one-node/m-p/880026#M1906</link>
    <description>&lt;DIV style="margin:0px;"&gt;
&lt;DIV id="quote_reply" style="margin-top: 5px; width: 100%;"&gt;
&lt;DIV style="margin-left:2px;margin-right:2px;"&gt;Quoting - &lt;A href="https://community.intel.com/en-us/profile/441375"&gt;ictceeval&lt;/A&gt;&lt;/DIV&gt;
&lt;DIV style="background-color:#E5E5E5; padding:5px;border: 1px; border-style: inset;margin-left:2px;margin-right:2px;"&gt;&lt;EM&gt;I'm glad to report: problem solved. The other issue was that the slave nodes couldn't talk back to the master node due to missing entries in /etc/hosts and missing ssh-keys. Having fixed that, I am now able to set up and MPI ring.&lt;BR /&gt;&lt;/EM&gt;&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;P&gt;Thanks for letting me know, ictceeval. I'm glad things are working for you now. Have fun using the Intel Cluster Tools!&lt;/P&gt;
&lt;P&gt;Regards,&lt;BR /&gt;~Gergana&lt;/P&gt;</description>
    <pubDate>Wed, 02 Sep 2009 18:25:12 GMT</pubDate>
    <dc:creator>Gergana_S_Intel</dc:creator>
    <dc:date>2009-09-02T18:25:12Z</dc:date>
    <item>
      <title>mpdboot gives python error for more than one node</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/mpdboot-gives-python-error-for-more-than-one-node/m-p/880022#M1902</link>
      <description>Hi everybody!&lt;BR /&gt;&lt;BR /&gt;I just installed ICTCE on my test machine (1 PC, 2 VMs as nodes). When I try to get an MPI ring up and running, this happens:&lt;BR /&gt;&amp;gt; mpdboot -n 3 -f mpd.hosts&lt;BR /&gt; LAUNCHED mpd on istanbul via &lt;BR /&gt;RUNNING: mpd on istanbul&lt;BR /&gt;LAUNCHED mpd on cnode1 via istanbul&lt;BR /&gt;Traceback (most recent call last):&lt;BR /&gt; File "&amp;lt;stdin&amp;gt;", line 918, in &amp;lt;module&amp;gt;&lt;BR /&gt; File "&amp;lt;stdin&amp;gt;", line 669, in mpdboot&lt;BR /&gt; File "&amp;lt;stdin&amp;gt;", line 758, in launch_one_mpd&lt;BR /&gt; File "/usr/lib/python2.6/subprocess.py", line 595, in __init__&lt;BR /&gt; errread, errwrite)&lt;BR /&gt; File "/usr/lib/python2.6/subprocess.py", line 1106, in _execute_child&lt;BR /&gt; raise child_exception&lt;BR /&gt;OSError: [Errno 2] No such file or directory&lt;BR /&gt;&lt;BR /&gt;where mpd.hosts looks like this:&lt;BR /&gt;istanbul&lt;BR /&gt;cnode1&lt;BR /&gt;cnode2&lt;BR /&gt;&lt;BR /&gt;mpdcheck -f mpd.hosts -v gives&lt;BR /&gt;obtaining hostname via gethostname and getfqdn&lt;BR /&gt;gethostname gives istanbul&lt;BR /&gt;getfqdn gives istanbul.site&lt;BR /&gt;checking out unqualified hostname; make sure is not "localhost", etc.&lt;BR /&gt;checking out qualified hostname; make sure is not "localhost", etc.&lt;BR /&gt;obtain IP addrs via qualified and unqualified hostnames; make sure other than 127.0.0.1&lt;BR /&gt;gethostbyname_ex: ('istanbul.site', ['istanbul'], ['192.168.220.105'])&lt;BR /&gt;gethostbyname_ex: ('istanbul.site', ['istanbul'], ['192.168.220.105'])&lt;BR /&gt;checking that IP addrs resolve to same host&lt;BR /&gt;now do some gethostbyaddr and gethostbyname_ex for machines in hosts file&lt;BR /&gt;checking gethostbyXXX for unqualified istanbul&lt;BR /&gt;gethostbyname_ex: ('istanbul.site', ['istanbul'], ['192.168.220.105'])&lt;BR /&gt;checking gethostbyXXX for qualified istanbul&lt;BR /&gt;gethostbyname_ex: ('istanbul.site', ['istanbul'], ['192.168.220.105'])&lt;BR /&gt;checking gethostbyXXX for unqualified cnode1&lt;BR /&gt;gethostbyname_ex: ('cnode1.site', ['cnode1'], ['192.168.220.118'])&lt;BR /&gt;checking gethostbyXXX for qualified cnode1&lt;BR /&gt;gethostbyname_ex: ('cnode1.site', ['cnode1'], ['192.168.220.118'])&lt;BR /&gt;checking gethostbyXXX for unqualified cnode2&lt;BR /&gt;gethostbyname_ex: ('cnode2.site', ['cnode2'], ['192.168.220.119'])&lt;BR /&gt;checking gethostbyXXX for qualified cnode2&lt;BR /&gt;gethostbyname_ex: ('cnode2.site', ['cnode2'], ['192.168.220.119'])&lt;BR /&gt;obtain IP addrs via localhost name; make sure that it equal to 127.0.0.1&lt;BR /&gt;gethostbyname_ex: ('localhost', ['ipv6-localhost', 'ipv6-loopback'], ['127.0.0.1'])&lt;BR /&gt;&lt;BR /&gt;ssh cnode1 and so on works perfectly well. lamboot mpd.hosts also works, so I'm pretty sure that establishing connections to the other nodes is not the problem.&lt;BR /&gt;&lt;BR /&gt;Any ideas?&lt;BR /&gt;&lt;BR /&gt;Thanks in advance.&lt;BR /&gt;</description>
      <pubDate>Wed, 02 Sep 2009 16:57:50 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/mpdboot-gives-python-error-for-more-than-one-node/m-p/880022#M1902</guid>
      <dc:creator>kern</dc:creator>
      <dc:date>2009-09-02T16:57:50Z</dc:date>
    </item>
    <item>
      <title>Re: mpdboot gives python error for more than one node</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/mpdboot-gives-python-error-for-more-than-one-node/m-p/880023#M1903</link>
      <description>&lt;DIV style="margin:0px;"&gt;
&lt;DIV id="quote_reply" style="margin-top: 5px; width: 100%;"&gt;
&lt;DIV style="margin-left:2px;margin-right:2px;"&gt;Quoting - &lt;A href="https://community.intel.com/en-us/profile/441375"&gt;ictceeval&lt;/A&gt;&lt;/DIV&gt;
&lt;DIV style="background-color:#E5E5E5; padding:5px;border: 1px; border-style: inset;margin-left:2px;margin-right:2px;"&gt;&lt;EM&gt;ssh cnode1 and so on works perfectly well. lamboot mpd.hosts also works, so I'm pretty sure that establishing connections to the other nodes is not the problem.&lt;BR /&gt;&lt;/EM&gt;&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;P&gt;Hi ictceeval,&lt;/P&gt;
&lt;P&gt;Thanks for posting. Since you're using ssh for remote shell access, you need to specify this on the mpdboot command line:&lt;/P&gt;
&lt;BLOCKQUOTE&gt;$ mpdboot &lt;STRONG&gt;-r ssh&lt;/STRONG&gt; -n 3 -f mpd.hosts&lt;/BLOCKQUOTE&gt;
&lt;P&gt;The default for the Intel MPI Library is rsh.&lt;/P&gt;
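&lt;P&gt;As background, an OSError with Errno 2 coming out of Python's subprocess module is typically raised when the program to be launched cannot be found, which is what you would expect if rsh is not installed. The following is only a minimal sketch for checking which remote shell is available on the launching host; it is not part of mpdboot. It assumes Python 3.3 or later for shutil.which (the traceback in this thread comes from Python 2.6), and find_remote_shell is a hypothetical helper name used for illustration.&lt;/P&gt;
&lt;PRE&gt;
```python
import shutil


def find_remote_shell(candidates=("ssh", "rsh")):
    """Return (name, path) for the first candidate found on PATH, else None.

    Hypothetical helper for illustration only; mpdboot does not provide it.
    """
    for name in candidates:
        path = shutil.which(name)  # None when the executable is not on PATH
        if path is not None:
            return name, path
    return None


if __name__ == "__main__":
    found = find_remote_shell()
    if found is None:
        print("neither ssh nor rsh was found on PATH")
    else:
        name, path = found
        # Suggest the mpdboot invocation from this thread with the shell found.
        print("found {0} at {1}".format(name, path))
        print("try: mpdboot -r {0} -n 3 -f mpd.hosts".format(name))
```
&lt;/PRE&gt;
&lt;P&gt;If the script only finds ssh, passing -r ssh as shown above is the option to use. The check only covers the local host; it says nothing about whether the remote nodes can connect back.&lt;/P&gt;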
&lt;P&gt;Let us know how it goes.&lt;/P&gt;
&lt;P&gt;Regards,&lt;BR /&gt;~Gergana&lt;/P&gt;</description>
      <pubDate>Wed, 02 Sep 2009 17:05:35 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/mpdboot-gives-python-error-for-more-than-one-node/m-p/880023#M1903</guid>
      <dc:creator>Gergana_S_Intel</dc:creator>
      <dc:date>2009-09-02T17:05:35Z</dc:date>
    </item>
    <item>
      <title>Re: mpdboot gives python error for more than one node</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/mpdboot-gives-python-error-for-more-than-one-node/m-p/880024#M1904</link>
      <description>&lt;DIV style="margin:0px;"&gt;
&lt;DIV id="quote_reply" style="width: 100%; margin-top: 5px;"&gt;
&lt;DIV style="margin-left:2px;margin-right:2px;"&gt;Quoting - &lt;A href="https://community.intel.com/en-us/profile/198675"&gt;Gergana Slavova (Intel)&lt;/A&gt;&lt;/DIV&gt;
&lt;DIV style="background-color:#E5E5E5; padding:5px;border: 1px; border-style: inset;margin-left:2px;margin-right:2px;"&gt;&lt;EM&gt;
&lt;P&gt;Hi ictceeval,&lt;/P&gt;
&lt;P&gt;Thanks for posting. Since you're using ssh for remote shell access, you need to specify this on the mpdboot command line:&lt;/P&gt;
&lt;BLOCKQUOTE&gt;$ mpdboot &lt;STRONG&gt;-r ssh&lt;/STRONG&gt; -n 3 -f mpd.hosts&lt;/BLOCKQUOTE&gt;
&lt;P&gt;The default for the Intel MPI Library is rsh.&lt;/P&gt;
&lt;P&gt;Let us know how it goes.&lt;/P&gt;
&lt;P&gt;Regards,&lt;BR /&gt;~Gergana&lt;/P&gt;
&lt;/EM&gt;&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;BR /&gt;Thanks for your quick help! I didn't know that. Unfortunately this seems to lead to another issue:&lt;BR /&gt;&lt;BR /&gt;&amp;gt; mpdboot -n 3 -f mpd.hosts -r ssh -v&lt;BR /&gt;running mpdallexit on istanbul&lt;BR /&gt;LAUNCHED mpd on istanbul via &lt;BR /&gt;RUNNING: mpd on istanbul&lt;BR /&gt;LAUNCHED mpd on cnode1 via istanbul&lt;BR /&gt;LAUNCHED mpd on cnode2 via istanbul&lt;BR /&gt;mpdboot_istanbul (handle_mpd_output 828): Failed to establish a socket connection with cnode1:41650 : [Errno 111] Connection refused&lt;BR /&gt;mpdboot_istanbul (handle_mpd_output 845): failed to connect to mpd on cnode1&lt;BR /&gt;&lt;BR /&gt;How do I interpret that output? It says "LAUNCHED mpd on cnode1" and then again "Failed to establish...."?!&lt;BR /&gt;&lt;BR /&gt;UPDATE:&lt;BR /&gt;Somehow, things seem to get out of hand. Now, I'm getting this message:&lt;BR /&gt;&amp;gt; mpdboot -n 3 -f mpd.hosts -r ssh -v --chkup&lt;BR /&gt;checking cnode1&lt;BR /&gt;checking cnode2&lt;BR /&gt;there are 3 hosts up (counting local)&lt;BR /&gt;running mpdallexit on istanbul&lt;BR /&gt;LAUNCHED mpd on istanbul via &lt;BR /&gt;RUNNING: mpd on istanbul&lt;BR /&gt;LAUNCHED mpd on cnode1 via istanbul&lt;BR /&gt;LAUNCHED mpd on cnode2 via istanbul&lt;BR /&gt;mpdboot_istanbul (handle_mpd_output 837): failed to ping mpd on cnode1; received output={}&lt;BR /&gt;&lt;BR /&gt;</description>
      <pubDate>Wed, 02 Sep 2009 17:15:31 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/mpdboot-gives-python-error-for-more-than-one-node/m-p/880024#M1904</guid>
      <dc:creator>kern</dc:creator>
      <dc:date>2009-09-02T17:15:31Z</dc:date>
    </item>
    <item>
      <title>Re: mpdboot gives python error for more than one node</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/mpdboot-gives-python-error-for-more-than-one-node/m-p/880025#M1905</link>
      <description>Hi Gergana!&lt;BR /&gt;&lt;BR /&gt;I'm glad to report: problem solved. The other issue was that the slave nodes couldn't talk back to the master node due to missing entries in /etc/hosts and missing ssh-keys. Having fixed that, I am now able to set up an MPI ring.&lt;BR /&gt;&lt;BR /&gt;Thank you very much for your help!&lt;BR /&gt;&lt;BR /&gt;ictceeval&lt;BR /&gt;</description>
      <pubDate>Wed, 02 Sep 2009 18:09:10 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/mpdboot-gives-python-error-for-more-than-one-node/m-p/880025#M1905</guid>
      <dc:creator>kern</dc:creator>
      <dc:date>2009-09-02T18:09:10Z</dc:date>
    </item>
    <item>
      <title>Re: mpdboot gives python error for more than one node</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/mpdboot-gives-python-error-for-more-than-one-node/m-p/880026#M1906</link>
      <description>&lt;DIV style="margin:0px;"&gt;
&lt;DIV id="quote_reply" style="margin-top: 5px; width: 100%;"&gt;
&lt;DIV style="margin-left:2px;margin-right:2px;"&gt;Quoting - &lt;A href="https://community.intel.com/en-us/profile/441375"&gt;ictceeval&lt;/A&gt;&lt;/DIV&gt;
&lt;DIV style="background-color:#E5E5E5; padding:5px;border: 1px; border-style: inset;margin-left:2px;margin-right:2px;"&gt;&lt;EM&gt;I'm glad to report: problem solved. The other issue was that the slave nodes couldn't talk back to the master node due to missing entries in /etc/hosts and missing ssh-keys. Having fixed that, I am now able to set up and MPI ring.&lt;BR /&gt;&lt;/EM&gt;&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;P&gt;Thanks for letting me know, ictceeval. I'm glad things are working for you now. Have fun using the Intel Cluster Tools!&lt;/P&gt;
&lt;P&gt;Regards,&lt;BR /&gt;~Gergana&lt;/P&gt;</description>
      <pubDate>Wed, 02 Sep 2009 18:25:12 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/mpdboot-gives-python-error-for-more-than-one-node/m-p/880026#M1906</guid>
      <dc:creator>Gergana_S_Intel</dc:creator>
      <dc:date>2009-09-02T18:25:12Z</dc:date>
    </item>
  </channel>
</rss>

