<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>Topic: "I can't run program on slave node" in Intel® MPI Library</title>
    <link>https://community.intel.com/t5/Intel-MPI-Library/I-can-t-run-program-on-slave-node/m-p/899683#M2178</link>
    <description>I can't run my program on the slave node.&lt;BR /&gt;&lt;BR /&gt;I execute "mpirun -f ./mpd.hosts -np 2 ./testcpp":&lt;BR /&gt;=====================================================&lt;BR /&gt;Hello world: rank 0 of 2 running on cluster-master&lt;BR /&gt;Hello world: rank 1 of 2 running on cluster-master&lt;BR /&gt;=====================================================&lt;BR /&gt;It only runs on the master.&lt;BR /&gt;&lt;BR /&gt;I execute "sshconnectivity.exp machines.LINUX":&lt;BR /&gt;=====================================================&lt;BR /&gt;Node count = 2&lt;BR /&gt;Secure shell connectivity was established on all nodes.&lt;BR /&gt;See the log output listing "/tmp/sshconnectivity.user.log" for details.&lt;BR /&gt;Version number: $Revision: 1.18 $&lt;BR /&gt;Version date: $Date: 2008/10/19 04:06:21 $&lt;BR /&gt;=====================================================&lt;BR /&gt;&lt;BR /&gt;The content of mpd.hosts &amp;amp; machines.LINUX is:&lt;BR /&gt;=====================================================&lt;BR /&gt;cluster-master&lt;BR /&gt;cluster-slave1&lt;BR /&gt;=====================================================&lt;BR /&gt;and it is saved in /home/user on both master &amp;amp; slave.&lt;BR /&gt;&lt;BR /&gt;But a problem happens when I execute "mpdboot -f ./mpd.hosts -n 2":&lt;BR /&gt;=====================================================&lt;BR /&gt;mpdboot_cluster-master (handle_mpd_output 739): failed to ping mpd on cluster-slave1; received output={}&lt;BR /&gt;=====================================================&lt;BR /&gt;&lt;BR /&gt;I can ssh from the slave to the master without a password, and from the master to the slave without a password, and I have already turned off the firewall.&lt;BR /&gt;&lt;BR /&gt;Please help me... Thanks!</description>
    <pubDate>Thu, 21 May 2009 00:43:08 GMT</pubDate>
    <dc:creator>camiyu917gmail_com</dc:creator>
    <dc:date>2009-05-21T00:43:08Z</dc:date>
    <item>
      <title>I can't run program on slave node</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/I-can-t-run-program-on-slave-node/m-p/899683#M2178</link>
      <description>I can't run my program on the slave node.&lt;BR /&gt;&lt;BR /&gt;I execute "mpirun -f ./mpd.hosts -np 2 ./testcpp":&lt;BR /&gt;=====================================================&lt;BR /&gt;Hello world: rank 0 of 2 running on cluster-master&lt;BR /&gt;Hello world: rank 1 of 2 running on cluster-master&lt;BR /&gt;=====================================================&lt;BR /&gt;It only runs on the master.&lt;BR /&gt;&lt;BR /&gt;I execute "sshconnectivity.exp machines.LINUX":&lt;BR /&gt;=====================================================&lt;BR /&gt;Node count = 2&lt;BR /&gt;Secure shell connectivity was established on all nodes.&lt;BR /&gt;See the log output listing "/tmp/sshconnectivity.user.log" for details.&lt;BR /&gt;Version number: $Revision: 1.18 $&lt;BR /&gt;Version date: $Date: 2008/10/19 04:06:21 $&lt;BR /&gt;=====================================================&lt;BR /&gt;&lt;BR /&gt;The content of mpd.hosts &amp;amp; machines.LINUX is:&lt;BR /&gt;=====================================================&lt;BR /&gt;cluster-master&lt;BR /&gt;cluster-slave1&lt;BR /&gt;=====================================================&lt;BR /&gt;and it is saved in /home/user on both master &amp;amp; slave.&lt;BR /&gt;&lt;BR /&gt;But a problem happens when I execute "mpdboot -f ./mpd.hosts -n 2":&lt;BR /&gt;=====================================================&lt;BR /&gt;mpdboot_cluster-master (handle_mpd_output 739): failed to ping mpd on cluster-slave1; received output={}&lt;BR /&gt;=====================================================&lt;BR /&gt;&lt;BR /&gt;I can ssh from the slave to the master without a password, and from the master to the slave without a password, and I have already turned off the firewall.&lt;BR /&gt;&lt;BR /&gt;Please help me... Thanks!</description>
      <pubDate>Thu, 21 May 2009 00:43:08 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/I-can-t-run-program-on-slave-node/m-p/899683#M2178</guid>
      <dc:creator>camiyu917gmail_com</dc:creator>
      <dc:date>2009-05-21T00:43:08Z</dc:date>
    </item>
    <item>
      <title>Re: I can't run program on slave node</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/I-can-t-run-program-on-slave-node/m-p/899684#M2179</link>
      <description>&lt;DIV style="margin:0px;"&gt;
&lt;DIV id="quote_reply" style="margin-top: 5px; width: 100%;"&gt;
&lt;DIV style="margin-left:2px;margin-right:2px;"&gt;Quoting - &lt;A href="https://community.intel.com/en-us/profile/428759"&gt;camiyu917gmail.com&lt;/A&gt;&lt;/DIV&gt;
&lt;DIV style="background-color:#E5E5E5; padding:5px;border: 1px; border-style: inset;margin-left:2px;margin-right:2px;"&gt;&lt;EM&gt;I can't run program on slave node.&lt;BR /&gt;&lt;BR /&gt;execute "mpirun -f ./mpd.hosts -np 2 ./testcpp"&lt;BR /&gt;&lt;SPAN class="sectionHeadingText"&gt;but have a problem happen when I execute "mpdboot -f ./mpd.hosts -n 2"&lt;BR /&gt;&lt;/SPAN&gt;&lt;/EM&gt;&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;BR /&gt;Hello Camiyu917,&lt;BR /&gt;&lt;BR /&gt;It seems you have configured your cluster to use ssh, so you can try adding "-r ssh" to both commands.&lt;BR /&gt;If you need to start one process per host, you can set I_MPI_PERHOST to 1 and check it by launching "mpirun -n 2 ..." - it should start 2 processes on 2 different nodes.&lt;BR /&gt;&lt;BR /&gt;Regards!&lt;BR /&gt; Dmitry&lt;BR /&gt;
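&lt;BR /&gt;PS: something like this, for example (a minimal sketch; it assumes mpd.hosts is in the current directory and ./testcpp is your test binary):&lt;BR /&gt;&lt;PRE&gt;# one MPI process per host, launched over ssh (sketch)
export I_MPI_PERHOST=1
mpirun -r ssh -f ./mpd.hosts -n 2 ./testcpp&lt;/PRE&gt;</description>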
      <pubDate>Thu, 21 May 2009 08:45:23 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/I-can-t-run-program-on-slave-node/m-p/899684#M2179</guid>
      <dc:creator>Dmitry_K_Intel2</dc:creator>
      <dc:date>2009-05-21T08:45:23Z</dc:date>
    </item>
    <item>
      <title>Re: I can't run program on slave node</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/I-can-t-run-program-on-slave-node/m-p/899685#M2180</link>
      <description>&lt;DIV style="margin:0px;"&gt;
&lt;DIV id="quote_reply" style="width: 100%; margin-top: 5px;"&gt;
&lt;DIV style="margin-left:2px;margin-right:2px;"&gt;Quoting - &lt;A href="https://community.intel.com/en-us/profile/423452"&gt;Dmitry Kuzmin (Intel)&lt;/A&gt;&lt;/DIV&gt;
&lt;DIV style="background-color:#E5E5E5; padding:5px;border: 1px; border-style: inset;margin-left:2px;margin-right:2px;"&gt;&lt;EM&gt; &lt;BR /&gt;Hello Camiyu917,&lt;BR /&gt;&lt;BR /&gt;Seems you have configured your cluster for using ssh, so I think that you cantry to add "-r ssh" to both commands?&lt;BR /&gt;If you need to start 1 process per host you can set I_MPI_PERHOST to 1 and checkit launching "mpirun -n 2..." - it should start 2 process on 2 different nodes.&lt;BR /&gt;&lt;BR /&gt;Regards!&lt;BR /&gt; Dmitry&lt;BR /&gt;&lt;/EM&gt;&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;BR /&gt;Dmitry, thanks for your reply.&lt;BR /&gt;&lt;BR /&gt;I ran into another problem.&lt;BR /&gt;&lt;BR /&gt;I tried to execute "&lt;STRONG&gt;mpdboot -n 2 -f ./mpd.hosts -r ssh&lt;/STRONG&gt;", but I get this error:&lt;BR /&gt;=============================================================&lt;BR /&gt;mpdboot_cluster-master (handle_mpd_output 730): Failed to establish a socket connection with cluster-slave1:33736 : (111, 'Connection refused')&lt;BR /&gt;mpdboot_cluster-master (handle_mpd_output 747): failed to connect to mpd on cluster-slave1&lt;BR /&gt;=============================================================&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;After "&lt;STRONG&gt;export I_MPI_PERHOST=1&lt;/STRONG&gt;", I execute "&lt;STRONG&gt;mpirun -n 2 -f ./mpd.hosts -r ssh ./testcpp&lt;/STRONG&gt;"&lt;BR /&gt;and I get this error:&lt;BR /&gt;=============================================================&lt;BR /&gt;mpiexec_cluster-master (mpiexec 841): no msg recvd from mpd when expecting ack of request. Please examine the /tmp/mpd2.logfile_user log file on each node of the ring.&lt;BR /&gt;=============================================================&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;Thanks for your help ~ ^__^&lt;BR /&gt;</description>
      <pubDate>Thu, 21 May 2009 10:35:19 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/I-can-t-run-program-on-slave-node/m-p/899685#M2180</guid>
      <dc:creator>camiyu917gmail_com</dc:creator>
      <dc:date>2009-05-21T10:35:19Z</dc:date>
    </item>
    <item>
      <title>Re: I can't run program on slave node</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/I-can-t-run-program-on-slave-node/m-p/899686#M2181</link>
      <description>&lt;DIV style="margin:0px;"&gt;Camiyu917, could you send /tmp/mpd2.logfile_user files? This is very strange error.&lt;BR /&gt;&lt;BR /&gt;Best wishes!&lt;BR /&gt; Dmitry&lt;BR /&gt;&lt;/DIV&gt;
&lt;BR /&gt;</description>
      <pubDate>Thu, 21 May 2009 13:37:11 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/I-can-t-run-program-on-slave-node/m-p/899686#M2181</guid>
      <dc:creator>Dmitry_K_Intel2</dc:creator>
      <dc:date>2009-05-21T13:37:11Z</dc:date>
    </item>
    <item>
      <title>Re: I can't run program on slave node</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/I-can-t-run-program-on-slave-node/m-p/899687#M2182</link>
      <description>&lt;DIV style="margin: 0px; height: auto;"&gt;&lt;/DIV&gt;
&lt;BR /&gt;I am execute "&lt;SPAN style="color: #0000ff;"&gt;&lt;STRONG&gt;&lt;SPAN style="font-size: small;"&gt;export I_MPI_PERHOST=1&lt;/SPAN&gt;&lt;/STRONG&gt;&lt;/SPAN&gt;" and "&lt;SPAN style="font-size: small;"&gt;&lt;STRONG&gt;&lt;SPAN style="color: #0000ff;"&gt;mpirun -n 2 -f ./mpd.hosts -r ssh ./testcpp&lt;/SPAN&gt;&lt;/STRONG&gt;&lt;/SPAN&gt;".&lt;BR /&gt;&lt;BR /&gt;============== mpd2.logfile_user_090522.055102_15438 ================&lt;BR /&gt;logfile for mpd with pid 15491&lt;BR /&gt;cluster-master_37718 (handle_rhs_input 2145): connection with the right neighboring mpd daemon was lost; attempting to re-enter the mpd ring&lt;BR /&gt;cluster-master_37718 (reenter_ring 691): reenter_ring returned 0 after 1 tries&lt;BR /&gt;cluster-master_37718 (handle_rhs_input 2152): the daemon successfully reentered the mpd ring&lt;BR /&gt;=========================================================&lt;BR /&gt;&lt;BR /&gt;thanks you~&lt;BR /&gt;</description>
      <pubDate>Thu, 21 May 2009 21:55:08 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/I-can-t-run-program-on-slave-node/m-p/899687#M2182</guid>
      <dc:creator>camiyu917gmail_com</dc:creator>
      <dc:date>2009-05-21T21:55:08Z</dc:date>
    </item>
    <item>
      <title>Re: I can't run program on slave node</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/I-can-t-run-program-on-slave-node/m-p/899688#M2183</link>
      <description>&lt;DIV style="margin:0px;"&gt;Hi Camiyu917,&lt;BR /&gt;&lt;BR /&gt;Could you try to run `mpdboot -r ssh -f mpd.hosts -n 2 --chkuponly`&lt;BR /&gt;If it doesn't work it means that ssh doesn't work properly.&lt;BR /&gt;&lt;BR /&gt;And try please the following commands:&lt;BR /&gt;`mpdboot -r ssh -f mpd.hosts -n 2`
&lt;P&gt;`mpdtrace`&lt;/P&gt;
&lt;P&gt;`mpiexec -genv I_MPI_PERHOST 1 -n 2 hostname`&lt;/P&gt;
&lt;P&gt;`mpiexec -genv I_MPI_PERHOST 1 -n 2 ./testcpp`&lt;/P&gt;
&lt;P&gt;`mpdallexit`&lt;/P&gt;
&lt;P&gt;`mpirun -r ssh -f ./mpd.hosts -genv I_MPI_PERHOST 1 -n 2 ./testcpp`&lt;/P&gt;
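&lt;P&gt;For convenience, you could run the whole sequence in one go and capture everything to a file (a sketch; the mpi_check.log name is arbitrary):&lt;/P&gt;
&lt;PRE&gt;# run the full diagnostic sequence, capturing stdout and stderr (sketch)
{
  mpdboot -r ssh -f mpd.hosts -n 2
  mpdtrace
  mpiexec -genv I_MPI_PERHOST 1 -n 2 hostname
  mpiexec -genv I_MPI_PERHOST 1 -n 2 ./testcpp
  mpdallexit
  mpirun -r ssh -f ./mpd.hosts -genv I_MPI_PERHOST 1 -n 2 ./testcpp
} 2&gt;&amp;1 | tee mpi_check.log&lt;/PRE&gt;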
&lt;BR /&gt;Let me know the output you see.&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;And please send me /etc/hosts as well.&lt;BR /&gt;&lt;BR /&gt;Best wishes.&lt;BR /&gt; Dmitry&lt;/DIV&gt;
&lt;BR /&gt;</description>
      <pubDate>Mon, 25 May 2009 10:59:19 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/I-can-t-run-program-on-slave-node/m-p/899688#M2183</guid>
      <dc:creator>Dmitry_K_Intel2</dc:creator>
      <dc:date>2009-05-25T10:59:19Z</dc:date>
    </item>
    <item>
      <title>Re: I can't run program on slave node</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/I-can-t-run-program-on-slave-node/m-p/899689#M2184</link>
      <description>&lt;SPAN style="color: #0000ff;"&gt;&lt;SPAN style="color: #000000;"&gt;Hello Dmitry:&lt;BR /&gt;&lt;BR /&gt;&lt;/SPAN&gt;mpdboot -r ssh -f mpd.hosts -n 2 --chkuponly&lt;/SPAN&gt;&lt;BR /&gt;====================================================&lt;BR /&gt;checking cluster-slave1&lt;BR /&gt;there are 2 hosts up (counting local)&lt;BR /&gt;====================================================&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN style="color: #0000ff;"&gt;mpdboot -r ssh -f mpd.hosts -n 2&lt;/SPAN&gt;&lt;BR /&gt;====================================================&lt;BR /&gt;mpdboot_cluster-master (handle_mpd_output 730): Failed to establish a socket connection with cluster-slave1:33674 : (111, 'Connection refused')&lt;BR /&gt;mpdboot_cluster-master (handle_mpd_output 747): failed to connect to mpd on cluster-slave1&lt;BR /&gt;====================================================&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN style="color: #0000ff;"&gt;/etc/hosts&lt;/SPAN&gt;&lt;BR /&gt;====================================================&lt;BR /&gt;127.0.0.1       localhost&lt;BR /&gt;192.168.2.150   cluster-master cluster-master&lt;BR /&gt;192.168.2.151   cluster-slave1  cluster-slave1&lt;BR /&gt;====================================================&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;thanks for your help.&lt;BR /&gt;&lt;BR /&gt;</description>
      <pubDate>Tue, 26 May 2009 00:33:45 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/I-can-t-run-program-on-slave-node/m-p/899689#M2184</guid>
      <dc:creator>camiyu917gmail_com</dc:creator>
      <dc:date>2009-05-26T00:33:45Z</dc:date>
    </item>
    <item>
      <title>Re: I can't run program on slave node</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/I-can-t-run-program-on-slave-node/m-p/899690#M2185</link>
      <description>&lt;DIV style="margin:0px;"&gt;Hi Camiyu917,&lt;BR /&gt;&lt;BR /&gt;Please login to both servers and check that there is no running mpd.py. Execute 'killall -9 mpd.py' to be sure. (Existing mpd.py processes can prevent a ring creation).&lt;BR /&gt;&lt;BR /&gt;Start 'mpdboot -r ssh -f mpd.hosts -n 2 --debug' (and let me know the output).&lt;BR /&gt;&lt;BR /&gt;Port on your machines (33674) was closed somehow. Might be this is firewall or some other settings. Could you switch off your firewall for short period of time just to check mpi commands?&lt;BR /&gt;&lt;BR /&gt;Best wishes!&lt;BR /&gt; Dmitry&lt;BR /&gt;&lt;BR /&gt;&lt;/DIV&gt;
&lt;BR /&gt;</description>
      <pubDate>Tue, 26 May 2009 06:13:46 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/I-can-t-run-program-on-slave-node/m-p/899690#M2185</guid>
      <dc:creator>Dmitry_K_Intel2</dc:creator>
      <dc:date>2009-05-26T06:13:46Z</dc:date>
    </item>
    <item>
      <title>Re: I can't run program on slave node</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/I-can-t-run-program-on-slave-node/m-p/899691#M2186</link>
      <description>Hello Dmitry:&lt;BR /&gt;&lt;BR /&gt;I have turned off the firewall and checked that no mpd.py process is running on master or slave1.&lt;BR /&gt;&lt;BR /&gt;Then I executed "&lt;STRONG&gt;mpdboot -r ssh -f mpd.hosts -n 2 --debug&lt;/STRONG&gt;" on master and on slave1.&lt;BR /&gt;&lt;BR /&gt;=================== execute on cluster-master =========================&lt;BR /&gt;debug: starting&lt;BR /&gt;running mpdallexit on cluster-master&lt;BR /&gt;debug: launch cmd= env I_MPI_JOB_TAGGED_PORT_OUTPUT=1 /opt/intel/impi/3.2.0.011/bin64/mpd.py   --ncpus=1 --myhost=cluster-master -e -d -s 2&lt;BR /&gt;debug: mpd on cluster-master  on port 45068&lt;BR /&gt;debug: info for running mpd: {'ip': '192.168.2.150', 'ncpus': 1, 'list_port': 45068, 'entry_port': '', 'host': 'cluster-master', 'entry_host': '', 'ifhn': ''}&lt;BR /&gt;debug: launch cmd= ssh -x -n cluster-slave1 'env I_MPI_JOB_TAGGED_PORT_OUTPUT=1 HOSTNAME=$HOSTNAME HOSTTYPE=$HOSTTYPE MACHTYPE=$MACHTYPE OSTYPE=$OSTYPE /opt/intel/impi/3.2.0.011/bin64/mpd.py  -h cluster-master -p 45068 --ifhn=192.168.2.151 --ncpus=1 --myhost=cluster-slave1 --myip=192.168.2.151 -e -d -s 2'&lt;BR /&gt;debug: mpd on cluster-slave1  on port 55976&lt;BR /&gt;debug: info for running mpd: {'ip': '192.168.2.151', 'ncpus': 1, 'list_port': 55976, 'entry_port': 45068, 'host': 'cluster-slave1', 'entry_host': 'cluster-master', 'ifhn': '', 'pid': 11783}&lt;BR /&gt;==============================================================&lt;BR /&gt;&lt;BR /&gt;=================== execute on cluster-slave1 =========================&lt;BR /&gt;debug: starting&lt;BR /&gt;running mpdallexit on cluster-slave1&lt;BR /&gt;debug: launch cmd= env I_MPI_JOB_TAGGED_PORT_OUTPUT=1 /opt/intel/impi/3.2.0.011/bin64/mpd.py --ncpus=1 --myhost=cluster-slave1 -e -d -s 2&lt;BR /&gt;debug: mpd on cluster-slave1 on port 60362&lt;BR /&gt;debug: info for running mpd: {'ip': '192.168.2.151', 'ncpus': 1, 'list_port': 60362, 'entry_port': '', 'host': 'cluster-slave1', 'entry_host': '', 'ifhn': ''}&lt;BR /&gt;debug: launch cmd= ssh -x -n cluster-master 'env I_MPI_JOB_TAGGED_PORT_OUTPUT=1 HOSTNAME=$HOSTNAME HOSTTYPE=$HOSTTYPE MACHTYPE=$MACHTYPE OSTYPE=$OSTYPE /opt/intel/impi/3.2.0.011/bin64/mpd.py -h cluster-slave1 -p 60362 --ifhn=192.168.2.150 --ncpus=1 --myhost=cluster-master --myip=192.168.2.150 -e -d -s 2'&lt;BR /&gt;debug: mpd on cluster-master on port 42052&lt;BR /&gt;debug: info for running mpd: {'ip': '192.168.2.150', 'ncpus': 1, 'list_port': 42052, 'entry_port': 60362, 'host': 'cluster-master', 'entry_host': 'cluster-slave1', 'ifhn': '', 'pid': 30178}&lt;BR /&gt;==============================================================&lt;BR /&gt;&lt;BR /&gt;Thank you.&lt;BR /&gt;&lt;BR /&gt;Best regards,&lt;BR /&gt; John&lt;BR /&gt;</description>
      <pubDate>Wed, 27 May 2009 01:29:51 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/I-can-t-run-program-on-slave-node/m-p/899691#M2186</guid>
      <dc:creator>camiyu917gmail_com</dc:creator>
      <dc:date>2009-05-27T01:29:51Z</dc:date>
    </item>
    <item>
      <title>Re: I can't run program on slave node</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/I-can-t-run-program-on-slave-node/m-p/899692#M2187</link>
      <description>&lt;DIV style="margin:0px;"&gt;Hi John,&lt;BR /&gt;&lt;BR /&gt;Seems you were able to start MPD ring with firewal switched off. To be sure you can run mpdtrace.&lt;BR /&gt;&lt;BR /&gt;Could you try to do the same with firewall switched on. We do NOT recommend to use firewall for MPI application or configure it so that all ports will be available for internal connections.&lt;BR /&gt;&lt;BR /&gt;Best wishes,&lt;BR /&gt; Dmitry&lt;BR /&gt;&lt;/DIV&gt;
&lt;BR /&gt;</description>
      <pubDate>Wed, 27 May 2009 05:46:39 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/I-can-t-run-program-on-slave-node/m-p/899692#M2187</guid>
      <dc:creator>Dmitry_K_Intel2</dc:creator>
      <dc:date>2009-05-27T05:46:39Z</dc:date>
    </item>
    <item>
      <title>Re: I can't run program on slave node</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/I-can-t-run-program-on-slave-node/m-p/899693#M2188</link>
      <description>&lt;DIV style="margin: 0px; height: auto;"&gt;&lt;/DIV&gt;
Hi&lt;BR /&gt;&lt;BR /&gt;I have the same problem with command mpdboot. After execution I got the following log&lt;BR /&gt;
&lt;PRE&gt;[shell]Runing on the host n2114.nodes&lt;BR /&gt;This jobs runs on the following processors:&lt;BR /&gt;n2114.nodes n2114.nodes n2113.nodes n2113.nodes n2112.nodes n2112.nodes&lt;BR /&gt;running mpdallexit on n2114.nodes&lt;BR /&gt;LAUNCHED mpd on n2114.nodes via&lt;BR /&gt;RANNING: mpd on n2114.nodes&lt;BR /&gt;LAUNCHED mpd on n2113.nodes via n2114.nodes&lt;BR /&gt;LAUNCHED mpd on n2112.nodes via n2114.nodes&lt;BR /&gt;mpd_boot_n2114.nodes (handle_mpd_output 730): Failed to establish a socket connection with n2112.nodes:43606 : (111, 'Connection refused')&lt;BR /&gt;mpd_boot_n2114.nodes (handle_mpd_output 747): Failed to connect to mpd on n2112.nodes&lt;BR /&gt;[/shell]&lt;/PRE&gt;
Does anybody know how to fix this problem using only user access to cluster?&lt;BR /&gt;</description>
      <pubDate>Thu, 28 May 2009 09:59:23 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/I-can-t-run-program-on-slave-node/m-p/899693#M2188</guid>
      <dc:creator>smtp12357</dc:creator>
      <dc:date>2009-05-28T09:59:23Z</dc:date>
    </item>
    <item>
      <title>Re: I can't run program on slave node</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/I-can-t-run-program-on-slave-node/m-p/899694#M2189</link>
      <description>&lt;DIV style="margin:0px;"&gt;
&lt;DIV id="quote_reply" style="margin-top: 5px; width: 100%;"&gt;
&lt;DIV style="margin-left:2px;margin-right:2px;"&gt;Quoting - &lt;A href="https://community.intel.com/en-us/profile/416928"&gt;smtp12357&lt;/A&gt;&lt;/DIV&gt;
&lt;DIV style="background-color:#E5E5E5; padding:5px;border: 1px; border-style: inset;margin-left:2px;margin-right:2px;"&gt;&lt;EM&gt;HLAUNCHED mpd on n2114.nodes via&lt;BR /&gt;LAUNCHED mpd on n2112.nodes via n2114.nodes&lt;BR /&gt;mpd_boot_n2114.nodes (handle_mpd_output 730): Failed to establish a socket connection with n2112.nodes:43606 : (111, 'Connection refused')&lt;BR /&gt;&lt;/EM&gt;&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;BR /&gt;&lt;BR /&gt;Hi smtp12357,&lt;BR /&gt;&lt;BR /&gt;Yeah, it seems you have the same problem. You can start the mpds, but your mpds cannot open a connection - it looks like the ports are closed. This might be the firewall. Could you ask your sysadmin to open the tcp ports for internal connections, or just switch the firewall off?&lt;BR /&gt;&lt;BR /&gt;Best wishes,&lt;BR /&gt; Dmitry&lt;BR /&gt;</description>
      <pubDate>Thu, 28 May 2009 11:53:56 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/I-can-t-run-program-on-slave-node/m-p/899694#M2189</guid>
      <dc:creator>Dmitry_K_Intel2</dc:creator>
      <dc:date>2009-05-28T11:53:56Z</dc:date>
    </item>
    <item>
      <title>Re: I can't run program on slave node</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/I-can-t-run-program-on-slave-node/m-p/899695#M2190</link>
      <description>Dmitry, thank you for the explanation.&lt;BR /&gt;I asked our sysadmin and he solved the problem. Now mpdboot works well.&lt;BR /&gt;</description>
      <pubDate>Sat, 30 May 2009 19:16:40 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/I-can-t-run-program-on-slave-node/m-p/899695#M2190</guid>
      <dc:creator>smtp12357</dc:creator>
      <dc:date>2009-05-30T19:16:40Z</dc:date>
    </item>
    <item>
      <title>Re: I can't run program on slave node</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/I-can-t-run-program-on-slave-node/m-p/899696#M2191</link>
      <description>&lt;DIV style="margin:0px;"&gt;
&lt;DIV id="quote_reply" style="width: 100%; margin-top: 5px;"&gt;
&lt;DIV style="margin-left:2px;margin-right:2px;"&gt;Quoting - &lt;A href="https://community.intel.com/en-us/profile/423452"&gt;Dmitry Kuzmin (Intel)&lt;/A&gt;&lt;/DIV&gt;
&lt;DIV style="background-color:#E5E5E5; padding:5px;border: 1px; border-style: inset;margin-left:2px;margin-right:2px;"&gt;&lt;EM&gt;
&lt;DIV style="margin:0px;"&gt;Hi John,&lt;BR /&gt;&lt;BR /&gt;Seems you were able to start MPD ring with firewal switched off. To be sure you can run mpdtrace.&lt;BR /&gt;&lt;BR /&gt;Could you try to do the same with firewall switched on. We do NOT recommend to use firewall for MPI application or configure it so that all ports will be available for internal connections.&lt;BR /&gt;&lt;BR /&gt;Best wishes,&lt;BR /&gt; Dmitry&lt;BR /&gt;&lt;/DIV&gt;
&lt;BR /&gt;&lt;/EM&gt;&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;BR /&gt;Hello Dmitry:&lt;BR /&gt;&lt;BR /&gt;I have the firewall switched on, and I made sure the master can ssh to slave1 without a password and slave1 can log in to the master too.&lt;BR /&gt;&lt;BR /&gt;I executed the following instructions:&lt;BR /&gt;&lt;BR /&gt;&lt;STRONG&gt;mpdboot -r ssh -f mpd.hosts -n 2 --chkuponly&lt;/STRONG&gt;&lt;BR /&gt;====================================================&lt;BR /&gt;checking cluster-slave1&lt;BR /&gt;there are 2 hosts up (counting local)&lt;BR /&gt;====================================================&lt;BR /&gt;&lt;BR /&gt;&lt;STRONG&gt;mpdboot -r ssh -f mpd.hosts -n 2&lt;/STRONG&gt;&lt;BR /&gt; -- no message&lt;BR /&gt;&lt;BR /&gt;&lt;STRONG&gt;mpdtrace&lt;/STRONG&gt;&lt;BR /&gt;====================================================&lt;BR /&gt;cluster-master&lt;BR /&gt;cluster-slave1&lt;BR /&gt;====================================================&lt;BR /&gt;&lt;BR /&gt;&lt;STRONG&gt;mpiexec -genv I_MPI_PERHOST 1 -n 2 hostname&lt;/STRONG&gt;&lt;BR /&gt;====================================================&lt;BR /&gt;mpiexec_cluster-master (mpiexec 841): no msg recvd from mpd when expecting ack of request. Please examine the /tmp/mpd2.logfile_user log file on each node of the ring.&lt;BR /&gt;====================================================&lt;BR /&gt;&lt;BR /&gt;&lt;STRONG&gt;mpiexec -genv I_MPI_PERHOST 1 -n 2 ./testcpp&lt;/STRONG&gt;&lt;BR /&gt;====================================================&lt;BR /&gt;Hello world: rank 0 of 2 running on cluster-master&lt;BR /&gt;Hello world: rank 1 of 2 running on cluster-master&lt;BR /&gt;====================================================&lt;BR /&gt;&lt;BR /&gt;&lt;STRONG&gt;mpdallexit&lt;/STRONG&gt;&lt;BR /&gt; -- no message&lt;BR /&gt;&lt;BR /&gt;&lt;STRONG&gt;mpirun -r ssh -f ./mpd.hosts -genv I_MPI_PERHOST 1 -n 2 ./testcpp&lt;/STRONG&gt;&lt;BR /&gt;====================================================&lt;BR /&gt;(mpiexec 841): no msg recvd from mpd when expecting ack of request. Please examine the /tmp/mpd2.logfile_user log file on each node of the ring.&lt;BR /&gt;====================================================&lt;BR /&gt;&lt;BR /&gt;=============== &lt;STRONG&gt;mpd2.logfile_user_090601.094125_3962&lt;/STRONG&gt; ======================&lt;BR /&gt;cluster-master_48124 (handle_rhs_input 2145): connection with the right neighboring mpd daemon was lost; attempting to re-enter the mpd ring&lt;BR /&gt;cluster-master_48124 (reenter_ring 691): reenter_ring returned 0 after 1 tries&lt;BR /&gt;cluster-master_48124 (handle_rhs_input 2152): the daemon successfully reentered the mpd ring&lt;BR /&gt;===========================================================================&lt;BR /&gt;&lt;BR /&gt;I do not know how to solve this problem...&lt;BR /&gt;I then tried the following instructions and got very strange results.&lt;BR /&gt;&lt;BR /&gt;&lt;STRONG&gt;mpdboot -r ssh -f mpd.hosts -n 2&lt;/STRONG&gt;&lt;BR /&gt;-- no message&lt;BR /&gt;&lt;BR /&gt;&lt;STRONG&gt;mpiexec -n 8 ./testcpp&lt;/STRONG&gt;&lt;BR /&gt;====================================================&lt;BR /&gt;mpiexec_cluster-master (mpiexec 841): no msg recvd from mpd when expecting ack of request. Please examine the /tmp/mpd2.logfile_user log file on each node of the ring.&lt;BR /&gt;====================================================&lt;BR /&gt;&lt;BR /&gt;&lt;STRONG&gt;mpiexec -n 2 ./testcpp&lt;/STRONG&gt;&lt;BR /&gt;====================================================&lt;BR /&gt;Hello world: rank 0 of 2 running on cluster-master&lt;BR /&gt;Hello world: rank 1 of 2 running on cluster-master&lt;BR /&gt;====================================================&lt;BR /&gt;&lt;BR /&gt;&lt;STRONG&gt;mpiexec -n 4 ./testcpp&lt;/STRONG&gt;&lt;BR /&gt;====================================================&lt;BR /&gt;Hello world: rank 0 of 4 running on cluster-master&lt;BR /&gt;Hello world: rank 1 of 4 running on cluster-master&lt;BR /&gt;Hello world: rank 2 of 4 running on cluster-master&lt;BR /&gt;Hello world: rank 3 of 4 running on cluster-master&lt;BR /&gt;====================================================&lt;BR /&gt;&lt;BR /&gt;&lt;STRONG&gt;mpiexec -n 8 ./testcpp&lt;/STRONG&gt;&lt;BR /&gt;====================================================&lt;BR /&gt;Hello world: rank 0 of 8 running on cluster-master&lt;BR /&gt;Hello world: rank 1 of 8 running on cluster-master&lt;BR /&gt;Hello world: rank 2 of 8 running on cluster-master&lt;BR /&gt;Hello world: rank 3 of 8 running on cluster-master&lt;BR /&gt;Hello world: rank 4 of 8 running on cluster-master&lt;BR /&gt;Hello world: rank 5 of 8 running on cluster-master&lt;BR /&gt;Hello world: rank 6 of 8 running on cluster-master&lt;BR /&gt;Hello world: rank 7 of 8 running on cluster-master&lt;BR /&gt;====================================================&lt;BR /&gt;&lt;BR /&gt;The cluster-master's CPU is an Intel Pentium D 925+; this CPU has just 1 core and 2 hyper-threads.&lt;BR /&gt;At first, when I executed `mpiexec -n 8` I got an error message, but in the end the program ran on cluster-master with 8 ranks. Is this a bug?&lt;BR /&gt;&lt;BR /&gt;If you need it, I can email our TeamViewer ID and password to you.&lt;BR /&gt;Thanks for your help.&lt;BR /&gt;&lt;BR /&gt;Best regards,&lt;BR /&gt; John&lt;BR /&gt;</description>
      <pubDate>Mon, 01 Jun 2009 01:44:04 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/I-can-t-run-program-on-slave-node/m-p/899696#M2191</guid>
      <dc:creator>camiyu917gmail_com</dc:creator>
      <dc:date>2009-06-01T01:44:04Z</dc:date>
    </item>
    <item>
      <title>Re: I can't run program on slave node</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/I-can-t-run-program-on-slave-node/m-p/899697#M2192</link>
      <description>&lt;DIV style="margin:0px;"&gt;
&lt;DIV id="quote_reply" style="margin-top: 5px; width: 100%;"&gt;
&lt;DIV style="margin-left:2px;margin-right:2px;"&gt;Quoting - &lt;A href="https://community.intel.com/en-us/profile/428759"&gt;camiyu917gmail.com&lt;/A&gt;&lt;/DIV&gt;
&lt;DIV style="background-color:#E5E5E5; padding:5px;border: 1px; border-style: inset;margin-left:2px;margin-right:2px;"&gt;&lt;EM&gt;&lt;BR /&gt;Hello Dmitry:&lt;BR /&gt;&lt;BR /&gt;I have &lt;SPAN style="background-color: #ff0000;"&gt;firewall switch on&lt;/SPAN&gt;, then I sure master can user ssh login slave1 without password and slave1 can login master too.&lt;BR /&gt;&lt;BR /&gt;I execute follow instruction:&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN style="color: #0000ff;"&gt;&lt;SPAN style="font-size: small;"&gt;&lt;STRONG&gt;mpdallexit&lt;/STRONG&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;BR /&gt;-- no message&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN style="color: #0000ff;"&gt;&lt;SPAN style="font-size: small;"&gt;&lt;STRONG&gt;mpirun -r ssh -f ./mpd.hosts -genv I_MPI_PERHOST 1 -n 2 ./testcpp&lt;/STRONG&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;BR /&gt;====================================================&lt;BR /&gt;(mpiexec 841): no msg recvd from mpd when expecting ack of request. Please examine the /tmp/mpd2.logfile_user log file on each node of the ring.&lt;BR /&gt;====================================================&lt;BR /&gt;&lt;BR /&gt;=============== &lt;STRONG&gt;mpd2.logfile_user_090601.094125_3962&lt;/STRONG&gt; ======================&lt;BR /&gt;cluster-master_48124 (handle_rhs_input 2145): connection with the right neighboring mpd daemon was lost; attempting to re-enter the mpd ring&lt;BR /&gt;cluster-master_48124 (reenter_ring 691): reenter_ring returned 0 after 1 tries&lt;BR /&gt;cluster-master_48124 (handle_rhs_input 2152): the daemon successfully reentered the mpd ring&lt;BR /&gt;===========================================================================&lt;BR /&gt;&lt;BR /&gt;I do not know how to solve this problem...&lt;BR /&gt;I try execute follow instruction. I get very strange message.&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN style="color: #993366;"&gt;&lt;SPAN style="font-size: small;"&gt;&lt;STRONG&gt;mpdboot -r ssh -f mpd.hosts -n 2&lt;/STRONG&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;BR /&gt;-- no message&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN style="color: #993366;"&gt;&lt;SPAN style="font-size: small;"&gt;&lt;STRONG&gt;mpiexec -n 8 ./testcpp&lt;/STRONG&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;BR /&gt;====================================================&lt;BR /&gt;mpiexec_cluster-master (mpiexec 841): no msg recvd from mpd when expecting ack of request. Please examine the /tmp/mpd2.logfile_user log file on each node of the ring.&lt;BR /&gt;====================================================&lt;BR /&gt;&lt;BR /&gt;&lt;SPAN style="color: #993366;"&gt;&lt;SPAN style="font-size: small;"&gt;&lt;STRONG&gt;mpiexec -n 2 ./testcpp&lt;/STRONG&gt;&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;BR /&gt;====================================================&lt;BR /&gt;Hello world: rank 0 of 2 running on cluster-master&lt;BR /&gt;Hello world: rank 1 of 2 running on cluster-master&lt;BR /&gt;====================================================&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;The cluster-master's CPU is Intel Pentium D 925+, this CPU just have 1 core and 2 hyper-thread.&lt;BR /&gt;First, I execute `mpiexec -n 8` I got error message, but finall this program run on cluster-master user 8 core. Is this bug?&lt;BR /&gt;&lt;BR /&gt;If you need, I can email our Teamviewer's ID and password to you.&lt;BR /&gt;thanks for your help. &lt;BR /&gt;&lt;BR /&gt;Best Regard&lt;BR /&gt;John&lt;BR /&gt;&lt;/EM&gt;&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;P&gt;&lt;BR /&gt;Hi John,&lt;BR /&gt;&lt;BR /&gt;From the first part of your question, it seems to me that the firewall doesn't allow a connection to be established between mpiexec and mpd. To get more info you can use the --verbose switch.&lt;BR /&gt;Could you send the log file from cluster-slave1? That is the most interesting file.&lt;BR /&gt;&lt;BR /&gt;Second part: mpd itself is smart enough to rebuild the mpd ring. The message "connection with the right neighboring mpd daemon was lost; attempting to re-enter the mpd ring" means that a new ring is created - in your case only one node is left - so your task will be executed on one node only.&lt;BR /&gt;&lt;BR /&gt;You can start all the processes on one node - no problem - but the performance will not be as good as when you start them across nodes in parallel.&lt;BR /&gt;&lt;BR /&gt;You can write to me directly: &lt;A href="mailto:dmitry.kuzmin@intel.com"&gt;dmitry.kuzmin (at) intel.com&lt;/A&gt;&lt;/P&gt;
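&lt;P&gt;For example (a sketch; the actual log file names carry timestamp suffixes, as in your earlier posts):&lt;/P&gt;
&lt;PRE&gt;# boot the ring with extra diagnostics, then pull the slave-side mpd log (sketch)
mpdboot -r ssh -f mpd.hosts -n 2 --verbose --debug
ssh cluster-slave1 'ls -lt /tmp/mpd2.logfile_*'
scp cluster-slave1:/tmp/mpd2.logfile_* .&lt;/PRE&gt;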
&lt;P&gt;Best wishes,&lt;BR /&gt; Dmitry&lt;/P&gt;</description>
      <pubDate>Mon, 01 Jun 2009 09:07:31 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/I-can-t-run-program-on-slave-node/m-p/899697#M2192</guid>
      <dc:creator>Dmitry_K_Intel2</dc:creator>
      <dc:date>2009-06-01T09:07:31Z</dc:date>
    </item>
  </channel>
</rss>

