<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: mpd error in Intel® MPI Library</title>
    <link>https://community.intel.com/t5/Intel-MPI-Library/mpd-error/m-p/901747#M2226</link>
    <description>&lt;DIV style="margin:0px;"&gt;
&lt;DIV id="quote_reply" style="width: 100%; margin-top: 5px;"&gt;
&lt;DIV style="margin-left:2px;margin-right:2px;"&gt;I was mistaken. I no longer have an account where "mpdboot" works correctly! I can launch a task only with the mpirun command (without PBS). I get this error:&lt;BR /&gt;&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;P&gt;qwer@mgr:/mnt/share/piex&amp;gt; mpdboot -r ssh -n 6 -f mpd.hosts&lt;BR /&gt;mpdboot_mgr (handle_mpd_output 828): Failed to establish a socket connection with ib-cn01:42335 : (111, 'Connection refused')&lt;BR /&gt;mpdboot_mgr (handle_mpd_output 845): failed to connect to mpd on ib-cn01&lt;/P&gt;
&lt;P&gt;&lt;/P&gt;
&lt;P&gt;!!!&lt;/P&gt;
&lt;P&gt;Killing all processes for user qwer on all nodes helped mpdboot start manually. Now mpdboot boots correctly! And it boots correctly under PBS too. WHY???&lt;/P&gt;</description>
    <pubDate>Sat, 14 Nov 2009 06:48:01 GMT</pubDate>
    <dc:creator>altlogic09</dc:creator>
    <dc:date>2009-11-14T06:48:01Z</dc:date>
    <item>
      <title>mpd error</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/mpd-error/m-p/901739#M2218</link>
      <description>&lt;P&gt;Hi!&lt;/P&gt;
&lt;P&gt;I have a problem with Altair PBS PRO + Intel MPI. I can launch a task with the mpiexec command on several nodes, but when I try to launch it on several nodes under PBS I get an error.&lt;/P&gt;
&lt;P&gt;What I am doing:&lt;BR /&gt;1) Starting mpd on the nodes:&lt;BR /&gt;&lt;EM&gt;qwer@mgr:/mnt/share/piex&amp;gt; cat mpd.hosts&lt;BR /&gt;ib-mgr:10&lt;BR /&gt;ib-cn01:16&lt;BR /&gt;ib-cn02:16&lt;BR /&gt;ib-cn03:16&lt;BR /&gt;ib-cn04:16&lt;BR /&gt;ib-cn05:16&lt;BR /&gt;qwer@mgr:/mnt/share/piex&amp;gt; mpdboot -n 6 -f mpd.hosts -r ssh&lt;/EM&gt;&lt;BR /&gt;&lt;BR /&gt;2) Checking:&lt;BR /&gt;&lt;EM&gt;qwer@mgr:/mnt/share/piex&amp;gt; mpdtrace&lt;BR /&gt;ib-mgr&lt;BR /&gt;ib-cn04&lt;BR /&gt;ib-cn03&lt;BR /&gt;ib-cn02&lt;BR /&gt;ib-cn01&lt;BR /&gt;ib-cn05&lt;/EM&gt;&lt;BR /&gt;&lt;BR /&gt;3) Starting the MPI program without PBS:&lt;BR /&gt;&lt;EM&gt;qwer@mgr:/mnt/share/piex&amp;gt; mpiexec -ppn 10 -n 50 /mnt/share/piex/pi -nolocal&lt;BR /&gt;&lt;/EM&gt;&lt;EM&gt;Process 24 on ib-cn04&lt;BR /&gt;Process 22 on ib-cn04&lt;BR /&gt;Process 13 on ib-mgr &lt;/EM&gt; [Why is -nolocal ignored?]&lt;EM&gt;&lt;BR /&gt;Process 29 on ib-cn04&lt;BR /&gt;Process 21 on ib-cn04&lt;/EM&gt;&lt;/P&gt;
&lt;P&gt;&lt;EM&gt;...&lt;/EM&gt;&lt;EM&gt;&lt;BR /&gt;Process 25 on ib-cn04&lt;BR /&gt;Process 26 on ib-cn04&lt;BR /&gt;Process 36 on ib-cn03&lt;BR /&gt;&lt;BR /&gt;pi = 3.1415926535897931&lt;BR /&gt;time = 0.435737 sec.&lt;/EM&gt;&lt;/P&gt;
&lt;P&gt;OK. The task was launched correctly on all nodes.&lt;/P&gt;
&lt;P&gt;&lt;BR /&gt;4) Making a job file for PBS:&lt;BR /&gt;&lt;EM&gt;qwer@mgr:/mnt/share/piex&amp;gt; cat test.job&lt;BR /&gt;#!/bin/bash&lt;BR /&gt;&lt;BR /&gt;#PBS -q long&lt;BR /&gt;#PBS -l nodes=5:ppn=10,mem=100mb,walltime=1:30:00&lt;BR /&gt;#PBS -S /bin/bash&lt;BR /&gt;#PBS -N piex&lt;BR /&gt;&lt;BR /&gt;echo " Start  date:`/bin/date`"&lt;BR /&gt;mpiexec -ppn 10 -n 50 /mnt/share/piex/pi -nolocal&lt;BR /&gt;echo " End  date:`/bin/date`"&lt;/EM&gt;&lt;BR /&gt;&lt;BR /&gt;5) Starting the MPI program with PBS:&lt;BR /&gt;&lt;EM&gt;qwer@mgr:/mnt/share/piex&amp;gt; qsub test.job&lt;BR /&gt;673.mgr&lt;/EM&gt;&lt;BR /&gt;6) Where is my job?&lt;BR /&gt;&lt;EM&gt;qwer@mgr:/mnt/share/piex&amp;gt; qstat&lt;BR /&gt;&lt;BR /&gt;7) What happened?&lt;BR /&gt;qwer@mgr:/mnt/share/piex&amp;gt; cat piex.o673&lt;BR /&gt; Start  date:  27 13:55:47 VLAT 2009&lt;BR /&gt;mpiexec_mgr: cannot connect to local mpd (/tmp/pbs.673.mgr/mpd2.console_mgr_qwer); possible causes:&lt;BR /&gt; 1. no mpd is running on this host&lt;BR /&gt; 2. an mpd is running but was started without a "console" (-n option)&lt;BR /&gt; End  date:  27 13:55:47 VLAT 2009&lt;/EM&gt;&lt;BR /&gt;&lt;BR /&gt;8) Is mpd really not running?&lt;BR /&gt;&lt;EM&gt;qwer@mgr:/mnt/share/piex&amp;gt; mpdtrace -l&lt;BR /&gt;ib-mgr_60696 (10.10.0.1)&lt;BR /&gt;ib-cn04_41952 (10.10.0.14)&lt;BR /&gt;ib-cn03_43736 (10.10.0.13)&lt;BR /&gt;ib-cn02_45542 (10.10.0.12)&lt;BR /&gt;ib-cn01_52394 (10.10.0.11)&lt;BR /&gt;ib-cn05_44083 (10.10.0.15)&lt;/EM&gt;&lt;BR /&gt;&lt;BR /&gt;What else I tried:&lt;BR /&gt;a) Set an environment variable:&lt;BR /&gt;&lt;EM&gt;qwer@mgr: I_MPI_CPUINFO=/proc/cpuinfo &lt;/EM&gt;&lt;BR /&gt;Result: no change.&lt;BR /&gt;b) Tried to find the connection port that PBS locks for mpd. I think PBS is looking for the connection to the mpd daemon on the wrong port.&lt;/P&gt;
&lt;P&gt;&lt;/P&gt;
&lt;P&gt;What is the cause of my problems?&lt;/P&gt;
&lt;P&gt;&lt;/P&gt;
&lt;P&gt;About my system:&lt;BR /&gt;&lt;BR /&gt;&lt;EM&gt;&lt;STRONG&gt;mgr:~ # cat /etc/SuSE-release&lt;/STRONG&gt;&lt;/EM&gt;&lt;EM&gt;&lt;BR /&gt;&lt;/EM&gt;&lt;EM&gt;SUSE Linux Enterprise Server 10 (x86_64)&lt;/EM&gt;&lt;EM&gt;&lt;BR /&gt;&lt;/EM&gt;&lt;EM&gt;VERSION = 10&lt;/EM&gt;&lt;EM&gt;&lt;BR /&gt;&lt;/EM&gt;&lt;EM&gt;PATCHLEVEL = 1&lt;/EM&gt;&lt;/P&gt;
&lt;EM&gt; &lt;/EM&gt;
&lt;P&gt;&lt;EM&gt;&lt;STRONG&gt;qwer@mgr:/mnt/share/piex&amp;gt; mpiexec -V&lt;/STRONG&gt;&lt;/EM&gt;&lt;EM&gt;&lt;BR /&gt;&lt;/EM&gt;&lt;EM&gt;Intel MPI Library for Linux, 64-bit applications, Version 3.2.1  Build 20090312&lt;/EM&gt;&lt;EM&gt;&lt;BR /&gt;&lt;/EM&gt;&lt;EM&gt;Copyright (C) 2003-2009 Intel Corporation.  All rights reserved.&lt;/EM&gt;&lt;EM&gt;&lt;BR /&gt;&lt;/EM&gt;&lt;EM&gt;&lt;BR /&gt;&lt;/EM&gt;&lt;EM&gt;&lt;STRONG&gt;mgr:~ # qstat -Bf&lt;/STRONG&gt;&lt;/EM&gt;&lt;EM&gt;&lt;BR /&gt;&lt;/EM&gt;&lt;EM&gt;Server: mgr&lt;/EM&gt;&lt;EM&gt;&lt;BR /&gt;&lt;/EM&gt;&lt;EM&gt; server_state = Active&lt;/EM&gt;&lt;EM&gt;&lt;BR /&gt;&lt;/EM&gt;&lt;EM&gt; server_host = extmgr.hp&lt;/EM&gt;&lt;EM&gt;&lt;BR /&gt;&lt;/EM&gt;&lt;EM&gt; scheduling = True&lt;/EM&gt;&lt;EM&gt;&lt;BR /&gt;&lt;/EM&gt;&lt;EM&gt; total_jobs = 1&lt;/EM&gt;&lt;EM&gt;&lt;BR /&gt;&lt;/EM&gt;&lt;EM&gt; state_count = Transit:0 Queued:0 Held:0 Waiting:0 Running:1 Exiting:0 Begun&lt;/EM&gt;&lt;EM&gt;&lt;BR /&gt;&lt;/EM&gt;&lt;EM&gt; :0&lt;/EM&gt;&lt;EM&gt;&lt;BR /&gt;&lt;/EM&gt;&lt;EM&gt; acl_roots = foo,root@mgr&lt;/EM&gt;&lt;EM&gt;&lt;BR /&gt;&lt;/EM&gt;&lt;EM&gt; default_queue = workq&lt;/EM&gt;&lt;EM&gt;&lt;BR /&gt;&lt;/EM&gt;&lt;EM&gt; log_events = 511&lt;/EM&gt;&lt;EM&gt;&lt;BR /&gt;&lt;/EM&gt;&lt;EM&gt; mail_from = adm&lt;/EM&gt;&lt;EM&gt;&lt;BR /&gt;&lt;/EM&gt;&lt;EM&gt; query_other_jobs = True&lt;/EM&gt;&lt;EM&gt;&lt;BR /&gt;&lt;/EM&gt;&lt;EM&gt; resources_default.ncpus = 1&lt;/EM&gt;&lt;EM&gt;&lt;BR /&gt;&lt;/EM&gt;&lt;EM&gt; default_chunk.ncpus = 1&lt;/EM&gt;&lt;EM&gt;&lt;BR /&gt;&lt;/EM&gt;&lt;EM&gt; resources_assigned.mem = 0kb&lt;/EM&gt;&lt;EM&gt;&lt;BR /&gt;&lt;/EM&gt;&lt;EM&gt; resources_assigned.ncpus = 1&lt;/EM&gt;&lt;EM&gt;&lt;BR /&gt;&lt;/EM&gt;&lt;EM&gt; resources_assigned.nodect = 1&lt;/EM&gt;&lt;EM&gt;&lt;BR /&gt;&lt;/EM&gt;&lt;EM&gt; scheduler_iteration = 600&lt;/EM&gt;&lt;EM&gt;&lt;BR /&gt;&lt;/EM&gt;&lt;EM&gt; FLicenses = 95&lt;/EM&gt;&lt;EM&gt;&lt;BR 
/&gt;&lt;/EM&gt;&lt;EM&gt; resv_enable = True&lt;/EM&gt;&lt;EM&gt;&lt;BR /&gt;&lt;/EM&gt;&lt;EM&gt; node_fail_requeue = 310&lt;/EM&gt;&lt;EM&gt;&lt;BR /&gt;&lt;/EM&gt;&lt;EM&gt; max_array_size = 10000&lt;/EM&gt;&lt;EM&gt;&lt;BR /&gt;&lt;/EM&gt;&lt;EM&gt; pbs_license_file_location = 7788@mgr&lt;/EM&gt;&lt;EM&gt;&lt;BR /&gt;&lt;/EM&gt;&lt;EM&gt; pbs_license_min = 0&lt;/EM&gt;&lt;EM&gt;&lt;BR /&gt;&lt;/EM&gt;&lt;EM&gt; pbs_license_max = 2147483647&lt;/EM&gt;&lt;EM&gt;&lt;BR /&gt;&lt;/EM&gt;&lt;EM&gt; pbs_license_linger_time = 3600&lt;/EM&gt;&lt;EM&gt;&lt;BR /&gt;&lt;/EM&gt;&lt;EM&gt; license_count = Avail_Global:95 Avail_Local:0 Used:1 High_Use:96&lt;/EM&gt;&lt;EM&gt;&lt;BR /&gt;&lt;/EM&gt;&lt;EM&gt; pbs_version = PBSPro_10.0.0.82981&lt;/EM&gt;&lt;EM&gt;&lt;BR /&gt;&lt;/EM&gt;&lt;EM&gt; eligible_time_enable = False&lt;/EM&gt;&lt;EM&gt;&lt;BR /&gt;&lt;/EM&gt;&lt;EM&gt;&lt;BR /&gt;&lt;/EM&gt;&lt;EM&gt;&lt;STRONG&gt;qwer@mgr:/mnt/share/piex&amp;gt; cpuinfo&lt;/STRONG&gt;&lt;/EM&gt;&lt;EM&gt;&lt;BR /&gt;&lt;/EM&gt;&lt;EM&gt;Architecture  : x86_64&lt;/EM&gt;&lt;EM&gt;&lt;BR /&gt;&lt;/EM&gt;&lt;EM&gt;Hyperthreading: disabled&lt;/EM&gt;&lt;EM&gt;&lt;BR /&gt;&lt;/EM&gt;&lt;EM&gt;Packages   : 4&lt;/EM&gt;&lt;EM&gt;&lt;BR /&gt;&lt;/EM&gt;&lt;EM&gt;Cores      : 16&lt;/EM&gt;&lt;EM&gt;&lt;BR /&gt;&lt;/EM&gt;&lt;EM&gt;Processors : 16&lt;/EM&gt;&lt;EM&gt;&lt;BR /&gt;&lt;/EM&gt;&lt;EM&gt;=====  Processor identification  =====&lt;/EM&gt;&lt;EM&gt;&lt;BR /&gt;&lt;/EM&gt;&lt;EM&gt;Processor       Thread  Core    Package&lt;/EM&gt;&lt;EM&gt;&lt;BR /&gt;&lt;/EM&gt;&lt;EM&gt;0               0       0       0&lt;/EM&gt;&lt;EM&gt;&lt;BR /&gt;&lt;/EM&gt;&lt;EM&gt;1               0       0       2&lt;/EM&gt;&lt;EM&gt;&lt;BR /&gt;&lt;/EM&gt;&lt;EM&gt;2               0       0       4&lt;/EM&gt;&lt;EM&gt;&lt;BR /&gt;&lt;/EM&gt;&lt;EM&gt;3               0       0       6&lt;/EM&gt;&lt;EM&gt;&lt;BR /&gt;&lt;/EM&gt;&lt;EM&gt;4               0       1       0&lt;/EM&gt;&lt;EM&gt;&lt;BR 
/&gt;&lt;/EM&gt;&lt;EM&gt;5               0       1       2&lt;/EM&gt;&lt;EM&gt;&lt;BR /&gt;&lt;/EM&gt;&lt;EM&gt;6               0       1       4&lt;/EM&gt;&lt;EM&gt;&lt;BR /&gt;&lt;/EM&gt;&lt;EM&gt;7               0       1       6&lt;/EM&gt;&lt;EM&gt;&lt;BR /&gt;&lt;/EM&gt;&lt;EM&gt;8               0       2       0&lt;/EM&gt;&lt;EM&gt;&lt;BR /&gt;&lt;/EM&gt;&lt;EM&gt;9               0       2       2&lt;/EM&gt;&lt;EM&gt;&lt;BR /&gt;&lt;/EM&gt;&lt;EM&gt;10              0       2       4&lt;/EM&gt;&lt;EM&gt;&lt;BR /&gt;&lt;/EM&gt;&lt;EM&gt;11              0       2       6&lt;/EM&gt;&lt;EM&gt;&lt;BR /&gt;&lt;/EM&gt;&lt;EM&gt;12              0       3       0&lt;/EM&gt;&lt;EM&gt;&lt;BR /&gt;&lt;/EM&gt;&lt;EM&gt;13              0       3       2&lt;/EM&gt;&lt;EM&gt;&lt;BR /&gt;&lt;/EM&gt;&lt;EM&gt;14              0       3       4&lt;/EM&gt;&lt;EM&gt;&lt;BR /&gt;&lt;/EM&gt;&lt;EM&gt;15              0       3       6&lt;/EM&gt;&lt;EM&gt;&lt;BR /&gt;&lt;/EM&gt;&lt;EM&gt;=====  Processor placement  =====&lt;/EM&gt;&lt;EM&gt;&lt;BR /&gt;&lt;/EM&gt;&lt;EM&gt;Package Cores           Processors&lt;/EM&gt;&lt;EM&gt;&lt;BR /&gt;&lt;/EM&gt;&lt;EM&gt;0       0,1,2,3         0,4,8,12&lt;/EM&gt;&lt;EM&gt;&lt;BR /&gt;&lt;/EM&gt;&lt;EM&gt;2       0,1,2,3         1,5,9,13&lt;/EM&gt;&lt;EM&gt;&lt;BR /&gt;&lt;/EM&gt;&lt;EM&gt;4       0,1,2,3         2,6,10,14&lt;/EM&gt;&lt;EM&gt;&lt;BR /&gt;&lt;/EM&gt;&lt;EM&gt;6       0,1,2,3         3,7,11,15&lt;/EM&gt;&lt;EM&gt;&lt;BR /&gt;&lt;/EM&gt;&lt;EM&gt;=====  Cache sharing  =====&lt;/EM&gt;&lt;EM&gt;&lt;BR /&gt;&lt;/EM&gt;&lt;EM&gt;Cache   Size            Processors&lt;/EM&gt;&lt;EM&gt;&lt;BR /&gt;&lt;/EM&gt;&lt;EM&gt;L1      32  KB          no sharing&lt;/EM&gt;&lt;EM&gt;&lt;BR /&gt;&lt;/EM&gt;&lt;EM&gt;L2      4   MB          (0,4)(1,5)(2,6)(3,7)(8,12)(9,13)(10,14)(11,15)&lt;/EM&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 27 Oct 2009 09:04:28 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/mpd-error/m-p/901739#M2218</guid>
      <dc:creator>altlogic09</dc:creator>
      <dc:date>2009-10-27T09:04:28Z</dc:date>
    </item>
    <item>
      <title>Re: mpd error</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/mpd-error/m-p/901740#M2219</link>
      <description>&lt;DIV style="margin:0px;"&gt;
&lt;DIV id="quote_reply" style="width: 100%; margin-top: 5px;"&gt;
&lt;DIV style="margin-left:2px;margin-right:2px;"&gt;Available resources:&lt;/DIV&gt;
&lt;DIV style="margin-left:2px;margin-right:2px;"&gt;&lt;/DIV&gt;
&lt;DIV style="margin-left:2px;margin-right:2px;"&gt;&lt;EM&gt;&lt;STRONG&gt;qwer@mgr:/mnt/share/piex&amp;gt; pbsnodes -a&lt;/STRONG&gt;&lt;BR /&gt;mgr&lt;BR /&gt; Mom = extmgr.hp&lt;BR /&gt; ntype = PBS&lt;BR /&gt; state = free&lt;BR /&gt; pcpus = 16&lt;BR /&gt; Priority = 0&lt;BR /&gt; resources_available.arch = linux&lt;BR /&gt; resources_available.host = extmgr&lt;BR /&gt; resources_available.mem = 32960976kb&lt;BR /&gt; resources_available.ncpus = 16&lt;BR /&gt; resources_available.vnode = mgr&lt;BR /&gt; resources_assigned.mem = 0kb&lt;BR /&gt; resources_assigned.ncpus = 0&lt;BR /&gt; resources_assigned.vmem = 0kb&lt;BR /&gt; resv_enable = True&lt;BR /&gt; sharing = default_shared&lt;BR /&gt;&lt;BR /&gt;cn01&lt;BR /&gt; Mom = cn01.hp&lt;BR /&gt; ntype = PBS&lt;BR /&gt; state = free&lt;BR /&gt; pcpus = 16&lt;BR /&gt; resources_available.arch = linux&lt;BR /&gt; resources_available.host = cn01&lt;BR /&gt; resources_available.mem = 32960896kb&lt;BR /&gt; resources_available.ncpus = 16&lt;BR /&gt; resources_available.vnode = cn01&lt;BR /&gt; resources_assigned.mem = 0kb&lt;BR /&gt; resources_assigned.ncpus = 0&lt;BR /&gt; resources_assigned.vmem = 0kb&lt;BR /&gt; resv_enable = True&lt;BR /&gt; sharing = default_shared&lt;BR /&gt;&lt;BR /&gt;cn02&lt;BR /&gt; Mom = cn02.hp&lt;BR /&gt; ntype = PBS&lt;BR /&gt; state = free&lt;BR /&gt; pcpus = 16&lt;BR /&gt; resources_available.arch = linux&lt;BR /&gt; resources_available.host = cn02&lt;BR /&gt; resources_available.mem = 32960896kb&lt;BR /&gt; resources_available.ncpus = 16&lt;BR /&gt; resources_available.vnode = cn02&lt;BR /&gt; resources_assigned.mem = 0kb&lt;BR /&gt; resources_assigned.ncpus = 0&lt;BR /&gt; resources_assigned.vmem = 0kb&lt;BR /&gt; resv_enable = True&lt;BR /&gt; sharing = default_shared&lt;BR /&gt;&lt;BR /&gt;cn03&lt;BR /&gt; Mom = cn03.hp&lt;BR /&gt; ntype = PBS&lt;BR /&gt; state = free&lt;BR /&gt; pcpus = 16&lt;BR /&gt; resources_available.arch = linux&lt;BR /&gt; resources_available.host = 
cn03&lt;BR /&gt; resources_available.mem = 32960896kb&lt;BR /&gt; resources_available.ncpus = 16&lt;BR /&gt; resources_available.vnode = cn03&lt;BR /&gt; resources_assigned.mem = 0kb&lt;BR /&gt; resources_assigned.ncpus = 0&lt;BR /&gt; resources_assigned.vmem = 0kb&lt;BR /&gt; resv_enable = True&lt;BR /&gt; sharing = default_shared&lt;BR /&gt;&lt;BR /&gt;cn04&lt;BR /&gt; Mom = cn04.hp&lt;BR /&gt; ntype = PBS&lt;BR /&gt; state = free&lt;BR /&gt; pcpus = 16&lt;BR /&gt; resources_available.arch = linux&lt;BR /&gt; resources_available.host = cn04&lt;BR /&gt; resources_available.mem = 32960896kb&lt;BR /&gt; resources_available.ncpus = 16&lt;BR /&gt; resources_available.vnode = cn04&lt;BR /&gt; resources_assigned.mem = 0kb&lt;BR /&gt; resources_assigned.ncpus = 0&lt;BR /&gt; resources_assigned.vmem = 0kb&lt;BR /&gt; resv_enable = True&lt;BR /&gt; sharing = default_shared&lt;BR /&gt;&lt;BR /&gt;cn05&lt;BR /&gt; Mom = cn05.hp&lt;BR /&gt; ntype = PBS&lt;BR /&gt; state = free&lt;BR /&gt; pcpus = 16&lt;BR /&gt; resources_available.arch = linux&lt;BR /&gt; resources_available.host = cn05&lt;BR /&gt; resources_available.mem = 32960896kb&lt;BR /&gt; resources_available.ncpus = 16&lt;BR /&gt; resources_available.vnode = cn05&lt;BR /&gt; resources_assigned.mem = 0kb&lt;BR /&gt; resources_assigned.ncpus = 0&lt;BR /&gt; resources_assigned.vmem = 0kb&lt;BR /&gt; resv_enable = True&lt;BR /&gt; sharing = default_shared&lt;/EM&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;</description>
      <pubDate>Tue, 27 Oct 2009 09:44:05 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/mpd-error/m-p/901740#M2219</guid>
      <dc:creator>altlogic09</dc:creator>
      <dc:date>2009-10-27T09:44:05Z</dc:date>
    </item>
    <item>
      <title>Re: mpd error</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/mpd-error/m-p/901741#M2220</link>
      <description>&lt;DIV style="margin:0px;"&gt;
&lt;DIV id="quote_reply" style="width: 100%; margin-top: 5px;"&gt;
&lt;DIV style="margin-left:2px;margin-right:2px;"&gt;Quoting - &lt;A href="https://community.intel.com/en-us/profile/449160"&gt;altlogic09&lt;/A&gt;&lt;/DIV&gt;
&lt;DIV style="background-color:#E5E5E5; padding:5px;border: 1px; border-style: inset;margin-left:2px;margin-right:2px;"&gt;&lt;BR /&gt;&lt;EM&gt;&lt;BR /&gt;
&lt;P&gt;&lt;EM&gt;7)What happend?&lt;BR /&gt;qwer@mgr:/mnt/share/piex&amp;gt; cat piex.o673&lt;BR /&gt; Start  date:  27 13:55:47 VLAT 2009&lt;BR /&gt;mpiexec_mgr: cannot connect to local mpd (/tmp/pbs.673.mgr/mpd2.console_mgr_qwer); possible causes:&lt;BR /&gt; 1. no mpd is running on this host&lt;BR /&gt; 2. an mpd is running but was started without a "console" (-n option)&lt;BR /&gt; End  date:  27 13:55:47 VLAT 2009&lt;/EM&gt;&lt;BR /&gt;&lt;/P&gt;
&lt;/EM&gt;&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
In the PBS script, run mpdboot (preferably with -r ssh), using the PBS_NODEFILE node list, so your job has its own mpd on the assigned group of nodes. At the end of your script, run mpdallexit. If you prefer, use mpirun so as to combine mpdboot, mpiexec, and mpdallexit.&lt;BR /&gt;It used to be OK to expect the PBS script to inherit the mpivars path settings from the session where you submit the job. Lately, it's necessary to set up the entire environment in the script.&lt;BR /&gt;&lt;BR /&gt;</description>
      <pubDate>Tue, 27 Oct 2009 13:43:36 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/mpd-error/m-p/901741#M2220</guid>
      <dc:creator>TimP</dc:creator>
      <dc:date>2009-10-27T13:43:36Z</dc:date>
    </item>
    <item>
      <title>Re: mpd error</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/mpd-error/m-p/901742#M2221</link>
      <description>&lt;DIV style="margin:0px;"&gt;
&lt;DIV id="quote_reply" style="width: 100%; margin-top: 5px;"&gt;
&lt;DIV style="margin-left:2px;margin-right:2px;"&gt;What does it mean?&lt;/DIV&gt;
&lt;DIV style="background-color:#E5E5E5; padding:5px;border: 1px; border-style: inset;margin-left:2px;margin-right:2px;"&gt;&lt;EM&gt; Lately, it's necessary to set up the entire environment in the script.&lt;BR /&gt;&lt;/EM&gt;&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;P&gt;Which variables should I set?&lt;/P&gt;
&lt;P&gt;&lt;/P&gt;
&lt;P&gt;My new batch file for the job:&lt;/P&gt;
&lt;P&gt;&lt;BR /&gt;&lt;EM&gt;#!/bin/bash&lt;/EM&gt;&lt;EM&gt;&lt;BR /&gt;&lt;/EM&gt;&lt;EM&gt;&lt;BR /&gt;&lt;/EM&gt;&lt;EM&gt;#PBS -q long&lt;/EM&gt;&lt;EM&gt;&lt;BR /&gt;&lt;/EM&gt;&lt;EM&gt;#PBS -l nodes=6&lt;/EM&gt;&lt;EM&gt;&lt;BR /&gt;&lt;/EM&gt;&lt;EM&gt;#PBS -l ncpus=90&lt;/EM&gt;&lt;EM&gt;&lt;BR /&gt;&lt;/EM&gt;&lt;EM&gt;#PBS -l mem=2GB&lt;/EM&gt;&lt;EM&gt;&lt;BR /&gt;&lt;/EM&gt;&lt;EM&gt;#PBS -l walltime=240:00:00&lt;/EM&gt;&lt;EM&gt;&lt;BR /&gt;&lt;/EM&gt;&lt;EM&gt;#PBS -S /bin/bash&lt;/EM&gt;&lt;EM&gt;&lt;BR /&gt;&lt;/EM&gt;&lt;EM&gt;#PBS -N v3&lt;/EM&gt;&lt;EM&gt;&lt;BR /&gt;&lt;/EM&gt;&lt;EM&gt;&lt;BR /&gt;&lt;/EM&gt;&lt;EM&gt;echo " Start  date:`/bin/date`"&lt;/EM&gt;&lt;EM&gt;&lt;BR /&gt;&lt;/EM&gt;&lt;/P&gt;
&lt;P&gt;&lt;EM&gt;cd /mnt/share/testfort/v3_cp&lt;/EM&gt;&lt;/P&gt;
&lt;P&gt;&lt;EM&gt;#mpdboot -n 6 -f mpd.hosts -r ssh&lt;/EM&gt;&lt;EM&gt;&lt;BR /&gt;&lt;/EM&gt;&lt;EM&gt;#mpiexec -n 90 ./vl_2&lt;/EM&gt;&lt;EM&gt;&lt;BR /&gt;&lt;/EM&gt;&lt;EM&gt;#mpdallexit&lt;/EM&gt;&lt;/P&gt;
&lt;P&gt;&lt;EM&gt;mpirun -r ssh -n 90 -f mpd.hosts ./vl_2&lt;/EM&gt;&lt;EM&gt;&lt;BR /&gt;&lt;/EM&gt;&lt;EM&gt;&lt;BR /&gt;&lt;/EM&gt;&lt;EM&gt;echo " End  date:`/bin/date`"&lt;/EM&gt;&lt;/P&gt;
&lt;P&gt;&lt;/P&gt;
&lt;P&gt;I get this error message if I launch the task with mpdboot-mpiexec-mpdallexit:&lt;/P&gt;
&lt;P&gt;&lt;BR /&gt;&lt;EM&gt;mpdboot_mgr (handle_mpd_output 828): Failed to establish a socket connection with ib-cn01:58575 : (111, 'Connection refused')&lt;BR /&gt;mpdboot_mgr (handle_mpd_output 845): failed to connect to mpd on ib-cn01&lt;BR /&gt;mpiexec_mgr: cannot connect to local mpd (/tmp/pbs.741.mgr/mpd2.console_mgr_zaytsev); possible causes:&lt;BR /&gt; 1. no mpd is running on this host&lt;BR /&gt; 2. an mpd is running but was started without a "console" (-n option)&lt;BR /&gt;mpdallexit: cannot connect to local mpd (/tmp/pbs.741.mgr/mpd2.console_mgr_zaytsev); possible causes:&lt;BR /&gt; 1. no mpd is running on this host&lt;BR /&gt; 2. an mpd is running but was started without a "console" (-n option)&lt;BR /&gt;&lt;/EM&gt;&lt;BR /&gt;&lt;BR /&gt;I get this error message if I launch the task with mpirun:&lt;/P&gt;
&lt;P&gt;&lt;EM&gt;mpdboot_mgr (handle_mpd_output 837): failed to ping mpd on cn01; received output={}&lt;/EM&gt;&lt;BR /&gt;&lt;BR /&gt;If I launch the task outside PBS, everything works (with both mpirun and mpdboot-mpiexec-mpdallexit).&lt;/P&gt;
&lt;P&gt;&lt;/P&gt;
&lt;P&gt;What do the MPI error codes 827, 828, 845 mean???&lt;/P&gt;</description>
      <pubDate>Wed, 28 Oct 2009 08:53:52 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/mpd-error/m-p/901742#M2221</guid>
      <dc:creator>altlogic09</dc:creator>
      <dc:date>2009-10-28T08:53:52Z</dc:date>
    </item>
    <item>
      <title>Re: mpd error</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/mpd-error/m-p/901743#M2222</link>
      <description>&lt;DIV style="margin:0px;"&gt;
&lt;DIV id="quote_reply" style="width: 100%; margin-top: 5px;"&gt;
&lt;DIV style="margin-left:2px;margin-right:2px;"&gt;Quoting - &lt;A href="https://community.intel.com/en-us/profile/449160"&gt;altlogic09&lt;/A&gt;&lt;/DIV&gt;
&lt;DIV style="background-color:#E5E5E5; padding:5px;border: 1px; border-style: inset;margin-left:2px;margin-right:2px;"&gt;&lt;BR /&gt;&lt;EM&gt;
&lt;DIV style="margin:0px;"&gt;&lt;/DIV&gt;
&lt;BR /&gt;
&lt;P&gt;&lt;EM&gt;#!/bin/bash&lt;/EM&gt;&lt;EM&gt;&lt;BR /&gt;&lt;/EM&gt;&lt;EM&gt;&lt;BR /&gt;&lt;/EM&gt;&lt;EM&gt;#PBS -q long&lt;/EM&gt;&lt;EM&gt;&lt;BR /&gt;&lt;/EM&gt;&lt;EM&gt;#PBS -l nodes=6&lt;/EM&gt;&lt;EM&gt;&lt;BR /&gt;&lt;/EM&gt;&lt;EM&gt;#PBS -l ncpus=90&lt;/EM&gt;&lt;EM&gt;&lt;BR /&gt;&lt;/EM&gt;&lt;EM&gt;#PBS -l mem=2GB&lt;/EM&gt;&lt;EM&gt;&lt;BR /&gt;&lt;/EM&gt;&lt;EM&gt;#PBS -l walltime=240:00:00&lt;/EM&gt;&lt;EM&gt;&lt;BR /&gt;&lt;/EM&gt;&lt;EM&gt;#PBS -S /bin/bash&lt;/EM&gt;&lt;EM&gt;&lt;BR /&gt;&lt;/EM&gt;&lt;EM&gt;#PBS -N v3&lt;/EM&gt;&lt;EM&gt;&lt;BR /&gt;&lt;/EM&gt;&lt;EM&gt;&lt;BR /&gt;&lt;/EM&gt;&lt;EM&gt;echo " Start  date:`/bin/date`"&lt;/EM&gt;&lt;EM&gt;&lt;BR /&gt;&lt;/EM&gt;&lt;/P&gt;
&lt;P&gt;&lt;EM&gt;cd /mnt/share/testfort/v3_cp&lt;/EM&gt;&lt;/P&gt;
&lt;P&gt;&lt;EM&gt;mpirun -r ssh -n 90 -f mpd.hosts ./vl_2&lt;/EM&gt;&lt;EM&gt;&lt;BR /&gt;&lt;/EM&gt;&lt;/P&gt;
&lt;/EM&gt;&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
Where did you get mpd.hosts? In the PBS installations I've seen, the assigned node list appears in $(PBS_NODEFILE) or some such. If you want to use mpd.hosts, you would have to replace its contents by copying the list passed to your job by PBS. Is ib-cn01 one of the nodes allocated to your job by PBS? How do you know? &lt;BR /&gt;</description>
      <pubDate>Wed, 28 Oct 2009 13:24:13 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/mpd-error/m-p/901743#M2222</guid>
      <dc:creator>TimP</dc:creator>
      <dc:date>2009-10-28T13:24:13Z</dc:date>
    </item>
    <item>
      <title>Re: mpd error</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/mpd-error/m-p/901744#M2223</link>
      <description>&lt;P&gt;Hi altlogic09,&lt;/P&gt;
&lt;P&gt;Just as a quick clarification on Tim's comments above: the Intel MPI Library is integrated into PBS Pro enough, so you don't have to specify a hosts file when running under the scheduler. I recommend you change your batch file to the following:&lt;/P&gt;
&lt;BLOCKQUOTE&gt;#!/bin/bash&lt;BR /&gt;#PBS -q long&lt;BR /&gt;#PBS -l nodes=6&lt;BR /&gt;#PBS -l ncpus=90&lt;BR /&gt;#PBS -l mem=2GB&lt;BR /&gt;#PBS -l walltime=240:00:00&lt;BR /&gt;#PBS -S /bin/bash&lt;BR /&gt;#PBS -N v3&lt;BR /&gt;echo " Start date:`/bin/date`"&lt;BR /&gt;cd /mnt/share/testfort/v3_cp&lt;BR /&gt;&lt;STRONG&gt;mpirun -r ssh -n 90 ./vl_2&lt;/STRONG&gt;&lt;BR /&gt;echo " End date:`/bin/date`"&lt;/BLOCKQUOTE&gt;
&lt;P&gt;Note how you don't have to specify the -f option. That's because the Intel MPI Library grabs the list of hosts from PBS directly. Of course, make sure you run &lt;CODE&gt;mpdallexit&lt;/CODE&gt; to clean up any existing MPDs on the cluster before you submit your new job.&lt;/P&gt;
&lt;P&gt;You can certainly also use the &lt;CODE&gt;mpdboot-mpiexec-mpdallexit&lt;/CODE&gt; schema under PBS, but that would involve you making sure you're picking up the correct hosts file. Here's a sample based on your batch script:&lt;/P&gt;
&lt;BLOCKQUOTE&gt;#!/bin/bash&lt;BR /&gt;#PBS -q long&lt;BR /&gt;#PBS -l nodes=6&lt;BR /&gt;#PBS -l ncpus=90&lt;BR /&gt;#PBS -l mem=2GB&lt;BR /&gt;#PBS -l walltime=240:00:00&lt;BR /&gt;#PBS -S /bin/bash&lt;BR /&gt;#PBS -N v3&lt;BR /&gt;echo " Start date:`/bin/date`"&lt;BR /&gt;cd /mnt/share/testfort/v3_cp&lt;BR /&gt;&lt;STRONG&gt;NHOSTS=`cat $PBS_NODEFILE|wc -l`&lt;BR /&gt;mpdboot -n $NHOSTS -f $PBS_NODEFILE -r ssh&lt;BR /&gt;mpiexec -n 90 ./vl_2&lt;BR /&gt;mpdallexit&lt;BR /&gt;&lt;/STRONG&gt;echo " End date:`/bin/date`"&lt;/BLOCKQUOTE&gt;
&lt;P&gt;As you can see, using &lt;CODE&gt;mpirun&lt;/CODE&gt; is easier. I hope this helps. Let us know how it goes.&lt;/P&gt;
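&lt;P&gt;As a hedged aside on the second script: wc -l on $PBS_NODEFILE counts lines, and on some PBS configurations the node file may list the same host more than once. Assuming mpdboot -n is meant to receive the number of distinct hosts, the count could be derived as in the sketch below. The function name count_unique_hosts is hypothetical, and the sketch assumes one host name per line with no blank lines.&lt;/P&gt;
&lt;PRE&gt;
```shell
#!/bin/bash
# Hypothetical helper: print the number of distinct host names in a node file.
# Assumes one host name per line, possibly repeated, and no blank lines.
count_unique_hosts() {
    sort -u "$1" | wc -l | tr -d ' '
}

# Inside a PBS job this would be called with the scheduler's node file:
# NHOSTS=$(count_unique_hosts "$PBS_NODEFILE")
# mpdboot -n "$NHOSTS" -f "$PBS_NODEFILE" -r ssh
```
&lt;/PRE&gt;
&lt;P&gt;If every host already appears exactly once in the node file, this gives the same number as the wc -l version.&lt;/P&gt;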
&lt;P&gt;Regards,&lt;BR /&gt;~Gergana&lt;/P&gt;</description>
      <pubDate>Wed, 28 Oct 2009 21:06:25 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/mpd-error/m-p/901744#M2223</guid>
      <dc:creator>Gergana_S_Intel</dc:creator>
      <dc:date>2009-10-28T21:06:25Z</dc:date>
    </item>
    <item>
      <title>Re: mpd error</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/mpd-error/m-p/901745#M2224</link>
      <description>&lt;DIV style="margin:0px;"&gt;
&lt;DIV id="quote_reply" style="width: 100%; margin-top: 5px;"&gt;
&lt;DIV style="margin-left:2px;margin-right:2px;"&gt;&lt;A href="https://community.intel.com/en-us/profile/198675"&gt;Now I get an error when I start mpd without PBS as another user&lt;/A&gt;&lt;/DIV&gt;
&lt;DIV style="margin-left:2px;margin-right:2px;"&gt;&lt;BR /&gt;&lt;EM&gt;zaytsev@mgr:/mnt/share/testfort/v3_cp&amp;gt; mpdboot -f mpd.hosts -n 6 -r ssh&lt;BR /&gt;mpdboot_mgr (handle_mpd_output 837): failed to ping mpd on cn01; received output={}&lt;/EM&gt;&lt;BR /&gt;&lt;/DIV&gt;
&lt;DIV style="margin-left:2px;margin-right:2px;"&gt;What is the error?&lt;BR /&gt;&lt;BR /&gt;&lt;/DIV&gt;
&lt;DIV style="margin-left:2px;margin-right:2px;"&gt;I can access every node without a password. There are no identity.pub and identity files in the ~/.ssh directory. Is that OK?&lt;/DIV&gt;
&lt;DIV style="margin-left:2px;margin-right:2px;"&gt;&lt;BR /&gt;&lt;EM&gt;zaytsev@mgr:~/.ssh&amp;gt; l&lt;BR /&gt; 53316&lt;BR /&gt;drwx------  2 zaytsev toguusers     4096 2009-11-03 18:04 ./&lt;BR /&gt;drwx------ 27 zaytsev users         4096 2009-10-28 10:58 ../&lt;BR /&gt;-rw-r--r--  1 zaytsev users          393 2009-07-17 15:55 authorized_keys2&lt;BR /&gt;-rw-r--r--  1 zaytsev users            0 2009-11-03 18:04 cat&lt;BR /&gt;-rw-------  1 zaytsev users         1675 2009-07-17 15:55 id_rsa&lt;BR /&gt;-rw-r--r--  1 zaytsev users          393 2009-07-17 15:55 id_rsa.pub&lt;BR /&gt;-rw-r--r--  1 zaytsev users         2930 2009-11-03 17:30 known_hosts&lt;BR /&gt;-rw-r--r--  1 zaytsev users     54511357 2009-10-22 22:49 VNI.IMSL.Fortran.Numerical.Library.v6.0.for.Sun.Studio.12.LINUX.EM64T-TBE.rar&lt;/EM&gt;&lt;BR /&gt;&lt;/DIV&gt;
&lt;DIV style="margin-left:2px;margin-right:2px;"&gt;&lt;EM&gt;zaytsev@mgr:~/.ssh&amp;gt; cat known_hosts&lt;BR /&gt;cn01,10.0.0.11 ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAIEAngZyLEl/RS+Rxo5tmGxT/bX13OjQlRWGOmzgMI0dOvANRxC8OwknURkm50yDU/cOkJf8JZc1g0AJCNUZs4dvXZWcmJlOzJO+j7VRv7Ei/R2XHur6pmyeCQcl0dgb4piL2HAd/cH8t9A4bP1RWzlfwyNIHd2/f68SqmeHHmdzelU=&lt;BR /&gt;cn02,10.0.0.12 ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAIEAngZyLEl/RS+Rxo5tmGxT/bX13OjQlRWGOmzgMI0dOvANRxC8OwknURkm50yDU/cOkJf8JZc1g0AJCNUZs4dvXZWcmJlOzJO+j7VRv7Ei/R2XHur6pmyeCQcl0dgb4piL2HAd/cH8t9A4bP1RWzlfwyNIHd2/f68SqmeHHmdzelU=&lt;BR /&gt;cn03,10.0.0.13 ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAIEAngZyLEl/RS+Rxo5tmGxT/bX13OjQlRWGOmzgMI0dOvANRxC8OwknURkm50yDU/cOkJf8JZc1g0AJCNUZs4dvXZWcmJlOzJO+j7VRv7Ei/R2XHur6pmyeCQcl0dgb4piL2HAd/cH8t9A4bP1RWzlfwyNIHd2/f68SqmeHHmdzelU=&lt;BR /&gt;cn04,10.0.0.14 ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAIEAngZyLEl/RS+Rxo5tmGxT/bX13OjQlRWGOmzgMI0dOvANRxC8OwknURkm50yDU/cOkJf8JZc1g0AJCNUZs4dvXZWcmJlOzJO+j7VRv7Ei/R2XHur6pmyeCQcl0dgb4piL2HAd/cH8t9A4bP1RWzlfwyNIHd2/f68SqmeHHmdzelU=&lt;BR /&gt;cn05,10.0.0.15 ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAIEAngZyLEl/RS+Rxo5tmGxT/bX13OjQlRWGOmzgMI0dOvANRxC8OwknURkm50yDU/cOkJf8JZc1g0AJCNUZs4dvXZWcmJlOzJO+j7VRv7Ei/R2XHur6pmyeCQcl0dgb4piL2HAd/cH8t9A4bP1RWzlfwyNIHd2/f68SqmeHHmdzelU=&lt;BR /&gt;ib-cn01,10.10.0.11 ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAIEAngZyLEl/RS+Rxo5tmGxT/bX13OjQlRWGOmzgMI0dOvANRxC8OwknURkm50yDU/cOkJf8JZc1g0AJCNUZs4dvXZWcmJlOzJO+j7VRv7Ei/R2XHur6pmyeCQcl0dgb4piL2HAd/cH8t9A4bP1RWzlfwyNIHd2/f68SqmeHHmdzelU=&lt;BR /&gt;ib-cn02,10.10.0.12 ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAIEAngZyLEl/RS+Rxo5tmGxT/bX13OjQlRWGOmzgMI0dOvANRxC8OwknURkm50yDU/cOkJf8JZc1g0AJCNUZs4dvXZWcmJlOzJO+j7VRv7Ei/R2XHur6pmyeCQcl0dgb4piL2HAd/cH8t9A4bP1RWzlfwyNIHd2/f68SqmeHHmdzelU=&lt;BR /&gt;ib-cn03,10.10.0.13 ssh-rsa 
AAAAB3NzaC1yc2EAAAABIwAAAIEAngZyLEl/RS+Rxo5tmGxT/bX13OjQlRWGOmzgMI0dOvANRxC8OwknURkm50yDU/cOkJf8JZc1g0AJCNUZs4dvXZWcmJlOzJO+j7VRv7Ei/R2XHur6pmyeCQcl0dgb4piL2HAd/cH8t9A4bP1RWzlfwyNIHd2/f68SqmeHHmdzelU=&lt;BR /&gt;ib-cn04,10.10.0.14 ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAIEAngZyLEl/RS+Rxo5tmGxT/bX13OjQlRWGOmzgMI0dOvANRxC8OwknURkm50yDU/cOkJf8JZc1g0AJCNUZs4dvXZWcmJlOzJO+j7VRv7Ei/R2XHur6pmyeCQcl0dgb4piL2HAd/cH8t9A4bP1RWzlfwyNIHd2/f68SqmeHHmdzelU=&lt;BR /&gt;ib-cn05,10.10.0.15 ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAIEAngZyLEl/RS+Rxo5tmGxT/bX13OjQlRWGOmzgMI0dOvANRxC8OwknURkm50yDU/cOkJf8JZc1g0AJCNUZs4dvXZWcmJlOzJO+j7VRv7Ei/R2XHur6pmyeCQcl0dgb4piL2HAd/cH8t9A4bP1RWzlfwyNIHd2/f68SqmeHHmdzelU=&lt;BR /&gt;ib-mgr,10.10.0.1 ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAIEAulRy7M+gVL2+mvg7+QGzhEbW8Hk2H7AxtqEjmZ6iZkaxwdbVMEfxpsgsrJ9EcWQWiGJ4K3qfKz+9dpfq0AskZNOnI0cZdeolpSObgLiQva6g/69dYrzx1WLlf98bU1YMuZ5Cll2PTcHHpoTCC30hkDVeRcifKzR9FRSIr9MtF+s=&lt;BR /&gt;mgr,10.0.0.1 ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAIEAulRy7M+gVL2+mvg7+QGzhEbW8Hk2H7AxtqEjmZ6iZkaxwdbVMEfxpsgsrJ9EcWQWiGJ4K3qfKz+9dpfq0AskZNOnI0cZdeolpSObgLiQva6g/69dYrzx1WLlf98bU1YMuZ5Cll2PTcHHpoTCC30hkDVeRcifKzR9FRSIr9MtF+s=&lt;BR /&gt;10.10.190.10 ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAIEAulRy7M+gVL2+mvg7+QGzhEbW8Hk2H7AxtqEjmZ6iZkaxwdbVMEfxpsgsrJ9EcWQWiGJ4K3qfKz+9dpfq0AskZNOnI0cZdeolpSObgLiQva6g/69dYrzx1WLlf98bU1YMuZ5Cll2PTcHHpoTCC30hkDVeRcifKzR9FRSIr9MtF+s=&lt;/EM&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;</description>
      <pubDate>Tue, 03 Nov 2009 08:09:37 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/mpd-error/m-p/901745#M2224</guid>
      <dc:creator>altlogic09</dc:creator>
      <dc:date>2009-11-03T08:09:37Z</dc:date>
    </item>
    <item>
      <title>Re: mpd error</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/mpd-error/m-p/901746#M2225</link>
      <description>&lt;P&gt;Hi altlogic09,&lt;/P&gt;
&lt;P&gt;Well, since you have an account where this works, and an account where this doesn't, I would say compare the environments of the two and see how they differ.&lt;/P&gt;
&lt;P&gt;For example, on our local clusters, my account has the &lt;EM&gt;authorized_keys&lt;/EM&gt; file under the &lt;CODE&gt;.ssh&lt;/CODE&gt; directory, not &lt;EM&gt;authorized_keys2&lt;/EM&gt;. I'm not sure if the ssh settings require a specific name. Your &lt;EM&gt;known_hosts&lt;/EM&gt; file looks good enough, assuming no corruption in the encryption lines.&lt;/P&gt;
&lt;P&gt;Also, the Intel MPI Library creates some logfiles for the user in the &lt;CODE&gt;/tmp&lt;/CODE&gt; directory on the &lt;STRONG&gt;mgr&lt;/STRONG&gt; and &lt;STRONG&gt;cn01&lt;/STRONG&gt; nodes. Those would be good to look at.&lt;/P&gt;
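&lt;P&gt;To compare the ssh setup of the two accounts as suggested above, one option is to test non-interactive logins to every host in the hosts file. This is only a sketch: the helper names list_hosts and check_ssh are hypothetical, and it assumes the host:ncpus line format shown in the mpd.hosts listing earlier in the thread.&lt;/P&gt;
&lt;PRE&gt;
```shell
#!/bin/bash
# Hypothetical helper: strip the optional ":ncpus" suffix from each line
# of an mpd.hosts-style file and print the bare host names.
list_hosts() {
    cut -d: -f1 "$1"
}

# Hypothetical helper: try a non-interactive login on every host.
# BatchMode=yes makes ssh fail instead of prompting for a password;
# -n keeps ssh from consuming the host list on stdin.
check_ssh() {
    list_hosts "$1" | while read -r host; do
        if ssh -n -o BatchMode=yes -o ConnectTimeout=5 "$host" true; then
            echo "$host: ok"
        else
            echo "$host: FAILED"
        fi
    done
}
```
&lt;/PRE&gt;
&lt;P&gt;Running check_ssh mpd.hosts under each account should show whether any host refuses a key-based login for one account but not the other.&lt;/P&gt;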
&lt;P&gt;Regards,&lt;BR /&gt;~Gergana&lt;/P&gt;</description>
      <pubDate>Tue, 03 Nov 2009 21:07:03 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/mpd-error/m-p/901746#M2225</guid>
      <dc:creator>Gergana_S_Intel</dc:creator>
      <dc:date>2009-11-03T21:07:03Z</dc:date>
    </item>
    <item>
      <title>Re: mpd error</title>
      <link>https://community.intel.com/t5/Intel-MPI-Library/mpd-error/m-p/901747#M2226</link>
      <description>&lt;DIV style="margin:0px;"&gt;
&lt;DIV id="quote_reply" style="width: 100%; margin-top: 5px;"&gt;
&lt;DIV style="margin-left:2px;margin-right:2px;"&gt;I was mistaken. I no longer have an account where "mpdboot" works correctly! I can launch a task only with the mpirun command (without PBS). I get this error:&lt;BR /&gt;&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;P&gt;qwer@mgr:/mnt/share/piex&amp;gt; mpdboot -r ssh -n 6 -f mpd.hosts&lt;BR /&gt;mpdboot_mgr (handle_mpd_output 828): Failed to establish a socket connection with ib-cn01:42335 : (111, 'Connection refused')&lt;BR /&gt;mpdboot_mgr (handle_mpd_output 845): failed to connect to mpd on ib-cn01&lt;/P&gt;
&lt;P&gt;&lt;/P&gt;
&lt;P&gt;!!!&lt;/P&gt;
&lt;P&gt;Killing all processes for user qwer on all nodes helped mpdboot start manually. Now mpdboot boots correctly! And it boots correctly under PBS too. WHY???&lt;/P&gt;</description>
      <pubDate>Sat, 14 Nov 2009 06:48:01 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-MPI-Library/mpd-error/m-p/901747#M2226</guid>
      <dc:creator>altlogic09</dc:creator>
      <dc:date>2009-11-14T06:48:01Z</dc:date>
    </item>
  </channel>
</rss>

