==================
mpiexec_cluster-master (mpiexec 841): no msg recvd from mpd when expecting ack of request. Please examine the /tmp/mpd2.logfile_user log file on each node of the ring.
mpdallexit: cannot connect to local mpd (/tmp/mpd2.console_user_090519.111321_4345); possible causes:
1. no mpd is running on this host
2. an mpd is running but was started without a "console" (-n option)
==================
The logfile follows:
====================================================
======== mpd2.logfile_cluster-master_user_090519.110901_4130 =======
====================================================
logfile for mpd with pid 4166
cluster-master_45346: mpd_uncaught_except_tb handling:
exceptions.IndexError: list index out of range
/opt/intel/impi/3.2.0.011/bin64/mpd.py 132 pin_Join_list
list.append(l1+l2+l3)
/opt/intel/impi/3.2.0.011/bin64/mpd.py 421 pin_CpuList
ordids = pin_Join_list(info['pack_id'],info['core_id'],info['thread_id'],space)
/opt/intel/impi/3.2.0.011/bin64/mpd.py 2535 run_one_cli
self.PinList = pin_CpuList(gl_envvars, self.PinCase, self.PinSpace,self.CpuInfo,len(self.RanksToBeRunHere))
/opt/intel/impi/3.2.0.011/bin64/mpd.py 2369 do_mpdrun
rv = self.run_one_cli(lorank,msg)
/opt/intel/impi/3.2.0.011/bin64/mpd.py 1605 handle_console_input
self.do_mpdrun(msg)
/opt/intel/impi/3.2.0.011/bin64/mpdlib.py 613 handle_active_streams
handler(stream,*args)
/opt/intel/impi/3.2.0.011/bin64/mpd.py 1262 runmainloop
rv = self.streamHandler.handle_active_streams(timeout=8.0)
/opt/intel/impi/3.2.0.011/bin64/mpd.py 1231 run
self.runmainloop()
/opt/intel/impi/3.2.0.011/bin64/mpd.py 2762 ?
mpd.run()
====================================================
====================================================
========== mpd2.logfile_user_090519.110350_3668 ===============
====================================================
logfile for mpd with pid 3704
cluster-master_37151: mpd_uncaught_except_tb handling:
exceptions.IndexError: list index out of range
/opt/intel/impi/3.2.0.011/bin64/mpd.py 132 pin_Join_list
list.append(l1+l2+l3)
/opt/intel/impi/3.2.0.011/bin64/mpd.py 421 pin_CpuList
ordids = pin_Join_list(info['pack_id'],info['core_id'],info['thread_id'],space)
/opt/intel/impi/3.2.0.011/bin64/mpd.py 2535 run_one_cli
self.PinList = pin_CpuList(gl_envvars, self.PinCase, self.PinSpace,self.CpuInfo,len(self.RanksToBeRunHere))
/opt/intel/impi/3.2.0.011/bin64/mpd.py 2369 do_mpdrun
rv = self.run_one_cli(lorank,msg)
/opt/intel/impi/3.2.0.011/bin64/mpd.py 1605 handle_console_input
self.do_mpdrun(msg)
/opt/intel/impi/3.2.0.011/bin64/mpdlib.py 613 handle_active_streams
handler(stream,*args)
/opt/intel/impi/3.2.0.011/bin64/mpd.py 1262 runmainloop
rv = self.streamHandler.handle_active_streams(timeout=8.0)
/opt/intel/impi/3.2.0.011/bin64/mpd.py 1231 run
self.runmainloop()
/opt/intel/impi/3.2.0.011/bin64/mpd.py 2762 ?
mpd.run()
====================================================
OS: SuSE Linux Enterprise 11
I have set the environment variable I_MPI_CPUINFO="/proc/cpuinfo", and executing "mpdboot -n 2 -f ./mpd.hosts" works fine (my mpd.hosts is shown below).
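For reference, an mpd.hosts for a two-node ring like mine would simply list the hostnames, one per line (the contents below are assumed, matching the two node names in this thread):
cluster-master
cluster-slave1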
Please help me. Thanks!
Hi there,
It seems your problem is related to the ssh connection. (By default, the Intel MPI Library uses an rsh connection.)
You should configure your authentication so that you can log in to all servers without a password.
To set up the public key authentication method, follow these steps:
1. Public key generation
local> ssh-keygen -t dsa -f .ssh/id_dsa
When asked for a passphrase, leave it empty by just pressing Enter.
Two new files, id_dsa and id_dsa.pub, are created in the .ssh directory. The latter is the public part.
2. Public key distribution to remote nodes
Go to the .ssh directory. Copy the public key to the remote machine.
local> cd .ssh
local> scp id_dsa.pub user@remote:~/.ssh/id_dsa.pub
Log in to the remote machine and go to the .ssh directory on the remote side.
local> ssh user@remote
remote> cd .ssh
Add the client's public key to the known public keys on the remote server.
remote> cat id_dsa.pub >> authorized_keys2
remote> chmod 640 authorized_keys2
remote> rm id_dsa.pub
remote> exit
Next time you log into the remote server, no password will be asked.
Note that ssh setup depends on the ssh client distribution.
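As a shortcut, if your ssh client ships the ssh-copy-id helper (an assumption; it is not part of the steps above), the copy-and-append can be done in one command:
local> ssh-copy-id -i ~/.ssh/id_dsa.pub user@remote
Either way, a plain "ssh user@remote" should now log you in without a password prompt.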
Hope this helps.
Best wishes.
Dmitry
Thanks for your help.
My cluster has just two nodes (cluster-master & cluster-slave1).
Now I can ssh from the slave to the master without a password, and likewise from the master to the slave.
But I still get the same problem.
Thanks very much for your help.
Hello camiyu917,
Thanks for the output from the mpd logfile. Which version of the Intel MPI Library do you have installed? Is it Intel MPI Library 3.2, which was released in November 2008? (This information is available in the mpisupport.txt file in the installation directory.)
If yes, this might be a known incompatibility between the Intel MPI Library and the latest version of OpenSuSE 11.1 (similar to SLES 11). You can get more details at the following forum thread.
You have two options to resolve this issue:
- Upgrade to the latest Intel MPI Library 3.2 Update 1, available for download from the Intel Registration Center.
- Set the environment variable I_MPI_CPUINFO to proc. You can do so by running: export I_MPI_CPUINFO=proc (see the sketch below).
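For example, the whole workaround in one session might look like this (a sketch; the hostfile name comes from your first post, and ./your_app is a placeholder for whatever binary you launch):
export I_MPI_CPUINFO=proc
mpdboot -n 2 -f ./mpd.hosts
mpiexec -n 2 ./your_app
mpdallexit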
Let us know if either of those options work out for you.
Regards,
~Gergana
I installed the Intel Cluster Toolkit Compiler Edition for Linux 3.2.020.
The problem is solved; it was caused by the cpuinfo environment variable.
After executing "export I_MPI_CPUINFO=proc", everything works.
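To make the setting persist across sessions, it can also go into the shell startup file (a sketch, assuming a bash login shell; adjust the file for your shell):
echo 'export I_MPI_CPUINFO=proc' >> ~/.bashrc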
Now I can run Intel MPI on SuSE Linux Enterprise 11.
Thanks very much for your help ~
You are very welcome! I'm glad things worked out for you. There will be an update to the Intel Cluster Toolkit Compiler Edition 3.2 package sometime over the summer, which will include a fix to this issue. If interested, keep an eye out on the forums since we'll be making an announcement when the new version is out.
Regards,
~Gergana
