I am attempting to run a job across three nodes. I have configured passwordless ssh and it definitely works between every pair of nodes (each node can ssh to the other two without a password). The known_hosts file is definitely correct and all 3 nodes have identical .ssh directories. I have also tried adding the keys to ssh-agent, although I'm not sure that was necessary, as I didn't specify a passphrase when generating the id_rsa key (I know this is terrible security but it's temporary for the sake of testing).
I can run a job across nodes 1 and 2 simultaneously without any difficulty, however if I try to use node 3 as well (or just nodes 1 and 3, or nodes 2 and 3) then the terminal is spammed with, "The authenticity of host 'node3 (IP of node 3)' can't be established." and there's no way to enter "yes" (even though I shouldn't have to in the first place as node 3's key is already in the known_hosts file of nodes 1 and 2).
If I try to launch the job on node 3, then I receive the same messages in the terminal with the hostname/IP of nodes 1 and 2. I am able to run the job solely on node 3.
Any help would be greatly appreciated as this has been a real headache. Clearly there is something I have overlooked, even though the configuration and hardware of these three nodes are almost identical. I am using Intel MPI 5.0.0.028 and CentOS 6.6. The nodes are communicating over an InfiniBand interface. Thanks for any input.
Hey Greg,
Interesting, it seems like you're doing all the correct things. We ship an sshconnectivity script with the Intel MPI Library install files. Have you tried running that on your nodes? It should do all the steps necessary for passwordless ssh setup.
After you untar the l_mpi_p_5.0.0.028.tgz package, in the l_mpi_p_5.0.0.028/ directory, you should see 'sshconnectivity.exp'. You'll need a file that contains a list of all your nodes:
$ cat machines.LINUX
node1
node2
node3
and you need to provide that to the ssh script:
$ sshconnectivity.exp machines.LINUX
It'll prompt you in the correct places for a passphrase (you can leave it blank).
Let me know how this works.
~Gergana
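Once passwordless ssh is set up, it can help to confirm that every node reaches every other node without any prompt, since the failure described above only showed up for pairs involving node3. The following is a minimal sketch of such a check, not part of Intel MPI or of sshconnectivity.exp. It assumes a machines.LINUX file with one hostname per line, the same user account on all nodes, and that the machine running it can already ssh to each node. The function names list_pairs and check_pairs are made up for illustration.

```shell
#!/bin/sh
# Sketch (assumption: the host file lists one hostname per line).

# Print every ordered "source target" pair from the host file given as $1.
list_pairs() {
    while read -r src; do
        [ -n "$src" ] || continue            # skip blank lines
        while read -r dst; do
            [ -n "$dst" ] || continue
            [ "$src" = "$dst" ] || echo "$src $dst"
        done < "$1"
    done < "$1"
    return 0
}

# For each pair, log in to the source node and attempt a hop to the target.
# BatchMode=yes makes ssh fail instead of asking for a password or passphrase;
# StrictHostKeyChecking=yes makes it fail instead of asking to accept a host key.
check_pairs() {
    list_pairs "$1" | while read -r src dst; do
        # -n keeps the outer ssh from swallowing the loop's stdin.
        if ssh -n -o BatchMode=yes "$src" \
            "ssh -o BatchMode=yes -o StrictHostKeyChecking=yes $dst true" 2>/dev/null
        then
            echo "OK   $src -> $dst"
        else
            echo "FAIL $src -> $dst"
        fi
    done
}
```

After sourcing the file, running `check_pairs machines.LINUX` prints one OK or FAIL line per ordered pair. A FAIL line for a given pair means that hop would have stopped at a prompt (or could not connect at all), which is the situation an MPI launcher cannot recover from because nobody can type "yes".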
Hi Gergana,
Running that script did the trick, and I am able to launch a job across all 3 nodes now! Thanks very much for your help!
Best regards,
Greg
Glad to hear it worked :) Have fun with MPI!
~Gergana