
Bizarre authenticity of host issue when running across multiple nodes with Intel MPI


I am attempting to run a job across three nodes.  I have configured passwordless ssh and it definitely works between every pair of nodes (each node can ssh to the other two without a password).  The known_hosts file is definitely correct and all 3 nodes have identical .ssh directories.  I have also tried adding the keys to ssh-agent, although I'm not sure that was necessary, as I didn't specify a passphrase when generating the id_rsa key (I know this is terrible security, but it's temporary for the sake of testing).

I can run a job across nodes 1 and 2 simultaneously without any difficulty.  However, if I try to use node 3 as well (or just nodes 1 and 3, or nodes 2 and 3), the terminal is spammed with "The authenticity of host 'node3 (IP of node 3)' can't be established." and there's no way to enter "yes" (even though I shouldn't have to in the first place, as node 3's key is already in the known_hosts file of nodes 1 and 2).

If I try to launch the job from node 3, I receive the same messages in the terminal, but with the hostname/IP of nodes 1 and 2.  I am able to run the job on node 3 alone.
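
The message quoted above names both a hostname and an IP address, so one thing that may be worth ruling out is whether known_hosts has an entry for every name and address the launcher might connect to. The following is only a minimal sketch of such a check, not anything shipped with Intel MPI: the function name missing_hosts is made up for illustration, and it only looks at plain-text entries (hashed entries are skipped; `ssh-keygen -F <host>` can look those up instead).

```shell
# List the hosts that have no plain-text entry in a known_hosts file.
# Hypothetical helper for illustration only.
# Usage: missing_hosts KNOWN_HOSTS_FILE HOST...
missing_hosts() {
    kh_file=$1
    shift
    for host in "$@"; do
        # The first field of each entry is a comma-separated list of names
        # and addresses; look for an exact match on any of them. Comment
        # lines (#), marker lines (@) and hashed entries (|) are skipped.
        if ! awk -v h="$host" '
                $1 !~ /^[#@|]/ {
                    n = split($1, names, ",")
                    for (i = 1; i <= n; i++) if (names[i] == h) found = 1
                }
                END { exit (found ? 0 : 1) }' "$kh_file"; then
            printf '%s\n' "$host"
        fi
    done
}
```

It could be called on each node as, for example, `missing_hosts ~/.ssh/known_hosts node1 node2 node3`, adding whatever addresses appear in the error message to the list. Any name it prints has no plain-text entry in that file.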

Any help would be greatly appreciated, as this has been a real headache.  Clearly there is something I have overlooked, even though the configuration and hardware of these three nodes are almost identical.  I am using Intel MPI 5.0.0.028 and CentOS 6.6.  The nodes are communicating over an InfiniBand interface.  Thanks for any input.


Accepted Solutions

Hey Greg,

Interesting, it seems like you're doing all the correct things.  We ship an sshconnectivity script with the Intel MPI Library install files.  Have you tried running that on your nodes?  It should do all the steps necessary for passwordless ssh setup.

After you untar the l_mpi_p_5.0.0.028.tgz package, in the l_mpi_p_5.0.0.028/ directory, you should see 'sshconnectivity.exp'.  You'll need a file that contains a list of all your nodes:

$ cat machines.LINUX
node1
node2
node3

and you need to provide that to the ssh script:

$ ./sshconnectivity.exp machines.LINUX

It'll prompt you in the correct places for a passphrase (you can leave it blank).
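
Once the script has finished, a quick way to confirm the result is to attempt a non-interactive login to every host in the same machines file. This is just a rough sketch under a few assumptions: check_nodes and the SSH_CMD override are names invented here (they are not part of the Intel MPI package), and the 5-second timeout is arbitrary.

```shell
# Try a non-interactive login to every host listed in a machines file and
# print one "host: ok" or "host: FAILED" line per host.
# Hypothetical helper for illustration only. SSH_CMD may be set to
# substitute another command for ssh (e.g. for a dry run).
check_nodes() {
    machines_file=$1
    failed=0
    while IFS= read -r host || [ -n "$host" ]; do
        # Skip blank lines.
        [ -n "$host" ] || continue
        # BatchMode=yes makes ssh fail instead of prompting for a password,
        # passphrase, or host-key confirmation; -n keeps ssh from consuming
        # the rest of the machines file on stdin.
        if ${SSH_CMD:-ssh} -n -o BatchMode=yes -o ConnectTimeout=5 \
                "$host" true >/dev/null 2>&1; then
            printf '%s: ok\n' "$host"
        else
            printf '%s: FAILED\n' "$host"
            failed=1
        fi
    done < "$machines_file"
    return "$failed"
}
```

Running `check_nodes machines.LINUX` from each node in turn should report "ok" for every host; a FAILED line points at a pair of nodes where ssh would still need to prompt for something.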

Let me know how this works.

~Gergana

gss


Hi Gergana,

Running that script did the trick, and I am able to launch a job across all 3 nodes now!  Thanks very much for your help!

Best regards,

Greg


Glad to hear it worked :)  Have fun with MPI!

~Gergana

gss