
MPI communication from Phi to host

j0e
New Contributor I

I have set up the Phi card for MPI following http://software.intel.com/en-us/articles/using-the-intel-mpi-library-on-intel-xeon-phi-coprocessor-systems, which was posted in March 2013.

I can run MPI code on the host when launching from the host, and on the coprocessor when launching from the coprocessor. However, when I try to launch from the host onto the coprocessor (or onto host + coprocessor), I get the following error:

$ mpirun -hosts Axial-mic0 -n 2 ~/testMPI-MIC.MIC
[proxy:0:0@Axial-mic0.localdomain] HYDU_sock_connect (./utils/sock/sock.c:241): unable to connect from "Axial-mic0.localdomain" to "172.31.1.254" (No route to host)
[proxy:0:0@Axial-mic0.localdomain] main (./pm/pmiserv/pmip.c:353): unable to connect to server 172.31.1.254 at port 32873 (check for firewalls!)

This looks like a simple network problem, but I have not been able to solve it yet. Any suggestions would be appreciated. Thanks!

-joe
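
One way to narrow down an error like the one above is to probe the reported address directly and look at the failure reason. The following is only a minimal sketch, assuming a Python interpreter is available on the machine you test from; `probe` is a hypothetical helper name, not part of Intel MPI. Note that the port in the log (32873) is chosen per run, so a later probe of that exact port will normally be refused; the interesting case is when even a known-open port such as ssh reports "No route to host", which usually points at a firewall rule rejecting the traffic rather than at routing.

```python
import socket


def probe(host, port, timeout=3.0):
    """Try a TCP connection and return (ok, reason).

    ok is True when the connection succeeds; otherwise reason holds the
    operating system's error text (for example "Connection refused" or
    "No route to host").
    """
    try:
        # create_connection resolves the host and opens a TCP socket
        with socket.create_connection((host, port), timeout=timeout):
            return True, "connected"
    except OSError as exc:
        # Covers refused connections, unreachable hosts, and timeouts
        return False, str(exc)
```

For example, `probe("172.31.1.254", 22)` run from the coprocessor side would show whether the host address from the log is reachable on the ssh port (assuming sshd listens on port 22 there).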

Mian_L_
Beginner

Hi Joe,

I once had the same problem. I turned off all firewalls and the problem was solved. BTW, once you fix your issue, can you tell me what bandwidth you get from host memory to the MIC cards using MPI? I currently don't know why the bandwidth on my server is unreasonably low.

Thanks,
Mian 

j0e
New Contributor I

Thanks Mian! It was a firewall problem.  

Does anyone know which ports need to be left open, or how I can set up a trust relationship between the host and the coprocessor?

So, I can now invoke MPI runs on both the CPU and coprocessor from the host, but when I try to run code on BOTH processor and coprocessor at the same time, it just hangs.  That is:

$ mpirun -hosts Axial -n 8 ~/fProjects/testMPI-MIC/testMPI-MIC [OK]

$ mpirun -hosts Axial-mic0 -n 10 ~/fProjects/testMPI-MIC/testMPI-MIC.MIC [OK]

$ mpirun -hosts Axial -n 8 ~/fProjects/testMPI-MIC/testMPI-MIC : -hosts Axial-mic0 -n 10 ~/fProjects/testMPI-MIC/testMPI-MIC.MIC [HANGS]

Since no errors are reported, it is harder to determine the cause. I'm running Update 2 of the MPSS. Suggestions welcome!
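
When a launch hangs without printing anything, it can help to run it under a time limit and keep whatever output was produced before the limit expired. The following is only a sketch of that idea, not a fix for the hang; `run_with_timeout` and its parameters are hypothetical names introduced here for illustration.

```python
import os
import subprocess


def run_with_timeout(cmd, timeout_s=60, extra_env=None):
    """Run cmd (a list of arguments) and return (status, output).

    status is the integer exit code, or the string "timeout" when the
    command did not finish within timeout_s seconds.
    """
    env = dict(os.environ)
    env.update(extra_env or {})  # optional extra environment variables
    try:
        done = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # merge stderr into stdout
            env=env,
            timeout=timeout_s,
            text=True,
        )
        return done.returncode, done.stdout
    except subprocess.TimeoutExpired as exc:
        # Partial output may arrive as bytes, str, or None
        out = exc.output or ""
        if isinstance(out, bytes):
            out = out.decode(errors="replace")
        return "timeout", out
```

The hanging command could be passed as an argument list, with `~` expanded via `os.path.expanduser` since no shell is involved. Intel MPI reads the `I_MPI_DEBUG` environment variable for more verbose output, so something like `extra_env={"I_MPI_DEBUG": "5"}` may reveal where startup stalls; the value 5 is an illustrative choice. Only the direct child is killed on timeout, so remote MPI processes may need to be cleaned up by hand.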
