Intel® MPI Library
Get help with building, analyzing, optimizing, and scaling high-performance computing (HPC) applications.

2 nodes on Windows 7 x64 assistance

jimdempseyatthecove
Honored Contributor III
1,356 Views

I installed Parallel Studio XE 2016 Update 2 Cluster Edition on Windows 7 Pro x64 (on two systems) and am attempting to run the MPI test program.

Each system can run the test program on itself, but neither system can launch it on the other.

hydra_service.exe -remove and hydra_service.exe -install both complete correctly.

I can see and copy files from each system to the other (to a shared folder), and the test program is installed in the shared folder.

C:\Test\TestMPI_CPP\TestMPI_CPP>mpiexec.exe -n 8 -hosts thor \Downloads\testmpi_cpp.exe
Hello world: rank 0 of 8 running on Thor
Hello world: rank 1 of 8 running on Thor
Hello world: rank 2 of 8 running on Thor
Hello world: rank 3 of 8 running on Thor
Hello world: rank 4 of 8 running on Thor
Hello world: rank 5 of 8 running on Thor
Hello world: rank 6 of 8 running on Thor
Hello world: rank 7 of 8 running on Thor

C:\Test\TestMPI_CPP\TestMPI_CPP>mpiexec.exe -n 8 -hosts i72600k \Downloads\testmpi_cpp.exe
Error connecting to the Service
[mpiexec@Thor] ..\hydra\utils\sock\sock.c (270): unable to connect from "Thor" to "i72600k" (No error)

C:\Test\TestMPI_CPP\TestMPI_CPP>dir \\i72600k\Downloads\TestMPI_CPP.exe /b
TestMPI_CPP.exe

The results above are the same (with the host names swapped) when run from the other host.

I've run mpiexec -register on both systems.

Any hints would be helpful.

Jim Dempsey

9 Replies
James_T_Intel
Moderator

Do you have any firewalls between the two systems?  Can each ping the other by name?

jimdempseyatthecove
Honored Contributor III

Yes. Both can ping each other by name.

This would have been self-evident by DIR \\name\folder working.

Windows Firewall on both systems has the following entry checked:

Intel(R) MPI Library Process Manager, Intel

Does anything else need to be enabled? (added)

Jim Dempsey

James_T_Intel
Moderator

You will probably need to set an exception for the program you are trying to run as well.  See https://software.intel.com/en-us/articles/firewalls-and-mpi for general steps for working with a firewall.

jimdempseyatthecove
Honored Contributor III

James,

After a bit of work, I found that I had to add a Windows Firewall exception allowing incoming TCP packets on port 8679.

At least the sample test program now runs.
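
For anyone checking the same thing, here is a minimal sketch of a TCP reachability probe for that port. It assumes Python is available on the launching machine; the function name `can_connect` and the default timeout are illustrative choices, not part of Intel MPI. A successful connect only shows that something is listening and the firewall lets the packet through, not that the MPI service is configured correctly.

```python
import socket
import sys


def can_connect(host: str, port: int = 8679, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    Port 8679 is the port that needed a firewall exception in this thread;
    the helper itself is a hypothetical diagnostic, not an Intel MPI tool.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


if __name__ == "__main__":
    # Host names are taken from the command line, e.g. the other node's name.
    for host in sys.argv[1:]:
        state = "reachable" if can_connect(host) else "blocked or not listening"
        print(f"{host}:8679 {state}")
```

Run it from each system with the other system's name as the argument. If the port reports as blocked while ping and file sharing work, the firewall rule is the likely suspect.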

Jim Dempsey

jimdempseyatthecove
Honored Contributor III

Well, this is interesting.

While on system A, I can use mpiexec to run on system A or on system B,
and
while on system B, I can use mpiexec to run on system B or on system A.

From either system, however, I cannot run on both at once.

The documentation gives conflicting information as to the command line for multiple hosts.

Can you provide a correct command line? (run Foo.exe on systems A and B)

Jim Dempsey

James_T_Intel
Moderator

There are several different ways to do that.  Here are a few examples; see https://software.intel.com/en-us/articles/controlling-process-placement-with-the-intel-mpi-library for more.  The article is aimed at Linux*; to run on Windows*, simply substitute mpiexec.hydra for mpirun.

>mpiexec.hydra -n 2 -ppn 1 -hosts sysA,sysB Foo.exe

>type hosts.txt
sysA
sysB
>mpiexec.hydra -n 2 -ppn 1 -f hosts.txt Foo.exe

>type machines.txt
sysA:1
sysB:1
>mpiexec.hydra -n 2 -machinefile machines.txt Foo.exe
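
The three forms above can also be generated from a single host list. The following is only a sketch: the helper names (`hosts_file_text`, `machinefile_text`, `hosts_command`) are hypothetical, and the only options used are the ones shown in the examples above (-n, -ppn, -hosts, -f, -machinefile).

```python
from typing import Dict, List


def hosts_file_text(hosts: List[str]) -> str:
    """Contents of a hosts file for -f: one host name per line."""
    return "".join(f"{host}\n" for host in hosts)


def machinefile_text(slots: Dict[str, int]) -> str:
    """Contents of a machine file for -machinefile: one host:count per line."""
    return "".join(f"{host}:{count}\n" for host, count in slots.items())


def hosts_command(exe: str, hosts: List[str], ppn: int) -> List[str]:
    """Argument list for the -hosts form, with ppn ranks on every host."""
    total_ranks = len(hosts) * ppn
    return [
        "mpiexec.hydra",
        "-n", str(total_ranks),
        "-ppn", str(ppn),
        "-hosts", ",".join(hosts),
        exe,
    ]


if __name__ == "__main__":
    nodes = ["sysA", "sysB"]
    print(" ".join(hosts_command("Foo.exe", nodes, 1)))
    print(hosts_file_text(nodes), end="")
    print(machinefile_text({"sysA": 1, "sysB": 1}), end="")
```

The argument list can be passed to subprocess.run as-is, and the two text helpers produce what hosts.txt and machines.txt contain in the examples above.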

jimdempseyatthecove
Honored Contributor III

The examples show mpiexec, not mpiexec.hydra.

My results:

C:\Program Files (x86)\IntelSWTools>mpiexec.hydra -n 16 -ppn 8 -hosts Thor,i72600k \Downloads\TestMPI_CPP.exe
Error connecting to the Service
[mpiexec@Thor] ..\hydra\utils\sock\sock.c (270): unable to connect from "Thor" to "i72600k" (No error)


C:\Program Files (x86)\IntelSWTools>mpiexec.hydra -n 16 -ppn 8 -hosts Thor,i72600k \Downloads\TestMPI_CPP.exe
Hello world: rank 0 of 16 running on Thor
Hello world: rank 1 of 16 running on Thor
Hello world: rank 2 of 16 running on Thor
Hello world: rank 3 of 16 running on Thor
Hello world: rank 4 of 16 running on Thor
Hello world: rank 5 of 16 running on Thor
Hello world: rank 6 of 16 running on Thor
Hello world: rank 7 of 16 running on Thor
Hello world: rank 8 of 16 running on i72600K
Hello world: rank 9 of 16 running on i72600K
Hello world: rank 10 of 16 running on i72600K
Hello world: rank 11 of 16 running on i72600K
Hello world: rank 12 of 16 running on i72600K
Hello world: rank 13 of 16 running on i72600K
Hello world: rank 14 of 16 running on i72600K
Hello world: rank 15 of 16 running on i72600K

Note: for some reason the first attempt failed and the second succeeded. A third attempt also succeeded.

Any thoughts?

Jim Dempsey

James_T_Intel
Moderator

Do you have anything in the Windows Event Viewer indicating problems at the time of the failed run?

jimdempseyatthecove
Honored Contributor III

If that shows up again, I will look in the logs (on both systems).

I think it may have occurred on the first run after a reboot of one of the systems. I haven't seen it since.

I may also have to re-enable Firewall logging.

Jim Dempsey
