I have two nodes, both with Windows XP and the Intel MPI Runtime Environment installed.

- smpd is running on both nodes
- the nodes are connected via LAN, both in the network 192.168.145.*
- all firewalls are disabled, pinging works

I've got a simple pingpong program on both nodes. When I execute it with two processes on one node, it works fine, but executing it with one process per node doesn't work: it deadlocks in the first communication (in my case a send and a recv). What does work is MPI_Init and the transfer of stdout from both nodes to the node where mpiexec was run. I've set I_MPI_DEVICE to sock.

Here are the two commands, both run on 192.168.145.132:

working: mpiexec -hosts 2 192.168.145.132 192.168.145.132 -env I_MPI_DEVICE sock "C:\pingpong.exe"
not working: mpiexec -hosts 2 192.168.145.132 192.168.145.134 -env I_MPI_DEVICE sock "C:\pingpong.exe"

So the problem seems to be the communication between the two nodes, but I don't know why. Does anyone have an idea what the problem is?
What is the library version? If you are using Intel MPI Library 4.0, you should use "I_MPI_FABRICS tcp" instead of "I_MPI_DEVICE sock".
Could you also check the library versions in the following runs:

mpiexec -hosts 1 192.168.145.132 2 -genv I_MPI_DEBUG 9 "C:\pingpong.exe"
mpiexec -hosts 1 192.168.145.134 2 -genv I_MPI_DEBUG 9 "C:\pingpong.exe"
The library version is 4.0.0.012, but using I_MPI_FABRICS instead of I_MPI_DEVICE doesn't change the deadlock behavior. With the specified runs it says "Intel MPI Library, Version 4.0 Build 20100218" on both nodes.
Just one note: using "-genv I_MPI_FABRICS tcp" means that shared memory will not be used. To improve performance, either use the default settings (don't set I_MPI_FABRICS) or set "-genv I_MPI_FABRICS shm:tcp".
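Applied to the failing run above, the suggested setting would look like this (same hosts and executable path as in the question; this is a command-line sketch, not a verified fix for the deadlock):

```shell
# shm:tcp — shared memory between processes on the same node,
# TCP sockets between the two nodes
mpiexec -hosts 2 192.168.145.132 192.168.145.134 -genv I_MPI_FABRICS shm:tcp "C:\pingpong.exe"
```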