Intel® MPI Library

Unable to run mpi across two hosts

Kylynn
Beginner

Here is my issue.

I have two Windows machines with the same Windows username and password. On each of them I have done the following:

         a.  Installed the Intel MPI Library.

         b.  Ran setvars.bat.

         c.  Ran hydra_service -install and hydra_service -start.

         d.  Ran mpiexec -register.

Then I want to run the test MPI program:

1. When I run mpiexec -validate, both hosts report success:

Kylynn_0-1649662078806.png

2. When I run the MPI test program on each host separately, both runs succeed:

Kylynn_1-1649662175401.png
Kylynn_2-1649662192999.png

(The logs differ, but the file and its path are the same on both hosts.)

3. The problem is that when I run MPI across the two hosts, it just hangs:

Kylynn_3-1649662303590.png

4. When I enable debug output, it shows:

Kylynn_4-1649662356072.png

 

It seems that MPI initialization does not complete and just gets stuck there. What can I do about that?

help ~~~~

Kylynn
Beginner

When I compare the debug output with that of a successful run (on just one host), the hanging run seems to be stuck in MPI startup(), here:

Kylynn_0-1649674001772.png

 

VarshaS_Intel
Moderator

Hi,

Thanks for posting in Intel Communities.

Could you please let us know whether you are able to run the command below?

mpiexec -n 2 -ppn 1 -hosts host1,host2 hostname
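
If the command hangs instead of printing the two host names, wrapping it in a timeout makes that easy to see. Below is a minimal sketch, assuming Python 3.7 or later is installed on the machine that launches the job. run_with_timeout is a hypothetical helper name, and host1,host2 are the placeholders from the command above.

```python
import subprocess


def run_with_timeout(cmd, timeout_s=30.0):
    """Run cmd (a list of arguments) and return (status, output).

    status is "ok" (exit code 0), "failed" (non-zero exit code),
    or "timeout" (the command did not finish within timeout_s seconds).
    """
    try:
        proc = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # merge stderr into stdout
            text=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired as exc:
        # The launched process is killed on timeout; keep any partial output.
        partial = exc.output or ""
        if isinstance(partial, bytes):
            partial = partial.decode(errors="replace")
        return "timeout", partial
    status = "ok" if proc.returncode == 0 else "failed"
    return status, proc.stdout
```

For example, run_with_timeout(["mpiexec", "-n", "2", "-ppn", "1", "-hosts", "host1,host2", "hostname"]) should return "ok" together with both host names if the launch works, and "timeout" if it hangs. If mpiexec is not on the PATH, subprocess.run raises FileNotFoundError instead.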

 

Could you please provide us with the cluster details you are using? And also, could you please confirm whether the firewall is disabled across all the nodes in the cluster?
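
One rough way to check whether a firewall is blocking traffic between the two nodes is to test a plain TCP connection in both directions. The sketch below assumes Python 3 is available on both machines. listen_once and can_connect are hypothetical helper names, and the port number used in the usage note is an arbitrary choice for illustration, not a port that Intel MPI is known to use.

```python
import socket


def listen_once(port, bind_addr="0.0.0.0", timeout=60.0):
    """Wait for a single inbound TCP connection; return True if one arrives."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as srv:
        srv.settimeout(timeout)
        srv.bind((bind_addr, port))
        srv.listen(1)
        try:
            conn, _peer = srv.accept()
        except socket.timeout:
            # Nobody connected before the timeout expired.
            return False
        conn.close()
        return True


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

To use it, call listen_once(50000) on host1 and can_connect("host1", 50000) on host2, then swap the roles. A False result in either direction suggests that something between the hosts is blocking the connection. A True result only shows that this one port is reachable, not that every port MPI needs is open.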

 

We recommend going through the article linked below. It has not been updated recently, so you can ignore the product versions it mentions.

https://www.intel.com/content/www/us/en/developer/articles/training/micro-cluster-setup-with-intel-mpi-a-stepwise-guide-to-setting-up-the-smallest-windows.html

Please make sure that you have followed all the guidelines in that article.

 

Thanks & Regards,

Varsha

 

Kylynn
Beginner

Thank you for your reply (^ ^)~

Kylynn_0-1649841261005.png

This command runs fine.

I had already read the article you mentioned, and I followed it as written.

Here is my updated situation:

1. I have two computers with the IP addresses 192.168.2.245 and 192.168.2.141. Both run Windows and have the same username and password.

2. I have installed the Intel MPI Library on both computers, started hydra_service, and run mpiexec -register.

3. Then I ran the test code that ships with the Intel MPI Library I downloaded. In this test, rank 0 receives messages and the other processes send them. As you know, it is the hello world test code.

4. It just hangs, as I posted before. Today I modified the test code so that rank 0 sends messages and the other processes receive them. With that change it runs and prints the correct hello world output, but the new problem is that it hangs in MPI_Finalize().

 

But when I do the same on two other computers running Linux, it works fine, so the problem only appears on Windows.  ~~o(T.T)o~~

Could you please give me some tips? Thank you!

VarshaS_Intel
Moderator

Hi,

 

Thanks for providing the information.

 

Could you please share the output you get when building with the "-check_mpi" flag, using the commands below?

 

mpiicc -check_mpi <program name.cpp> -o <program name>
mpiexec -n 2 ./<program name>

Could you please confirm whether you have followed Step 7 (having a shared working directory) in the linked guide?

Could you please let us know how the Windows systems are connected? Also, could you please provide a sample reproducer so that we can investigate your issue further?

 

Thanks & Regards,

Varsha

 

VarshaS_Intel
Moderator

Hi,


We have not heard back from you. Could you please provide us with the details mentioned in the previous reply?


Thanks & Regards,

Varsha


Kylynn
Beginner

Dear Varsha ~

 

Recently I found two other Windows machines and repeated the same steps on them. The result is that it works! MPI can run across them. But I still don't understand why it does not work on my computer.

Anyway, thank you for your help!   (^.^)

       

VarshaS_Intel
Moderator

Hi,


Thanks for the update.


If you need further assistance from us, could you please provide us with the details mentioned in the previous post? If not, can we go ahead and close this thread from our end?


Thanks & Regards,

Varsha


VarshaS_Intel
Moderator

Hi,


We have not heard back from you. This thread will no longer be monitored by Intel. If you need further assistance, please post a new question.


Thanks & Regards,

Varsha

