Intel® oneAPI HPC Toolkit
Get help with building, analyzing, optimizing, and scaling high-performance computing (HPC) applications.

No route to host, Network is unreachable

lmh
Beginner

I recently built my own cluster and tried to use Intel MPI, but I ran into some trouble.

My cluster has 8 nodes, each with an Intel Core i9 CPU (Intel(R) Core(TM) i9-10900K CPU @ 3.70GHz), connected through a wired hub.

 

To check whether it works, I wrote the following test program:

 

//hello_mpi.c

#include <mpi.h>
#include <stdio.h>

int main(int argc, char** argv)
{
    MPI_Init(NULL, NULL);

    int world_size;
    MPI_Comm_size(MPI_COMM_WORLD, &world_size);

    int world_rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);

    char processor_name[MPI_MAX_PROCESSOR_NAME];
    int name_len;
    MPI_Get_processor_name(processor_name, &name_len);

    printf("Hello world from processor %s, rank %d out of %d processors\n",
           processor_name, world_rank, world_size);

    MPI_Finalize();
    return 0;
}

 

 

I compile the code with this command:

mpiicc ./hello_mpi.c -o hello_mpi

 

and it compiles without errors.

 

When I run this code with this command:

mpirun ./hello_mpi

 

It prints out the results:

Hello world from processor node0, rank 2 out of 10 processors
Hello world from processor node0, rank 7 out of 10 processors
Hello world from processor node0, rank 1 out of 10 processors
Hello world from processor node0, rank 3 out of 10 processors
Hello world from processor node0, rank 4 out of 10 processors
Hello world from processor node0, rank 5 out of 10 processors
Hello world from processor node0, rank 9 out of 10 processors
Hello world from processor node0, rank 6 out of 10 processors
Hello world from processor node0, rank 8 out of 10 processors
Hello world from processor node0, rank 0 out of 10 processors

 

 

But when I try to run the same code across multiple nodes with this command:

mpirun -host node0,node1,node2,node3,node4,node5,node6,node7  ./hello_mpi

 

It prints some of the output, followed by error messages:

Hello world from processor node0, rank 2 out of 80 processors

....

....

 

Other MPI error, error stack:
PMPI_Finalize(214)...............: MPI_Finalize failed
PMPI_Finalize(159)...............:
MPID_Finalize(1280)..............:
MPIDI_OFI_mpi_finalize_hook(1807):
MPIR_Reduce_intra_binomial(142)..:
MPIC_Send(131)...................:
MPID_Send(771)...................:
MPIDI_send_unsafe(220)...........:
MPIDI_OFI_send_normal(398).......:
MPIDI_OFI_send_handler_vci(647)..: OFI tagged send failed (ofi_impl.h:647:MPIDI_OFI_send_handler_vci:No route to host)
Hello world from processor node5, rank 59 out of 80 processors
Abort(810115343) on node 10 (rank 10 in comm 0): Fatal error in PMPI_Finalize: Other MPI error, error stack:
PMPI_Finalize(214)...............: MPI_Finalize failed
PMPI_Finalize(159)...............:
MPID_Finalize(1280)..............:
MPIDI_OFI_mpi_finalize_hook(1807):
MPIR_Reduce_intra_binomial(142)..:
MPIC_Send(131)...................:
MPID_Send(771)...................:
MPIDI_send_unsafe(220)...........:
MPIDI_OFI_send_normal(398).......:
MPIDI_OFI_send_handler_vci(647)..: OFI tagged send failed (ofi_impl.h:647:MPIDI_OFI_send_handler_vci:Network is unreachable)
Hello world from processor node5, rank 56 out of 80 processors
Abort(810115343) on node 30 (rank 30 in comm 0): Fatal error in PMPI_Finalize: Other MPI error, error stack:
PMPI_Finalize(214)...............: MPI_Finalize failed
PMPI_Finalize(159)...............:
MPID_Finalize(1280)..............:
MPIDI_OFI_mpi_finalize_hook(1807):
MPIR_Reduce_intra_binomial(142)..:
MPIC_Send(131)...................:
MPID_Send(771)...................:
MPIDI_send_unsafe(220)...........:
MPIDI_OFI_send_normal(398).......:
MPIDI_OFI_send_handler_vci(647)..: OFI tagged send failed (ofi_impl.h:647:MPIDI_OFI_send_handler_vci:No route to host)
Abort(810115343) on node 50 (rank 50 in comm 0): Fatal error in PMPI_Finalize: Other MPI error, error stack:
PMPI_Finalize(214)...............: MPI_Finalize failed
PMPI_Finalize(159)...............:
MPID_Finalize(1280)..............:
MPIDI_OFI_mpi_finalize_hook(1807):
MPIR_Reduce_intra_binomial(142)..:
MPIC_Send(131)...................:
MPID_Send(771)...................:
MPIDI_send_unsafe(220)...........:
MPIDI_OFI_send_normal(398).......:
MPIDI_OFI_send_handler_vci(647)..: OFI tagged send failed (ofi_impl.h:647:MPIDI_OFI_send_handler_vci:No route to host)
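Every stack trace ends in the OFI (libfabric) send path, which suggests the failure is in the network provider rather than in MPI itself. A sketch of settings that might narrow this down, assuming the default provider selection is at fault (FI_PROVIDER and I_MPI_DEBUG are documented Intel MPI / libfabric environment variables, but whether they help with this particular failure is only a guess):

```shell
# Sketch only: pin libfabric to its TCP provider and raise Intel MPI's
# debug verbosity, then rerun the job across the nodes.
export FI_PROVIDER=tcp    # restrict libfabric to the tcp provider
export I_MPI_DEBUG=5      # print provider/transport selection at startup
# mpirun -host node0,node1,node2,node3,node4,node5,node6,node7 ./hello_mpi
```

With I_MPI_DEBUG=5, the startup banner reports which provider each rank selected, which would at least show whether all nodes agree on the transport.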

 

Strangely, when I run the code with just node3 and node4, it works, without any error message.
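Since only some node pairs fail, per-node name resolution or routing is a plausible suspect. A minimal sketch of the checks, assuming node0–node7 are the host names used in the mpirun command and should resolve identically on every node:

```shell
# Hypothetical check, run from each node in turn: does every peer name
# resolve, and to which address? Mismatched or missing entries in
# /etc/hosts would explain "No route to host" between specific pairs.
report=""
for n in node1 node2 node3 node4 node5 node6 node7; do
    if addr=$(getent hosts "$n"); then
        report="$report$n -> $addr
"
    else
        report="$report$n: does not resolve
"
    fi
done
printf '%s' "$report"
# ping -c 1 node1    # reachability check; needs the real cluster
```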

 

Does anyone have any insight into what the problem might be?

Thank you very much for your help.

 

 

AbhishekD_Intel
Moderator

Hi,


Thanks for reaching out to us.

We can see that there are two identical threads from you on this forum, so we will no longer monitor this thread. Please expect a reply on the other thread. Follow the link below for more details.

https://community.intel.com/t5/Intel-oneAPI-HPC-Toolkit/mpi-machine-file-and-host-command-do-not-wor...


We will look into the issue and will get back to you as soon as possible.



Warm Regards,

Abhishek


lmh
Beginner

The link you mention is the thread that I wrote earlier. Because I was worried about a misunderstanding, I wrote this thread again with some progress.
