Intel® MPI Library

MPI Hello World fails

kmccall882
Beginner

I'm using Intel MPI 2021.5.1 on Red Hat 8.5 with an NFS file system. My simple Hello World program fails with many error messages. Any help would be appreciated.

 

Here is the program:

#include <mpi.h>
#include <unistd.h>    // gethostname()
#include <iostream>

int main(int argc, char *argv[])
{
    int rank, world_size;
    char hostname[128];

    MPI_Init(&argc, &argv);

    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &world_size);

    gethostname(hostname, 127);

    std::cout << "Hello from process on " << hostname << std::endl;

    MPI_Finalize();
    return 0;
}

 

I sourced /opt/intel/oneapi/setvars.sh before building the executable. Here is the run command:

$ export I_MPI_PIN_RESPECT_CPUSET=0; mpirun ./parent_simple
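
For context, the full build-and-run sequence was presumably along these lines; the compile step is not shown in the post, so the mpiicpc command and the source-file name parent_simple.cpp are assumptions:

source /opt/intel/oneapi/setvars.sh
mpiicpc -o parent_simple parent_simple.cpp   # assumed compile command and source-file name
export I_MPI_PIN_RESPECT_CPUSET=0; mpirun ./parent_simple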

 

Here are the abridged error messages; I eliminated many repetitions:

 

[1646934040.094426] [rocci:306332:0] ib_verbs.h:84 UCX ERROR ibv_exp_query_device(mlx5_0) returned 95: Operation not supported

 

[1646934040.113276] [rocci:306320:0] select.c:434 UCX ERROR no active messages transport to <no debug data>: self/memory - Destination is unreachable, rdmacm/sockaddr - no am bcopy

Abort(1090703) on node 0 (rank 0 in comm 0): Fatal error in PMPI_Init: Other MPI error, error stack:
MPIR_Init_thread(143)........:
MPID_Init(1310)..............:
MPIDI_OFI_mpi_init_hook(1974): OFI get address vector map failed
[1646934040.113302] [rocci:306315:0] select.c:434 UCX ERROR no active messages transport to <no debug data>: self/memory - Destination is unreachable, rdmacm/sockaddr - no am bcopy

 

 

VarshaS_Intel
Moderator

Hi,

 

Thanks for posting in Intel Communities.

 

We are unable to reproduce your issue at our end; we tried your sample reproducer code and got the expected results.

 

We followed the steps below using the latest Intel MPI 2021.5 on a Linux machine:

 

1. Compile:
mpiicc -o hello hello.cpp

2. Run:
export I_MPI_PIN_RESPECT_CPUSET=0; mpirun -bootstrap ssh -n 1 -ppn 1 ./hello

 

 

Could you please provide us with the OS details and FI provider you are using?

 

Please run the command below for Intel® Cluster Checker and share the complete log file with us.

 

clck -f ./<nodefile> -F mpi_prereq_user
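
For reference, <nodefile> is a plain-text file listing one hostname per line. A minimal sketch, using the hostname that appears in the error log above plus a hypothetical second node:

rocci
node02

Pass that file to clck with -f exactly as in the command above.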

 

Please find the attached screenshot for the expected results.
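
Based on the reproducer code above, the expected output is one line per rank of the form:

Hello from process on <hostname>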

 

Thanks & Regards,

Varsha

 

 

Xiao_Z_Intel
Employee

Hi Kurt,

 

Have you run the cluster checker as Varsha suggested? Could you please also run the following items and share the detailed results with us, including the complete log files? (The commands are sketched right after the list.)

 

  1. Share the output of ucx_info -d and fi_info -v.
  2. Run the code with the debug options I_MPI_DEBUG=10 and FI_LOG_LEVEL=debug enabled.
  3. Run the code with tcp as your OFI provider (FI_PROVIDER=tcp) and the same debug options enabled.
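
A sketch of those three items as shell commands; the executable name ./parent_simple is taken from the original post, and the log-file names are only illustrative:

# 1. Transport and provider inventory
ucx_info -d > ucx_info.log
fi_info -v > fi_info.log

# 2. Re-run with Intel MPI and libfabric debug output enabled
I_MPI_DEBUG=10 FI_LOG_LEVEL=debug mpirun ./parent_simple > run_debug.log 2>&1

# 3. Same run, forcing the tcp OFI provider
I_MPI_DEBUG=10 FI_LOG_LEVEL=debug FI_PROVIDER=tcp mpirun ./parent_simple > run_tcp.log 2>&1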

 

Thanks,

Xiao

 

Xiao_Z_Intel
Employee

Hi Kurt,


We have not heard back from you regarding the additional information, so we will close this thread. If you require additional assistance from Intel, please start a new thread. Any further interaction in this thread will be considered community only.


Best,

Xiao

