I'm using Intel MPI 2021.5.1 on Red Hat 8.5 with an NFS file system. My simple Hello World program fails with many error messages. Any help would be appreciated.
Here is the program:
#include <iostream>
#include <mpi.h>
#include <unistd.h>   // gethostname

int main(int argc, char *argv[])
{
    int rank, world_size, error_codes[1];
    char hostname[128];
    MPI_Comm intercom;
    MPI_Info info;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &world_size);
    gethostname(hostname, 127);
    std::cout << "Hello from process on " << hostname << std::endl;
    MPI_Finalize();
    return 0;
}
I sourced /opt/intel/oneapi/setvars.sh before building the executable, and
here is the run command:
$ export I_MPI_PIN_RESPECT_CPUSET=0; mpirun ./parent_simple
Here are the abridged error messages. I eliminated many repetitions:
[1646934040.094426] [rocci:306332:0] ib_verbs.h:84 UCX ERROR ibv_exp_query_device(mlx5_0) returned 95: Operation not supported
[1646934040.113276] [rocci:306320:0] select.c:434 UCX ERROR no active messages transport to <no debug data>: self/memory - Destination is unreachable, rdmacm/sockaddr - no am bcopy
Abort(1090703) on node 0 (rank 0 in comm 0): Fatal error in PMPI_Init: Other MPI error, error stack:
MPIR_Init_thread(143)........:
MPID_Init(1310)..............:
MPIDI_OFI_mpi_init_hook(1974): OFI get address vector map failed
[1646934040.113302] [rocci:306315:0] select.c:434 UCX ERROR no active messages transport to <no debug data>: self/memory - Destination is unreachable, rdmacm/sockaddr - no am bcopy
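For anyone who wants to condense a log like this without editing it by hand, here is a minimal sketch. It assumes every UCX line starts with a "[timestamp] [host:pid:tid]" prefix, as in the lines above; the function name summarize_ucx_log is hypothetical.

```shell
#!/bin/sh
# Collapse repeated UCX log lines: strip the "[timestamp] [host:pid:tid]"
# prefix, then count identical messages, most frequent first.
# Reads the files given as arguments, or standard input if there are none.
summarize_ucx_log() {
    sed -E 's/^\[[0-9.]+\] +\[[^]]*\] +//' "$@" | sort | uniq -c | sort -rn
}
```

It could be used as `mpirun ./parent_simple 2>&1 | summarize_ucx_log`. Lines without the prefix, such as the MPI error stack, pass through unchanged and are counted as well.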
Hi,
Thanks for posting in Intel Communities.
We are unable to reproduce your issue on our end. With your sample reproducer code, we got the expected results.
We followed the steps below using the latest Intel MPI 2021.5 on a Linux machine.
To compile:
mpiicc -o hello hello.cpp
To run the MPI program:
export I_MPI_PIN_RESPECT_CPUSET=0; mpirun -bootstrap ssh -n 1 -ppn 1 ./hello
Could you please tell us which OS and FI provider you are using?
Please also run Intel Cluster Checker with the command below and share the complete log file with us:
clck -f ./<nodefile> -F mpi_prereq_user
The attached screenshot shows the expected results.
Thanks & Regards,
Varsha
Hi Kurt,
Have you run Cluster Checker as Varsha suggested? Could you please also do the following and share the detailed results, including the complete log files?
- Share the output of ucx_info -d and fi_info -v
- Run the code with debug output enabled using I_MPI_DEBUG=10 and FI_LOG_LEVEL=debug
- Run the code with tcp as your OFI provider (FI_PROVIDER=tcp) and debug output enabled
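The steps above could be scripted roughly as follows. This is only a sketch: the helper names run and collect, the DRY_RUN and BIN variables, and the "-n 1" process count are assumptions for illustration, and the executable name is taken from the original post. By default it only prints the commands; set DRY_RUN=0 to execute them.

```shell
#!/bin/sh
# Hypothetical helper that gathers the requested diagnostics.
BIN="${BIN:-./parent_simple}"   # executable name from the original post
DRY_RUN="${DRY_RUN:-1}"         # 1 = only print each command (default)

# Print the command line; execute it only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

collect() {
    # Device and provider inventory
    run ucx_info -d
    run fi_info -v
    # Default provider, with Intel MPI and libfabric debug output
    run env I_MPI_DEBUG=10 FI_LOG_LEVEL=debug mpirun -n 1 "$BIN"
    # Same run, forced onto the tcp provider
    run env I_MPI_DEBUG=10 FI_LOG_LEVEL=debug FI_PROVIDER=tcp mpirun -n 1 "$BIN"
}

collect
```

To capture everything in one file for attaching to the thread, the last line could be changed to `collect > diagnostics.log 2>&1`.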
Thanks,
Xiao
Hi Kurt,
We have not heard back from you with the additional information, so we will close this thread. If you require additional assistance from Intel, please start a new thread. Any further interaction in this thread will be considered community only.
Best,
Xiao