Intel® MPI Library

mpirun 2019 hangs

yzhao
Beginner

My program, which depends on Intel MPI 2019, hangs during communication. I'm testing on 3 machines: two with RHEL7.6 and one with RHEL7.

To narrow this down, I removed all external dependencies and used only the MKL examples. The pblas examples all yield correct results. However, the cluster_sparse_solver examples fail to run on multiple machines.

For local testing, I use this command line: mpirun -n 3 ./cl_solver_unsym_c

For multi-node testing, the command line is: mpirun -n 3 -ppn 1 -hosts host1,host2,host3 ./cl_solver_unsym_c

When using FI_PROVIDER=sockets, the program prints some output and then hangs at the call stack shown in these screenshots:

[Screenshots: P1.png, P2.png]

When using FI_PROVIDER=tcp, no output from MKL appears, and the program hangs at the call stack shown in these screenshots:

[Screenshots: P4.png, P3.png]
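
For anyone trying to reproduce this, here is a minimal sketch of how the two provider runs could be scripted side by side. It is not part of my original test setup. The names DRY_RUN and run_with_provider are hypothetical, host1,host2,host3 are placeholders, and the 120-second limit is an arbitrary choice. It assumes GNU timeout is installed so that a hung run ends on its own, and it sets I_MPI_DEBUG=5 so that Intel MPI prints startup details such as the provider in use. By default it only prints the commands it would run.

```shell
#!/bin/sh
# Sketch: run the same MKL example once per libfabric provider.
# Assumptions: HOSTS is a placeholder host list; DRY_RUN and
# run_with_provider are hypothetical names chosen for this example.

HOSTS="${HOSTS:-host1,host2,host3}"
EXE="${EXE:-./cl_solver_unsym_c}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, do not launch anything

run_with_provider() {
    provider="$1"
    # Show the command that corresponds to this provider.
    echo "FI_PROVIDER=$provider I_MPI_DEBUG=5 timeout 120 mpirun -n 3 -ppn 1 -hosts $HOSTS $EXE"
    if [ "$DRY_RUN" != "1" ]; then
        # timeout kills the launcher after 120 s and returns 124 if the run hung.
        env FI_PROVIDER="$provider" I_MPI_DEBUG=5 \
            timeout 120 mpirun -n 3 -ppn 1 -hosts "$HOSTS" "$EXE"
        echo "exit status for $provider: $?"
    fi
}

for p in sockets tcp; do
    run_with_provider "$p"
done
```

Running it with DRY_RUN=0 launches both jobs for real. An exit status of 124 means the run was killed by the timeout, which here would correspond to the hang.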

TobiasK
Moderator

@yzhao Sorry, but I can only provide help with the latest version of Intel MPI, not 2019. Please try the latest available version, 2021.14.1.
