Intel® MPI Library
Get help with building, analyzing, optimizing, and scaling high-performance computing (HPC) applications.

mpirun 2019 hangs

yzhao
Beginner

My program, which depends on Intel MPI 2019, hangs during communication. I'm testing on three machines: two running RHEL 7.6 and one running RHEL 7.

 

In testing, I eliminated external dependencies and used only the MKL examples. All pblas examples produce correct results; however, the cluster_sparse_solver examples fail to run across multiple machines.

For local testing, I use the command line: mpirun -n 3 ./cl_solver_unsym_c

For multi-node testing, the command line is: mpirun -n 3 -ppn 1 -hosts host1,host2,host3 ./cl_solver_unsym_c

 

When using FI_PROVIDER=sockets, I see output from MKL, but the program then hangs with the call stack shown in the attached screenshots:

[Attachments: P1.png, P2.png]

When using FI_PROVIDER=tcp, no output from MKL is observed, and the program hangs with the call stack shown in the attached screenshots:

[Attachments: P4.png, P3.png]
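For anyone trying to reproduce this, here is a minimal sketch of the runs described above, with Intel MPI's startup debug output enabled so the selected libfabric provider is visible. The binary name and host names come from the post; FI_PROVIDER and I_MPI_DEBUG are standard libfabric / Intel MPI environment variables.

```shell
# Pin the libfabric provider explicitly and raise the Intel MPI debug
# level so the chosen provider and fabric details are printed at startup.
export FI_PROVIDER=sockets   # or: tcp
export I_MPI_DEBUG=10

# Local run: 3 ranks on one machine.
mpirun -n 3 ./cl_solver_unsym_c

# Multi-node run: 1 rank per node across the three hosts.
mpirun -n 3 -ppn 1 -hosts host1,host2,host3 ./cl_solver_unsym_c
```

Comparing the I_MPI_DEBUG output of a working pblas run against a hanging cluster_sparse_solver run can help narrow down whether the hang is provider-related.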

1 Reply
TobiasK
Moderator

@yzhao Sorry, but I can only provide help for the latest version of Intel MPI, not 2019. Please try the latest available version, 2021.14.1.
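Since the suggested fix is to move to a newer release, a quick way to confirm which Intel MPI version a shell actually picks up (the setvars.sh path below assumes a default oneAPI install location and may differ on your systems):

```shell
# Load the Intel oneAPI environment (path assumed; adjust to your install).
source /opt/intel/oneapi/setvars.sh

# Print the Intel MPI Library version currently on PATH.
mpirun --version
```

Running this on all three nodes also verifies that every host resolves the same MPI installation, which is worth checking before retesting the multi-node case.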
