Intel® MPI Library

MPI Fortran program hangs at MPI_RECV

Niyas
Beginner

Hello,

I'm encountering an issue with a large, in-house Fortran MPI code where the program consistently hangs at the MPI_RECV call.

Unfortunately, due to the code's sensitivity and the limitations of this public forum, I'm unable to share the entire code or create a simplified version for demonstration purposes.

Despite these constraints, I would be grateful if you could offer any suggestions or troubleshooting steps that might help me identify and resolve the hanging issue.

Thank you for your time and assistance.

 

TobiasK
Moderator

Hello @Niyas 
You might want to consider priority support, where we offer a direct channel to Intel engineers and a clear way to share confidential data.
https://www.intel.com/content/www/us/en/developer/tools/oneapi/support.html

Without a reproducer or anything to work with, the only advice I can give you is to run with -check_mpi enabled. This should help you to identify some coding errors.
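
For reference, a minimal sketch of where the option goes: -check_mpi is given to mpiexec itself, ahead of the executable name. The rank count and program name below are hypothetical placeholders, and the script only prints the command (a dry run) rather than launching anything.

```shell
#!/bin/bash
# Dry-run sketch: assemble the launch command and print it instead of running it.
# NRANKS and APP are hypothetical placeholders, not values from a real job.
NRANKS=4
APP=./my_app.exe

# -check_mpi is a launcher option, so it is placed before the executable name.
launch_cmd=(mpiexec -check_mpi -np "$NRANKS" "$APP")

# Print the command line; replace the echo with "${launch_cmd[@]}" to actually run it.
cmd_line="${launch_cmd[*]}"
echo "$cmd_line"
```

In a batch script, the same change would be made on the existing mpiexec line, leaving the scheduler directives untouched.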

Niyas
Beginner

Hi @TobiasK 

Thanks for your reply.

Currently, I am using the following script for MPI execution.

#!/bin/bash

#PBS -N RDE3D04MPI
#PBS -o /home/..../Result_out.out
#PBS -e /home/..../Result_err.out
#PBS -l nodes=lc601:ppn=36+lc602:ppn=36+lc603:ppn=36+lc604:ppn=33

cd /home/..../06RPL3DRDEoldMPI
mpiexec -np 141 ./RPL3DRDEMPI.exe

In this file, where should I add -check_mpi?  

Niyas
Beginner

@TobiasK I tried exactly the following line:

mpiexec -check_mpi -np 141 ./RPL3DRDEMPI.exe

But I get the following error:

ERROR: ld.so: object 'libVTmc.so' from LD_PRELOAD cannot be preloaded: ignored.
(the same message is repeated 21 more times)

Please find the attached. 

I turned on tracing by adding "export VT_CHECK_TRACING=on" to the bashrc as follows, but the error still persists:

##intel compiler
source /opt/intel/oneapi/compiler/latest/env/vars.sh
source /opt/intel/oneapi/mpi/2021.5.1/env/vars.sh
export I_MPI_HYDRA_BOOTSTRAP=ssh

export VT_CHECK_TRACING=on
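
The ld.so messages above indicate that the dynamic loader could not locate libVTmc.so. That library appears to ship with Intel Trace Analyzer and Collector rather than with the MPI library itself, so sourcing only the compiler and MPI vars.sh files may not put it on the search path. A rough way to check is sketched below. find_lib is a hypothetical helper written for this example, and it only inspects LD_LIBRARY_PATH, not the ld.so cache or the system default directories.

```shell
#!/bin/bash
# find_lib NAME [SEARCH_PATH]
# Prints the first directory in a colon-separated search path that contains NAME.
# Returns 1 if no directory contains it. SEARCH_PATH defaults to LD_LIBRARY_PATH.
# (Hypothetical helper for illustration; not part of any Intel tool.)
find_lib() {
    local name=$1
    local search_path=${2-${LD_LIBRARY_PATH-}}
    local dir
    local IFS=:
    # Unquoted on purpose: split the search path on ':' via IFS.
    for dir in $search_path; do
        if [ -n "$dir" ] && [ -e "$dir/$name" ]; then
            echo "$dir"
            return 0
        fi
    done
    return 1
}

if lib_dir=$(find_lib libVTmc.so); then
    echo "libVTmc.so found in $lib_dir"
else
    echo "libVTmc.so is not on LD_LIBRARY_PATH; the trace collector environment is probably not loaded"
fi
```

If the library is not found, setting VT_CHECK_TRACING alone is unlikely to help, since that variable configures the checker but does not make the library visible to the loader.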

 

 

 

TobiasK
Moderator
Niyas
Beginner

@TobiasK Thanks for the link.

Can I install the Trace Analyzer directly on the HPC cluster using the link you provided?

We are using CentOS Linux.

TobiasK
Moderator

@Niyas CentOS is not supported by our current release.
https://www.intel.com/content/www/us/en/developer/articles/release-notes/intel-trace-analyzer-and-collector-release-notes-linux.html

 

  • Operating systems:
    • Amazon Linux 2, 2022
    • Debian* 11.x
    • Fedora* 37, 38
    • Rocky 9
    • Red Hat Enterprise Linux* 8.x, 9.x
    • SUSE Linux Enterprise Server* 15SP3, 15SP4
    • Ubuntu* 20.04, 22.04

You may still try to install the package, either as part of the HPC Toolkit or as a standalone component, available here:

https://www.intel.com/content/www/us/en/developer/articles/tool/oneapi-standalone-components.html#trace

However, since you are running on an unsupported OS, I cannot help you if something is not working.

 
