Intel® oneAPI HPC Toolkit

Compiling the MPI version of AMBER20 with oneAPI: mpirun -np 4 pmemd.MPI fails with a bus error

Chunwang
Beginner

Hi, 

I'm trying to compile AMBER20 in a Docker image (Ubuntu 18.04.5 LTS, with the CUDA 11 toolkit) using the latest oneAPI toolkits (l_BaseKit_p_2021.2.0.2883_offline.sh, l_HPCKit_p_2021.2.0.2997_offline.sh).

Compile options:

cmake .. -DCMAKE_INSTALL_PREFIX=/usr/amber20 -DCOMPILER=INTEL -DFORCE_EXTERNAL_LIBS=mkl -DMPI=TRUE -DOPENMP=FALSE -DCUDA=FALSE -DBUILD_GUI=FALSE -DINSTALL_TESTS=FALSE -DDOWNLOAD_MINICONDA=TRUE -DCHECK_UPDATES=FALSE

 

pmemd.MPI compiles successfully, but it does not run properly (mpiexec -np 2 pmemd.MPI -i press.in -p force.parm7 -c press.rst7 -ref press.rst7 -O -o test.out).

The error is shown below. It seems to be caused by mpirun/mpiexec rather than by pmemd.MPI itself:

===================================================================================
= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= RANK 0 PID 22283 RUNNING AT beca86746e4d
= KILLED BY SIGNAL: 9 (Killed)
===================================================================================

===================================================================================
= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= RANK 1 PID 22284 RUNNING AT beca86746e4d
= KILLED BY SIGNAL: 7 (Bus error)
===================================================================================

 

By the way, the non-parallel (serial) build of pmemd runs properly. Its compile options are:

cmake .. -DCMAKE_INSTALL_PREFIX=/usr/amber20 -DCOMPILER=INTEL -DFORCE_EXTERNAL_LIBS=mkl -DMPI=FALSE -DOPENMP=FALSE -DCUDA=FALSE -DBUILD_GUI=FALSE -DINSTALL_TESTS=FALSE -DDOWNLOAD_MINICONDA=TRUE -DCHECK_UPDATES=FALSE

 

I am not sure what is wrong. Can oneAPI be used in a Docker image?

Can anyone give me an answer? I look forward to your reply. Thanks!

 

Chunwang
Beginner

I have tried recompiling pmemd.MPI using the MPI library from MPICH:

tar zxf mpich-3.3.1.tar.gz
cd mpich-3.3.1
./configure --prefix=/usr/mpich
make -j 8
make install


Then I removed the Intel MPI paths from $PATH and $LD_LIBRARY_PATH and recompiled with the following options:

cmake .. -DCMAKE_INSTALL_PREFIX=/usr/amber20 -DCOMPILER=INTEL -DFORCE_EXTERNAL_LIBS=mkl -DMPI=TRUE -DOPENMP=FALSE -DCUDA=FALSE -DBUILD_GUI=FALSE -DINSTALL_TESTS=FALSE -DDOWNLOAD_MINICONDA=TRUE -DCHECK_UPDATES=FALSE -DCMAKE_PREFIX_PATH=/usr/mpich

The resulting pmemd.MPI works properly with mpirun/mpiexec.
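
When switching between Intel MPI and MPICH like this, it can help to confirm which launcher is first on PATH and which MPI shared libraries the binary actually resolves. A minimal sketch, assuming a Linux system with ldd available; the helper name check_mpi_linkage and the binary path in the usage comment are illustrative and not taken from this thread:

```shell
# Report the MPI launcher on PATH and the MPI libraries a binary links against.
# Usage (path is an assumption): check_mpi_linkage /usr/amber20/bin/pmemd.MPI
check_mpi_linkage() {
    binary="$1"
    # Show which mpiexec would be picked up by the shell.
    echo "launcher: $(command -v mpiexec || echo 'not found')"
    if [ ! -x "$binary" ]; then
        echo "binary not found: $binary"
        return 1
    fi
    # ldd lists the resolved shared libraries; keep only the MPI ones.
    ldd "$binary" | grep -i 'libmpi' || echo "no MPI library linked"
}
```

If the launcher and the libraries come from different MPI installations, that mismatch alone can make a parallel run fail, so it is worth ruling out before looking further.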

ShivaniK_Intel
Moderator

Hi,


Thanks for reaching out to us.


Could you please provide a link to the source code?


Also, please provide the output produced with I_MPI_DEBUG=30 and -check_mpi by running the command below:


mpiexec -np 2 -check_mpi -env I_MPI_DEBUG=30 pmemd.MPI -i press.in -p force.parm7 -c press.rst7 -ref press.rst7 -O -o test.out


Thanks & Regards

Shivani


ShivaniK_Intel
Moderator

Hi,


As we haven't heard back from you: has your issue been resolved? If not, please provide the details requested in my previous post.


Thanks & Regards,

Shivani


Chunwang
Beginner

Hi, 

 

Sorry, I forgot to log in and check for replies recently.

This issue has not been resolved yet. As a compromise, I am just using oneAPI with MPICH.

 

source code link:

 
 
The output of the following command:
mpiexec -np 2 -check_mpi -env I_MPI_DEBUG=30 pmemd.MPI -i press.in -p force.parm7 -c press.rst7 -ref press.rst7 -O -o test.out
 
----------------------------------------------------------------------------------------------------------------------------------

ERROR: ld.so: object 'libVTmc.so' from LD_PRELOAD cannot be preloaded (cannot open shared object file): ignored.
ERROR: ld.so: object 'libVTmc.so' from LD_PRELOAD cannot be preloaded (cannot open shared object file): ignored.
[0] MPI startup(): Intel(R) MPI Library, Version 2021.2 Build 20210302 (id: f4f7c92cd)
[0] MPI startup(): Copyright (C) 2003-2021 Intel Corporation. All rights reserved.
[0] MPI startup(): library kind: release
impi_mbind_local(): mbind(p=0x7f8a98be0000, size=1073741824) error=1 "Operation not permitted"

impi_mbind_local(): mbind(p=0x7f2615ac6000, size=1073741824) error=1 "Operation not permitted"


===================================================================================
= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= RANK 0 PID 40 RUNNING AT 68fa0e48402e
= KILLED BY SIGNAL: 7 (Bus error)
===================================================================================

===================================================================================
= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= RANK 1 PID 41 RUNNING AT 68fa0e48402e
= KILLED BY SIGNAL: 7 (Bus error)
===================================================================================
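
Both mbind() lines report size=1073741824, which is exactly 1 GiB (1024^3 bytes). Docker gives containers a 64 MB /dev/shm by default, and a bus error is a typical symptom of touching shared memory that a too-small tmpfs cannot back. Whether that is the cause here is not confirmed anywhere in this thread; the sketch below only compares the free space on a mount with a requested size. The function name shm_fits is hypothetical:

```shell
# Compare free space on a mount point with a requested allocation size.
# Usage: shm_fits /dev/shm 1073741824   (size= value from the mbind() lines)
shm_fits() {
    dir="$1"
    requested_bytes="$2"
    # df -Pk prints POSIX-format output in 1 KiB blocks; field 4 is "Available".
    avail_kib=$(df -Pk "$dir" | awk 'NR==2 {print $4}')
    if [ -z "$avail_kib" ]; then
        echo "unknown"
        return 2
    fi
    avail_bytes=$((avail_kib * 1024))
    echo "requested: $requested_bytes bytes, available in $dir: $avail_bytes bytes"
    if [ "$avail_bytes" -lt "$requested_bytes" ]; then
        echo "shm_too_small"
    else
        echo "shm_ok"
    fi
}
```

Run inside the container, a result of shm_too_small for /dev/shm would make the shared-memory size worth investigating; shm_ok would point elsewhere.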

Klaus-Dieter_O_Intel
Chunwang
Beginner

Hi 

Thanks for your reply! I tried what you suggested, but it does not seem to work.

I started the Docker container with the following command:

docker run --rm --security-opt seccomp=unconfined -it amber20_intel:latest bash

And then executed the following command:

mpiexec -np 2 -check_mpi -env I_MPI_DEBUG=30 pmemd.MPI -i press.in -p force.parm7 -c press.rst7 -ref press.rst7 -O -o test.out

I got the same error:

ERROR: ld.so: object 'libVTmc.so' from LD_PRELOAD cannot be preloaded (cannot open shared object file): ignored.
ERROR: ld.so: object 'libVTmc.so' from LD_PRELOAD cannot be preloaded (cannot open shared object file): ignored.
[0] MPI startup(): Intel(R) MPI Library, Version 2021.2  Build 20210302 (id: f4f7c92cd)
[0] MPI startup(): Copyright (C) 2003-2021 Intel Corporation.  All rights reserved.
[0] MPI startup(): library kind: release
impi_mbind_local(): mbind(p=0x7fa7dff0e000, size=1073741824) error=1 "Operation not permitted"

impi_mbind_local(): mbind(p=0x7f1b316bb000, size=1073741824) error=1 "Operation not permitted"


===================================================================================
=   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
=   RANK 0 PID 49 RUNNING AT 7ecc1fb746c4
=   KILLED BY SIGNAL: 7 (Bus error)
===================================================================================

===================================================================================
=   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
=   RANK 1 PID 50 RUNNING AT 7ecc1fb746c4
=   KILLED BY SIGNAL: 9 (Killed)
===================================================================================
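
Since --security-opt seccomp=unconfined alone did not change the result, two further docker run options may be worth trying. Both are untested assumptions, not a fix confirmed in this thread: --shm-size enlarges /dev/shm beyond the 64 MB default, and --cap-add=SYS_NICE grants the capability that the Docker documentation lists as gating mbind. The sketch only assembles and prints the command so it can be reviewed before running; the 4g value is an arbitrary choice above the 1 GiB regions in the log:

```shell
image="amber20_intel:latest"   # image name taken from the post above
shm_size="4g"                  # assumption: larger than the 1 GiB regions in the log

# Build the command as a string and print it for review instead of running it.
docker_cmd="docker run --rm -it --shm-size=${shm_size} --cap-add=SYS_NICE ${image} bash"
echo "$docker_cmd"
```

If the mbind() "Operation not permitted" lines disappear but the bus error remains, or the other way round, that would at least show which of the two options has an effect.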

 
