Chunwang
Beginner

Compiling the MPI version of AMBER20 with oneAPI: mpirun -np 4 pmemd.MPI fails with a bus error

Hi, 

I'm trying to compile AMBER20 in a Docker image (Ubuntu 18.04.5 LTS, with the CUDA 11 toolkit) using the latest oneAPI toolkits (l_BaseKit_p_2021.2.0.2883_offline.sh, l_HPCKit_p_2021.2.0.2997_offline.sh).

Compile options:

cmake .. -DCMAKE_INSTALL_PREFIX=/usr/amber20 -DCOMPILER=INTEL -DFORCE_EXTERNAL_LIBS=mkl -DMPI=TRUE -DOPENMP=FALSE -DCUDA=FALSE -DBUILD_GUI=FALSE -DINSTALL_TESTS=FALSE -DDOWNLOAD_MINICONDA=TRUE -DCHECK_UPDATES=FALSE
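Since the configure step above selects -DCOMPILER=INTEL with -DMPI=TRUE, it may be worth confirming that the Intel compilers and an MPI launcher are visible in the build shell before running cmake. The sketch below is only an illustration: the tool names (icc, icpc, ifort, mpiexec) are assumptions about what this build looks for, and the helper missing_tools is hypothetical. In a default oneAPI install these tools are usually put on PATH by sourcing /opt/intel/oneapi/setvars.sh.

```shell
#!/bin/sh
# missing_tools TOOL... -> prints each tool that is NOT found on PATH, one per line.
# (Hypothetical helper, not part of AMBER or oneAPI.)
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Assumed tool names: classic Intel compilers plus an MPI launcher.
missing=$(missing_tools icc icpc ifort mpiexec)
if [ -n "$missing" ]; then
    echo "Not on PATH:" $missing
else
    echo "All expected tools found"
fi
```

If anything is reported as missing, the environment script has probably not been sourced in the current shell.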

 

pmemd.MPI compiles successfully, but it does not run properly (mpiexec -np 2 pmemd.MPI -i press.in -p force.parm7 -c press.rst7 -ref press.rst7 -O -o test.out).

The error is shown below. It seems to be caused by mpirun/mpiexec rather than by pmemd.MPI itself:

===================================================================================
= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= RANK 0 PID 22283 RUNNING AT beca86746e4d
= KILLED BY SIGNAL: 9 (Killed)
===================================================================================

===================================================================================
= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= RANK 1 PID 22284 RUNNING AT beca86746e4d
= KILLED BY SIGNAL: 7 (Bus error)
===================================================================================

 

By the way, the serial (non-parallel) build of pmemd runs properly. Its compile options are:

cmake .. -DCMAKE_INSTALL_PREFIX=/usr/amber20 -DCOMPILER=INTEL -DFORCE_EXTERNAL_LIBS=mkl -DMPI=FALSE -DOPENMP=FALSE -DCUDA=FALSE -DBUILD_GUI=FALSE -DINSTALL_TESTS=FALSE -DDOWNLOAD_MINICONDA=TRUE -DCHECK_UPDATES=FALSE

 

I am not sure what is wrong. Can oneAPI be used in a Docker image?

Can anyone help? I look forward to your reply. Thanks!

 

5 Replies
Chunwang
Beginner

I have tried recompiling pmemd.MPI using the MPI library from MPICH:

tar zxf mpich-3.3.1.tar.gz
cd mpich-3.3.1
./configure --prefix=/usr/mpich
make -j 8
make install


Then I removed the Intel MPI paths from $PATH and $LD_LIBRARY_PATH and recompiled with the following options:

cmake .. -DCMAKE_INSTALL_PREFIX=/usr/amber20 -DCOMPILER=INTEL -DFORCE_EXTERNAL_LIBS=mkl -DMPI=TRUE -DOPENMP=FALSE -DCUDA=FALSE -DBUILD_GUI=FALSE -DINSTALL_TESTS=FALSE -DDOWNLOAD_MINICONDA=TRUE -DCHECK_UPDATES=FALSE -DCMAKE_PREFIX_PATH=/usr/mpich

The resulting pmemd.MPI works properly with mpirun/mpiexec.
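Because this workaround depends on which MPI library is picked up at run time, one way to double-check is to look at what the dynamic linker resolves for the binary. The following is a minimal sketch, assuming pmemd.MPI is on PATH and dynamically linked. The helper name mpi_libs is hypothetical, and the library name pattern libmpi is an assumption that should match both the Intel MPI and MPICH runtime libraries.

```shell
#!/bin/sh
# mpi_libs: read `ldd` output on stdin and keep only lines that name an MPI library.
# (Hypothetical helper; the 'libmpi' pattern is an assumption.)
mpi_libs() {
    grep -i 'libmpi'
}

if command -v pmemd.MPI >/dev/null 2>&1; then
    # The resolved path on the right-hand side shows which installation is used.
    ldd "$(command -v pmemd.MPI)" | mpi_libs || echo "no libmpi dependency found"
else
    echo "pmemd.MPI is not on PATH"
fi
```

With the MPICH build described above, the resolved paths would be expected to sit under /usr/mpich rather than under the oneAPI installation directory.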

ShivaniK_Intel
Moderator

Hi,


Thanks for reaching out to us.


Could you please provide us with a link to the source code?


Also, please provide the output with I_MPI_DEBUG=30 and -check_mpi enabled by running the command below:


mpiexec -np 2 -check_mpi -env I_MPI_DEBUG=30 pmemd.MPI -i press.in -p force.parm7 -c press.rst7 -ref press.rst7 -O -o test.out


Thanks & Regards

Shivani


ShivaniK_Intel
Moderator

Hi,


We haven't heard back from you. Is your issue resolved? If not, please provide the details requested in my previous post.


Thanks & Regards,

Shivani


Chunwang
Beginner

Hi, 

 

Sorry, I forgot to log in and check for replies.

This issue has not been resolved yet. For now I am using oneAPI + MPICH as a workaround.

 

source code link:

 
 
The output of the following command:
mpiexec -np 2 -check_mpi -env I_MPI_DEBUG=30 pmemd.MPI -i press.in -p force.parm7 -c press.rst7 -ref press.rst7 -O -o test.out
 
----------------------------------------------------------------------------------------------------------------------------------

ERROR: ld.so: object 'libVTmc.so' from LD_PRELOAD cannot be preloaded (cannot open shared object file): ignored.
ERROR: ld.so: object 'libVTmc.so' from LD_PRELOAD cannot be preloaded (cannot open shared object file): ignored.
[0] MPI startup(): Intel(R) MPI Library, Version 2021.2 Build 20210302 (id: f4f7c92cd)
[0] MPI startup(): Copyright (C) 2003-2021 Intel Corporation. All rights reserved.
[0] MPI startup(): library kind: release
impi_mbind_local(): mbind(p=0x7f8a98be0000, size=1073741824) error=1 "Operation not permitted"

impi_mbind_local(): mbind(p=0x7f2615ac6000, size=1073741824) error=1 "Operation not permitted"


===================================================================================
= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= RANK 0 PID 40 RUNNING AT 68fa0e48402e
= KILLED BY SIGNAL: 7 (Bus error)
===================================================================================

===================================================================================
= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= RANK 1 PID 41 RUNNING AT 68fa0e48402e
= KILLED BY SIGNAL: 7 (Bus error)
===================================================================================
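One possible reading of this log, offered as an assumption and not a confirmed diagnosis: each rank tries to set up a 1 GiB region (size=1073741824 in the mbind() lines), and a bus error can occur when such a region is backed by a shared-memory filesystem that is too small. Docker containers get a 64 MB /dev/shm by default. The sketch below compares the space available on /dev/shm with the segment size from the log. The helper shm_verdict is hypothetical, and the idea that these regions live in /dev/shm is itself an assumption.

```shell
#!/bin/sh
# Size of one region, copied from the mbind() lines in the log.
SEGMENT_BYTES=1073741824

# shm_verdict NEEDED_BYTES AVAILABLE_BYTES -> prints "ok" or "too small".
# (Hypothetical helper for illustration.)
shm_verdict() {
    if [ "$2" -ge "$1" ]; then
        echo "ok"
    else
        echo "too small"
    fi
}

# df -Pk reports available space in 1024-byte blocks (4th column, 2nd line).
if [ -d /dev/shm ]; then
    avail_kb=$(df -Pk /dev/shm | awk 'NR==2 {print $4}')
    avail_kb=${avail_kb:-0}
    echo "/dev/shm for one segment: $(shm_verdict "$SEGMENT_BYTES" $((avail_kb * 1024)))"
fi
```

If /dev/shm turns out to be too small, Docker's --shm-size option (for example, docker run --shm-size=4g ...) enlarges it, and --cap-add=SYS_NICE is one way to permit mbind() inside a container. Whether either change fixes this particular crash has not been verified in this thread.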
