Intel® oneAPI HPC Toolkit
Get help with building, analyzing, optimizing, and scaling high-performance computing (HPC) applications.

Compiling the MPI version of AMBER20 with oneAPI: mpirun -np 4 pmemd.MPI fails with a bus error

Chunwang
Beginner
1,468 Views

Hi, 

I'm trying to compile AMBER20 in a Docker image (Ubuntu 18.04.5 LTS, with the CUDA 11 toolkit) using the latest oneAPI toolkits (l_BaseKit_p_2021.2.0.2883_offline.sh, l_HPCKit_p_2021.2.0.2997_offline.sh).

Compile options:

cmake .. -DCMAKE_INSTALL_PREFIX=/usr/amber20 -DCOMPILER=INTEL -DFORCE_EXTERNAL_LIBS=mkl -DMPI=TRUE -DOPENMP=FALSE -DCUDA=FALSE -DBUILD_GUI=FALSE -DINSTALL_TESTS=FALSE -DDOWNLOAD_MINICONDA=TRUE -DCHECK_UPDATES=FALSE

 

pmemd.MPI compiles successfully, but it does not run properly (mpiexec -np 2 pmemd.MPI -i press.in -p force.parm7 -c press.rst7 -ref press.rst7 -O -o test.out).

The error is shown below. It seems to be caused by mpirun/mpiexec rather than by pmemd.MPI itself:

===================================================================================
= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= RANK 0 PID 22283 RUNNING AT beca86746e4d
= KILLED BY SIGNAL: 9 (Killed)
===================================================================================

===================================================================================
= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= RANK 1 PID 22284 RUNNING AT beca86746e4d
= KILLED BY SIGNAL: 7 (Bus error)
===================================================================================

 

By the way, the non-parallel build of pmemd runs properly. Its compile options are:

cmake .. -DCMAKE_INSTALL_PREFIX=/usr/amber20 -DCOMPILER=INTEL -DFORCE_EXTERNAL_LIBS=mkl -DMPI=FALSE -DOPENMP=FALSE -DCUDA=FALSE -DBUILD_GUI=FALSE -DINSTALL_TESTS=FALSE -DDOWNLOAD_MINICONDA=TRUE -DCHECK_UPDATES=FALSE

 

I am not sure what is wrong. Can oneAPI be used in a Docker image?

Can anyone give me an answer? I look forward to your reply. Thanks!

 

Chunwang
Beginner
1,455 Views

I have tried recompiling pmemd.MPI against the MPI library from MPICH:

tar zxf mpich-3.3.1.tar.gz
cd mpich-3.3.1
./configure --prefix=/usr/mpich
make -j 8
make install


Then I removed the Intel MPI paths from $PATH and $LD_LIBRARY_PATH and recompiled with the following options:

cmake .. -DCMAKE_INSTALL_PREFIX=/usr/amber20 -DCOMPILER=INTEL -DFORCE_EXTERNAL_LIBS=mkl -DMPI=TRUE -DOPENMP=FALSE -DCUDA=FALSE -DBUILD_GUI=FALSE -DINSTALL_TESTS=FALSE -DDOWNLOAD_MINICONDA=TRUE -DCHECK_UPDATES=FALSE -DCMAKE_PREFIX_PATH=/usr/mpich

The resulting pmemd.MPI works properly with mpirun/mpiexec.
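The "remove the Intel MPI paths" step can be sketched as a small bash helper. This is only an illustration: the function name strip_path_entries is hypothetical, and the "/oneapi/mpi/" substring assumes the default oneAPI install layout under /opt/intel/oneapi, which may differ in your image.

```shell
#!/usr/bin/env bash
# Hypothetical helper: drop every entry that contains a given substring
# from a colon-separated path list and print the remaining entries.
strip_path_entries() {
    local list="$1" needle="$2" result="" entry
    local -a entries
    IFS=':' read -r -a entries <<< "$list"
    for entry in "${entries[@]}"; do
        [ -z "$entry" ] && continue             # skip empty entries
        case "$entry" in
            *"$needle"*) ;;                      # contains the substring: drop it
            *) result="${result:+$result:}$entry" ;;
        esac
    done
    printf '%s\n' "$result"
}

# Possible use before re-running cmake (the substring is an assumption):
#   export PATH="$(strip_path_entries "$PATH" "/oneapi/mpi/")"
#   export LD_LIBRARY_PATH="$(strip_path_entries "${LD_LIBRARY_PATH:-}" "/oneapi/mpi/")"
```

Filtering by substring keeps the compiler and MKL entries from the same oneAPI installation in place, so only the MPI wrappers and libraries are hidden from CMake.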

ShivaniK_Intel
Moderator
1,426 Views

Hi,


Thanks for reaching out to us.


Could you please provide a link to the source code?


Also, please share the output produced with I_MPI_DEBUG=30 and -check_mpi by running the command below:


mpiexec -np 2 -check_mpi -env I_MPI_DEBUG=30 pmemd.MPI -i press.in -p force.parm7 -c press.rst7 -ref press.rst7 -O -o test.out


Thanks & Regards

Shivani


ShivaniK_Intel
Moderator
1,392 Views

Hi,


We have not heard back from you. Is your issue resolved? If not, please provide the details requested in my previous post.


Thanks & Regards,

Shivani


Chunwang
Beginner
1,365 Views

Hi, 

 

Sorry, I had not logged in recently to check for replies.

The issue has not been resolved yet. As a compromise, I am using oneAPI together with MPICH.

 

source code link:

 
 
The output of the following command:
mpiexec -np 2 -check_mpi -env I_MPI_DEBUG=30 pmemd.MPI -i press.in -p force.parm7 -c press.rst7 -ref press.rst7 -O -o test.out
 
----------------------------------------------------------------------------------------------------------------------------------

ERROR: ld.so: object 'libVTmc.so' from LD_PRELOAD cannot be preloaded (cannot open shared object file): ignored.
ERROR: ld.so: object 'libVTmc.so' from LD_PRELOAD cannot be preloaded (cannot open shared object file): ignored.
[0] MPI startup(): Intel(R) MPI Library, Version 2021.2 Build 20210302 (id: f4f7c92cd)
[0] MPI startup(): Copyright (C) 2003-2021 Intel Corporation. All rights reserved.
[0] MPI startup(): library kind: release
impi_mbind_local(): mbind(p=0x7f8a98be0000, size=1073741824) error=1 "Operation not permitted"

impi_mbind_local(): mbind(p=0x7f2615ac6000, size=1073741824) error=1 "Operation not permitted"


===================================================================================
= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= RANK 0 PID 40 RUNNING AT 68fa0e48402e
= KILLED BY SIGNAL: 7 (Bus error)
===================================================================================

===================================================================================
= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= RANK 1 PID 41 RUNNING AT 68fa0e48402e
= KILLED BY SIGNAL: 7 (Bus error)
===================================================================================
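A side note on the two ld.so lines at the top of this output: as far as I know, -check_mpi preloads a checking library (libVTmc.so by default) that belongs to Intel Trace Analyzer and Collector, so those messages suggest that library is not visible in this image. The sketch below only looks through LD_LIBRARY_PATH (the dynamic loader also consults other locations), and find_in_path_list is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the first directory in a colon-separated list
# that contains the given file; return 1 if no directory does.
find_in_path_list() {
    local file="$1" list="$2" dir
    local -a dirs
    IFS=':' read -r -a dirs <<< "$list"
    for dir in "${dirs[@]}"; do
        if [ -n "$dir" ] && [ -e "$dir/$file" ]; then
            printf '%s\n' "$dir"
            return 0
        fi
    done
    return 1
}

# Only LD_LIBRARY_PATH is inspected here; this is a rough check, not a full
# emulation of the loader's search order.
if find_in_path_list libVTmc.so "${LD_LIBRARY_PATH:-}" > /dev/null; then
    echo "libVTmc.so found on LD_LIBRARY_PATH"
else
    echo "libVTmc.so not found on LD_LIBRARY_PATH"
fi
```

The preload message is marked "ignored" by ld.so, so it is probably unrelated to the bus error itself; the run simply proceeds without the checking library.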

Klaus-Dieter_O_Intel
1,284 Views
Chunwang
Beginner
1,253 Views

Hi 

Thanks for your reply! I tried what you suggested, but it does not seem to work.

I started the container with the following command:

docker run --rm --security-opt seccomp=unconfined -it amber20_intel:latest bash

And then executed the following command:

mpiexec -np 2 -check_mpi -env I_MPI_DEBUG=30 pmemd.MPI -i press.in -p force.parm7 -c press.rst7 -ref press.rst7 -O -o test.out

I just got the same error:

ERROR: ld.so: object 'libVTmc.so' from LD_PRELOAD cannot be preloaded (cannot open shared object file): ignored.
ERROR: ld.so: object 'libVTmc.so' from LD_PRELOAD cannot be preloaded (cannot open shared object file): ignored.
[0] MPI startup(): Intel(R) MPI Library, Version 2021.2  Build 20210302 (id: f4f7c92cd)
[0] MPI startup(): Copyright (C) 2003-2021 Intel Corporation.  All rights reserved.
[0] MPI startup(): library kind: release
impi_mbind_local(): mbind(p=0x7fa7dff0e000, size=1073741824) error=1 "Operation not permitted"

impi_mbind_local(): mbind(p=0x7f1b316bb000, size=1073741824) error=1 "Operation not permitted"


===================================================================================
=   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
=   RANK 0 PID 49 RUNNING AT 7ecc1fb746c4
=   KILLED BY SIGNAL: 7 (Bus error)
===================================================================================

===================================================================================
=   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
=   RANK 1 PID 50 RUNNING AT 7ecc1fb746c4
=   KILLED BY SIGNAL: 9 (Killed)
===================================================================================

 

Klaus-Dieter_O_Intel
553 Views

Presumably the issue is related to docker limitations on /dev/shm. Please try the solution described at https://www.intel.com/content/www/us/en/develop/documentation/mpi-developer-guide-linux/top/troubleshooting/problem-mpi-limitation-for-docker.html
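A quick way to test the /dev/shm hypothesis from inside the container is sketched below. It rests on assumptions: that each rank needs roughly the 1 GiB segment reported in the mbind lines of the log (size=1073741824), and that two ranks are started. Docker gives containers a 64 MB /dev/shm by default, and docker run accepts --shm-size to raise it; the 4g value in the hint is an arbitrary illustrative choice. The helper names are hypothetical.

```shell
#!/usr/bin/env bash
# Total size in KiB of the filesystem backing a path
# (second column of POSIX-format df output); prints nothing if df fails.
fs_size_kb() {
    df -Pk "$1" 2>/dev/null | awk 'NR==2 {print $2}'
}

# Succeeds when the available size ($1, KiB) covers the requirement ($2, KiB).
shm_is_sufficient() {
    [ "$1" -ge "$2" ]
}

# Assumption: two ranks, each mapping a 1 GiB segment as seen in the log.
required_kb=$((2 * 1073741824 / 1024))

shm_kb=$(fs_size_kb /dev/shm)
if [ -z "$shm_kb" ]; then
    echo "/dev/shm not found"
elif shm_is_sufficient "$shm_kb" "$required_kb"; then
    echo "/dev/shm: ${shm_kb} KiB, looks large enough"
else
    echo "/dev/shm: ${shm_kb} KiB, smaller than the assumed ${required_kb} KiB"
    # The size below is an assumption; adjust it to the number of ranks.
    echo "try: docker run --rm --shm-size=4g -it amber20_intel:latest bash"
fi
```

If the reported size is around 65536 KiB, the container is still using the Docker default, which would be consistent with a bus error when a rank touches shared memory beyond that limit.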


ShivaniK_Intel
Moderator
515 Views

Hi,


We have not heard back from you. This thread will no longer be monitored by Intel.

If you need further assistance please post a new question.


Thanks & Regards

Shivani

