Intel® oneAPI HPC Toolkit
Get help with building, analyzing, optimizing, and scaling high-performance computing (HPC) applications.

MPI error while running VASP

ManishVCU
Beginner
702 Views

I installed HPCKit_p_2022.3.1.16997 and onemkl_p_2022.2.1.16993 in my account and compiled VASP-6 after setting up the environment with

source /xxxxxxxxxx/setvars.sh

Note: We have a license for VASP-6.

When running on the cluster, I often encounter the following error when submitting jobs with a higher processor count (say, 16 processors):

 

Abort(1615247) on node 0 (rank 0 in comm 0): Fatal error in PMPI_Init: Other MPI error, error stack:
MPIR_Init_thread(176)........:
MPID_Init(1538)..............:
MPIDI_OFI_mpi_init_hook(1592):
MPIDU_bc_table_create(320)...: Missing hostname or invalid host/port description in business card
Abort(1615247) on node 0 (rank 0 in comm 0): Fatal error in PMPI_Init: Other MPI error, error stack:
MPIR_Init_thread(176)........:
MPID_Init(1538)..............:
MPIDI_OFI_mpi_init_hook(1592):
MPIDU_bc_table_create(320)...: Missing hostname or invalid host/port description in business card

 

The job script I am using is

#!/bin/bash
#$ -cwd
#$-S /bin/bash
#
#$ -pe ompi 16
cd ./
source /home1/mohantamk/intel/oneapi/setvars.sh

/home1/mohantamk/intel/oneapi/mpi/latest/bin/mpirun -np 16 /home1/mohantamk/vasp6_myintel/vasp_6.3.2/vasp.6.3.2/bin/vasp_std

 

I wonder what the source of this error is. Any help will be appreciated.
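
Since the error text mentions a missing hostname, a few checks could be run inside the same job before the mpirun line. This is only a sketch, not a confirmed fix: it checks whether the node's own hostname resolves, which mpirun is first on PATH, and which MPI library the binary is linked against. VASP_BIN is a hypothetical variable introduced here; point it at the vasp_std binary being launched.

```shell
#!/bin/bash
# Diagnostic sketch; run on a compute node or at the top of the job script.
# VASP_BIN is a placeholder (assumption), not part of the original script.
VASP_BIN="${VASP_BIN:-./vasp_std}"

# 1. Does this node's own hostname resolve?
node="$(hostname 2>/dev/null || uname -n)"
if command -v getent >/dev/null 2>&1 && getent hosts "$node" >/dev/null 2>&1; then
    echo "hostname $node resolves"
else
    echo "hostname $node does not resolve via getent (or getent is unavailable)"
fi

# 2. Which launcher is first on PATH after sourcing setvars.sh?
command -v mpirun || echo "mpirun not found on PATH"

# 3. Which MPI library is the binary linked against?
if [ -x "$VASP_BIN" ]; then
    ldd "$VASP_BIN" | grep -i mpi || echo "no MPI library found in ldd output"
else
    echo "binary not found: $VASP_BIN"
fi
```

If the launcher and the linked library come from different MPI installations, that mismatch is worth ruling out before looking further.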

7 Replies
ShivaniK_Intel
Moderator
657 Views

Hi,


Thanks for posting in the Intel forums.


Could you please provide us with the OS and cluster details?


Thanks & Regards

Shivani


ManishVCU
Beginner
651 Views

Hi Shivani,

Please note that I am a user of the TEAL cluster at VCU (Virginia Commonwealth University). The HPC admins have been trying for 30 days and have not been able to resolve this issue.

VCU has licensed Intel oneAPI 2021 and installed it on the TEAL cluster, but even when I compiled with oneAPI 2021 I got the same error.

The job submission manager is "GridEngine".

 

I am attaching details of the TEAL cluster of Virginia Commonwealth University.

I am eagerly awaiting your reply, and if possible I would like you to look into this matter through remote access.

 

Command: cat /etc/os-release

NAME="CentOS Linux"

VERSION="7 (Core)"

ID="centos"

ID_LIKE="rhel fedora"

VERSION_ID="7"

PRETTY_NAME="CentOS Linux 7 (Core)"

ANSI_COLOR="0;31"

CPE_NAME="cpe:/o:centos:centos:7"

HOME_URL="https://www.centos.org/"

BUG_REPORT_URL="https://bugs.centos.org/"

 

CENTOS_MANTISBT_PROJECT="CentOS-7"

CENTOS_MANTISBT_PROJECT_VERSION="7"

REDHAT_SUPPORT_PRODUCT="centos"

REDHAT_SUPPORT_PRODUCT_VERSION="7"

 

 

Command: lsb_release -a

LSB Version:    :core-4.1-amd64:core-4.1-noarch:cxx-4.1-amd64:cxx-4.1-noarch:desktop-4.1-amd64:desktop-4.1-noarch:languages-4.1-amd64:languages-4.1-noarch:printing-4.1-amd64:printing-4.1-noarch

Distributor ID: CentOS

Description:    CentOS Linux release 7.9.2009 (Core)

Release:        7.9.2009

Codename:       Core

 

 

 

Command: hostnamectl

Static hostname: teal.chpc.vcu.edu

Icon name: computer-desktop

Chassis: desktop

Machine ID: 4ac7b5d1047f445cb7fff9b79d5aa702

Boot ID: 39ebd274dabd4f01962f8a6d08e9c0e4

Operating System: CentOS Linux 7 (Core)

CPE OS Name: cpe:/o:centos:centos:7

Kernel: Linux 3.10.0-1160.49.1.el7.x86_64

Architecture: x86-64

 

 

GridEngine Job script:

#!/bin/bash

#$ -cwd

#$-S /bin/bash

#

#$ -pe ompi 16

source /home1/mohantamk/intel/oneapi/setvars.sh

cd ./

echo "Started at: "

date

export NSLOTS=16

 

export MDRUN=/home1/mohantamk/vasp6_myintel/vasp_6.3.2/vasp.6.3.2/bin/vasp_std

 

export MPIRUN=/home1/mohantamk/intel/oneapi/mpi/latest/bin/mpirun

echo $NSLOTS

$MPIRUN  -np $NSLOTS $MDRUN

 

Thanks and Regards

Manish

Postdoc Research Associate

VCU
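
One detail in the script above: `export NSLOTS=16` overwrites the value that Grid Engine itself sets for a parallel job. A minimal sketch of reading the scheduler-provided values instead is shown below. NP and the slot_count helper are names introduced here for illustration, and the launch line is left commented so the sketch runs anywhere.

```shell
#!/bin/bash
#$ -cwd
#$ -S /bin/bash
#$ -pe ompi 16

# Helper (hypothetical name): slot count granted by Grid Engine, or 1 outside a job.
slot_count() { echo "${NSLOTS:-1}"; }

NP="$(slot_count)"
echo "slots granted: $NP"

# PE_HOSTFILE is set by Grid Engine for parallel jobs; each line starts with
# a host name followed by the number of slots granted on that host.
if [ -n "${PE_HOSTFILE:-}" ] && [ -r "$PE_HOSTFILE" ]; then
    echo "hosts assigned:"
    awk '{print $1, $2}' "$PE_HOSTFILE"
fi

# Launch line as in the original script, commented out in this sketch:
# "$MPIRUN" -np "$NP" "$MDRUN"
```

Printing the host list shows whether the 16 slots landed on one node or were spread across several, which is useful context for an error that appears only at higher processor counts.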

 

ShivaniK_Intel
Moderator
611 Views


Hi,


Could you please make a change in the GridEngine Job script and try running your application and let us know the output?


Use


#$ -pe impi 16


instead of


#$ -pe ompi 16


Thanks & Regards

Shivani


ManishVCU
Beginner
598 Views

Hi Shivani,

It says: Unable to run job: job rejected: the requested parallel environment "impi" does not exist.

 

 

Thanks and Regards

Manish
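
The rejection above means no parallel environment named "impi" is defined on this cluster. As a sketch, the environments that do exist can be listed with qconf on a submit host. The pe_exists helper is a hypothetical name introduced here, and the commands are guarded so the script does nothing harmful where qconf is absent.

```shell
#!/bin/bash
# Helper (hypothetical name): succeeds if the PE name given as $1
# appears as a whole line in the list read from standard input.
pe_exists() { grep -qxF -- "$1"; }

if command -v qconf >/dev/null 2>&1; then
    qconf -spl        # names of all defined parallel environments
    qconf -sp ompi    # settings of the "ompi" PE used in the job script
    if qconf -spl | pe_exists impi; then
        echo "PE impi is defined"
    else
        echo "PE impi is not defined"
    fi
else
    echo "qconf not found; run this on a Grid Engine submit host"
fi
```

In the `qconf -sp` output, fields such as allocation_rule and control_slaves describe how slots are distributed over hosts and whether the scheduler controls remote task startup, which the cluster admins can compare against what Intel MPI expects.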

ShivaniK_Intel
Moderator
544 Views

Hi,



>>>"I want you to note that I am a user of the TEAL cluster, at VCU (Virginia Commonwealth University). The HPC admin are trying for 30 days and not able to resolve this issue".


Your university could raise a priority support request directly so that we can work on your issue through remote access. They can use the link below.


https://www.intel.com/content/www/us/en/developer/get-help/priority-support.html


Could you please let us know whether you are able to run IMB benchmarks in your cluster or not?


Thanks & Regards

Shivani
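
For the IMB question above, a minimal test run could look like the sketch below. It assumes setvars.sh has already been sourced so that mpirun and the IMB-MPI1 binary shipped with Intel MPI are on PATH. NP is a name introduced here, and the I_MPI_DEBUG setting is optional.

```shell
#!/bin/bash
# Sketch of a small IMB run inside a Grid Engine job or an interactive session.
NP="${NSLOTS:-2}"        # use the granted slots, or 2 ranks outside a job
export I_MPI_DEBUG=5     # ask Intel MPI for extra startup diagnostics

if command -v mpirun >/dev/null 2>&1 && command -v IMB-MPI1 >/dev/null 2>&1; then
    mpirun -np "$NP" IMB-MPI1 PingPong
else
    echo "mpirun or IMB-MPI1 not on PATH; source setvars.sh first"
fi
```

If PingPong fails in PMPI_Init with the same error stack, the problem lies in the MPI and cluster setup rather than in the VASP build.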



ShivaniK_Intel
Moderator
502 Views

Hi,


As we have not heard back from you, could you please provide the details requested in my previous post so that we can investigate your issue further?


Thanks & Regards

Shivani


ShivaniK_Intel
Moderator
120 Views

Hi,


We have not heard back from you. This thread will no longer be monitored by Intel. If you need further assistance please post a new question.


Thanks & Regards

Shivani

