Intel® MPI Library

MPI crash in physics code: Fatal error in PMPI_Allgatherv

Paul_Fons
Beginner

I am running a large program called the Vienna Ab initio Simulation Package (VASP) under Intel Parallel Studio and Intel MPI. I compiled the program without problems, and it apparently runs correctly on all of the examples and produces correct results when run under MPI. However, a slightly larger job, which is what I bought the program for, repeatedly crashes with a PMPI_Allgatherv error. Since no other users of this fairly widely used program report similar errors, I am concerned that it may be an Intel MPI bug. Another possibility is that it is a linking error on my part, so I have enclosed the Makefile I used for building VASP for reference. Any help would be greatly appreciated.

Best wishes,

                          Paul Fons  

The crash traceback is below:

mpirun -np 16 vasp_gamma 
running on 16 total cores 
distrk: each k-point on 16 cores, 1 groups 
distr: one band on 8 cores, 2 groups 
using from now: INCAR 
vasp.5.3.3 18Dez12 (build May 13 2013 15:17:23) gamma-only 

POSCAR found : 3 types and 108 ions 
scaLAPACK will be used 
LDA part: xc-table for Pade appr. of Perdew

POSCAR, INCAR and KPOINTS ok, starting setup 
WARNING: small aliasing (wrap around) errors must be expected 
FFT: planning ... 
WAVECAR not read 
WARNING: random wavefunctions but no delay for mixing, default for NELMDL 
prediction of wavefunctions initialized - no I/O 
entering main loop 
N E dE d eps ncg rms rms(c) 
RMM: 1 0.438327604116E+04 0.43833E+04 -0.94880E+04 346 0.506E+02 
RMM: 2 0.127584402674E+04 -0.31074E+04 -0.31882E+04 346 0.152E+02 
RMM: 3 0.299986400145E+03 -0.97586E+03 -0.12137E+04 346 0.925E+01 
RMM: 4 -0.606823381784E+02 -0.36067E+03 -0.41045E+03 346 0.659E+01 
RMM: 5 -0.228060471875E+03 -0.16738E+03 -0.15696E+03 346 0.362E+01 
RMM: 6 -0.308261614897E+03 -0.80201E+02 -0.68222E+02 346 0.255E+01 
RMM: 7 -0.345624028023E+03 -0.37362E+02 -0.34195E+02 346 0.155E+01 
RMM: 8 -0.366489442688E+03 -0.20865E+02 -0.18421E+02 346 0.116E+01 
RMM: 9 -0.391069482656E+03 -0.24580E+02 -0.24063E+02 842 0.755E+00 
RMM: 10 -0.392923752147E+03 -0.18543E+01 -0.32951E+01 884 0.200E+00 
RMM: 11 -0.393354485197E+03 -0.43073E+00 -0.41138E+00 833 0.520E-01 
RMM: 12 -0.393407181438E+03 -0.52696E-01 -0.49653E-01 802 0.148E-01 0.950E+00 
RMM: 13 -0.390937720994E+03 0.24695E+01 -0.45304E+00 697 0.142E+00 0.601E+00 
RMM: 14 -0.390311322970E+03 0.62640E+00 -0.37173E+00 716 0.136E+00 0.276E+00 
RMM: 15 -0.390294966246E+03 0.16357E-01 -0.96176E-01 783 0.688E-01 0.135E+00 
RMM: 16 -0.390280817461E+03 0.14149E-01 -0.18087E-01 700 0.375E-01 0.504E-01 
RMM: 17 -0.390284202366E+03 -0.33849E-02 -0.26515E-02 722 0.159E-01 0.268E-01 
RMM: 18 -0.390287549580E+03 -0.33472E-02 -0.92233E-03 724 0.899E-02 0.158E-01 
RMM: 19 -0.390289808941E+03 -0.22594E-02 -0.63338E-03 696 0.761E-02 0.924E-02 
RMM: 20 -0.390290458142E+03 -0.64920E-03 -0.14378E-03 695 0.422E-02 0.458E-02 
RMM: 21 -0.390290916003E+03 -0.45786E-03 -0.10433E-03 599 0.298E-02 0.274E-02 
RMM: 22 -0.390290971931E+03 -0.55928E-04 -0.23374E-04 438 0.159E-02 
1 T= 600. E= -.38199243E+03 F= -.39029097E+03 E0= -.39022511E+03 EK= 0.82985E+01 SP= 0.00E+00 SK= 0.00E+00 
bond charge predicted 
N E dE d eps ncg rms rms(c) 
RMM: 1 -0.390242713772E+03 0.48202E-01 -0.45969E+00 692 0.226E+00 0.275E-01 
RMM: 2 -0.390240986761E+03 0.17270E-02 -0.10085E-01 751 0.279E-01 0.152E-01 
RMM: 3 -0.390241059805E+03 -0.73044E-04 -0.90892E-03 811 0.726E-02 0.953E-02 
RMM: 4 -0.390240906267E+03 0.15354E-03 -0.10114E-03 675 0.269E-02 0.464E-02 
RMM: 5 -0.390240893714E+03 0.12552E-04 -0.28526E-04 438 0.162E-02 
2 T= 596. E= -.38199224E+03 F= -.39024089E+03 E0= -.39017463E+03 EK= 0.82487E+01 SP= 0.00E+00 SK= 0.00E+00 
Fatal error in PMPI_Allgatherv: Internal MPI error!, error stack: 
PMPI_Allgatherv(1430).....: MPI_Allgatherv(sbuf=0x8051d70, scount=12110, MPI_DOUBLE_PRECISION, rbuf=0x8051d70, rcounts=0x75fc7d0, displs=0x77e78f0, MPI_DOUBLE_PRECISION, comm=0xc4010000) failed 
MPIR_Allgatherv_impl(1002): 
MPIR_Allgatherv(958)......: 
MPIR_Allgatherv_intra(708): 
MPIR_Localcopy(381).......: memcpy arguments alias each other, dst=0x8051d70 src=0x8051d70 len=96880 
Fatal error in PMPI_Allgatherv: Internal MPI error!, error stack: 
PMPI_Allgatherv(1430).....: MPI_Allgatherv(sbuf=0x80507c0, scount=12110, MPI_DOUBLE_PRECISION, rbuf=0x8038d50, rcounts=0x763d1c0, displs=0x763d210, MPI_DOUBLE_PRECISION, comm=0x84000006) failed 
MPIR_Allgatherv_impl(1002): 
MPIR_Allgatherv(958)......: 
MPIR_Allgatherv_intra(708): 
MPIR_Localcopy(381).......: memcpy arguments alias each other, dst=0x80507c0 src=0x80507c0 len=96880 
Fatal error in PMPI_Allgatherv: Internal MPI error!, error stack: 
PMPI_Allgatherv(1430).....: MPI_Allgatherv(sbuf=0x80875a0, scount=12110, MPI_DOUBLE_PRECISION, rbuf=0x8040650, rcounts=0x75ea8f0, displs=0x77a1770, MPI_DOUBLE_PRECISION, comm=0x84000006) failed 
MPIR_Allgatherv_impl(1002): 
MPIR_Allgatherv(958)......: 
MPIR_Allgatherv_intra(708): 
MPIR_Localcopy(381).......: memcpy arguments alias each other, dst=0x80875a0 src=0x80875a0 len=96880 
Fatal error in PMPI_Allgatherv: Internal MPI error!, error stack: 
PMPI_Allgatherv(1430).....: MPI_Allgatherv(sbuf=0x80b7fa0, scount=11764, MPI_DOUBLE_PRECISION, rbuf=0x8013160, rcounts=0x72b1060, displs=0x72b10b0, MPI_DOUBLE_PRECISION, comm=0x84000006) failed
MPIR_Allgatherv_impl(1002): 
MPIR_Allgatherv(958)......: 
MPIR_Allgatherv_intra(708): 
MPIR_Localcopy(381).......: memcpy arguments alias each other, dst=0x80b7fa0 src=0x80b7fa0 len=94112 
Fatal error in PMPI_Allgatherv: Internal MPI error!, error stack: 
PMPI_Allgatherv(1430).....: MPI_Allgatherv(sbuf=0x806b190, scount=12110, MPI_DOUBLE_PRECISION, rbuf=0x803bcb0, rcounts=0x72b0c00, displs=0x72b0c50, MPI_DOUBLE_PRECISION, comm=0x84000006) failed 
MPIR_Allgatherv_impl(1002): 
MPIR_Allgatherv(958)......: 
MPIR_Allgatherv_intra(708): 
MPIR_Localcopy(381).......: memcpy arguments alias each other, dst=0x806b190 src=0x806b190 len=96880 
Fatal error in PMPI_Allgatherv: Internal MPI error!, error stack: 
PMPI_Allgatherv(1430).....: MPI_Allgatherv(sbuf=0x8084d60, scount=12110, MPI_DOUBLE_PRECISION, rbuf=0x800e930, rcounts=0x72b1060, displs=0x72b10b0, MPI_DOUBLE_PRECISION, comm=0x84000006) failed
MPIR_Allgatherv_impl(1002): 
MPIR_Allgatherv(958)......: 
MPIR_Allgatherv_intra(708): 
MPIR_Localcopy(381).......: memcpy arguments alias each other, dst=0x8084d60 src=0x8084d60 len=96880 
Fatal error in PMPI_Allgatherv: Internal MPI error!, error stack: 
PMPI_Allgatherv(1430).....: MPI_Allgatherv(sbuf=0x809e0f0, scount=11764, MPI_DOUBLE_PRECISION, rbuf=0x8010250, rcounts=0x72b10b0, displs=0x72b0b70, MPI_DOUBLE_PRECISION, comm=0x84000006) failed
MPIR_Allgatherv_impl(1002): 
MPIR_Allgatherv(958)......: 
MPIR_Allgatherv_intra(708): 
MPIR_Localcopy(381).......: memcpy arguments alias each other, dst=0x809e0f0 src=0x809e0f0 len=94112 
Fatal error in PMPI_Allgatherv: Internal MPI error!, error stack: 
PMPI_Allgatherv(1430).....: MPI_Allgatherv(sbuf=0x80703f0, scount=12110, MPI_DOUBLE_PRECISION, rbuf=0x8011a30, rcounts=0x72b10b0, displs=0x779fc20, MPI_DOUBLE_PRECISION, comm=0x84000006) failed 
MPIR_Allgatherv_impl(1002): 
MPIR_Allgatherv(958)......: 
MPIR_Allgatherv_intra(708): 
MPIR_Localcopy(381).......: memcpy arguments alias each other, dst=0x80703f0 src=0x80703f0 len=96880 
Fatal error in PMPI_Allgatherv: Internal MPI error!, error stack: 
PMPI_Allgatherv(1430).....: MPI_Allgatherv(sbuf=0x80ca510, scount=11764, MPI_DOUBLE_PRECISION, rbuf=0x800e730, rcounts=0x72b1100, displs=0x72b1150, MPI_DOUBLE_PRECISION, comm=0xc4010000) failed 
MPIR_Allgatherv_impl(1002): 
MPIR_Allgatherv(958)......: 
MPIR_Allgatherv_intra(708): 
MPIR_Localcopy(381).......: memcpy arguments alias each other, dst=0x80ca510 src=0x80ca510 len=94112 
Fatal error in PMPI_Allgatherv: Internal MPI error!, error stack: 
PMPI_Allgatherv(1430).....: MPI_Allgatherv(sbuf=0x80b6ac0, scount=11764, MPI_DOUBLE_PRECISION, rbuf=0x7fe3d40, rcounts=0x72b0c40, displs=0x72b0c90, MPI_DOUBLE_PRECISION, comm=0x84000006) failed 
MPIR_Allgatherv_impl(1002): 
MPIR_Allgatherv(958)......: 
MPIR_Allgatherv_intra(708): 
MPIR_Localcopy(381).......: memcpy arguments alias each other, dst=0x80b6ac0 src=0x80b6ac0 len=94112 
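
For reference, in every one of these stacks the MPIR_Localcopy line reports dst == src, i.e. MPI_Allgatherv is being called with a send buffer that aliases the receive buffer; the MPI standard only allows gathering into the same buffer when MPI_IN_PLACE is passed as the send buffer. A minimal Fortran sketch of the two patterns (illustrative only, assuming the mpi module; this is not VASP source) is:

program allgatherv_inplace
  use mpi
  implicit none
  integer, parameter :: n = 4                  ! elements contributed per rank
  integer :: ierr, rank, nprocs, i
  integer, allocatable :: rcounts(:), displs(:)
  double precision, allocatable :: buf(:)

  call MPI_Init(ierr)
  call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)
  call MPI_Comm_size(MPI_COMM_WORLD, nprocs, ierr)

  allocate(rcounts(nprocs), displs(nprocs), buf(n*nprocs))
  rcounts = n
  displs  = [ (n*(i-1), i = 1, nprocs) ]
  buf(rank*n+1:rank*n+n) = dble(rank)          ! each rank fills its own segment

  ! Aliased pattern: the send buffer is a slice of the receive buffer, so the
  ! library's local copy sees dst == src, as reported in the error stack above:
  !   call MPI_Allgatherv(buf(rank*n+1), n, MPI_DOUBLE_PRECISION, &
  !                       buf, rcounts, displs, MPI_DOUBLE_PRECISION, &
  !                       MPI_COMM_WORLD, ierr)

  ! Standard-conforming in-place form: sendcount/sendtype are ignored.
  call MPI_Allgatherv(MPI_IN_PLACE, 0, MPI_DATATYPE_NULL, &
                      buf, rcounts, displs, MPI_DOUBLE_PRECISION, &
                      MPI_COMM_WORLD, ierr)

  if (rank == 0) print *, 'gathered:', buf
  call MPI_Finalize(ierr)
end program allgatherv_inplace

Built with mpiifort and launched with mpirun, the in-place form should complete without tripping the aliasing check.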



The Makefile is below: 

.SUFFIXES: .inc .f .f90 .F 
#----------------------------------------------------------------------- 
# Makefile for Intel Fortran compiler for Pentium/Athlon/Opteron 
# based systems 
# we recommend this makefile for both Intel as well as AMD systems 
# for AMD based systems appropriate BLAS (libgoto) and fftw libraries are 
# however mandatory (whereas they are optional for Intel platforms) 
# For Athlon we recommend 
# ) to link against libgoto (and mkl as a backup for missing routines) 
# ) odd enough link in libfftw3xf_intel.a (fftw interface for mkl) 
# feedback is greatly appreciated 

# The makefile was tested only under Linux on Intel and AMD platforms 
# the following compiler versions have been tested: 
# - ifc.7.1 works stable somewhat slow but reliably 
# - ifc.8.1 fails to compile the code properly 
# - ifc.9.1 recommended (both for 32 and 64 bit) 
# - ifc.10.1 partially recommended (both for 32 and 64 bit) 
# tested build 20080312 Package ID: l_fc_p_10.1.015 
# the gamma only mpi version cannot be compiled 
# using ifc.10.1 
# - ifc.11.1 partially recommended (some problems with Gamma only and intel fftw) 
# Build 20090630 Package ID: l_cprof_p_11.1.046 
# - ifort.12.1 strongly recommended (we use this to compile vasp) 
# Version 12.1.5.339 Build 20120612 

# it might be required to change some of library path ways, since 
# LINUX installations vary a lot 

# Hence check ***ALL*** options in this makefile very carefully 
#----------------------------------------------------------------------- 

# BLAS must be installed on the machine 
# there are several options: 
# 1) very slow but works: 
# retrieve the lapackage from ftp.netlib.org 
# and compile the blas routines (BLAS/SRC directory) 
# please use g77 or f77 for the compilation. When I tried to 
# use pgf77 or pgf90 for BLAS, VASP hang up when calling 
# ZHEEV (however this was with lapack 1.1 now I use lapack 2.0) 
# 2) more desirable: get an optimized BLAS 

# the two most reliable packages around are presently: 
# 2a) Intels own optimised BLAS (PIII, P4, PD, PC2, Itanium) 
# http://developer.intel.com/software/products/mkl/ 
# this is really excellent, if you use Intel CPU's 

# 2b) probably fastest SSE2 (4 GFlops on P4, 2.53 GHz, 16 GFlops PD, 
# around 30 GFlops on Quad core) 
# Kazushige Goto's BLAS 
# http://www.cs.utexas.edu/users/kgoto/signup_first.html 
# http://www.tacc.utexas.edu/resources/software/ 

#----------------------------------------------------------------------- 

# all CPP processed fortran files have the extension .f90 
SUFFIX=.f90 

#----------------------------------------------------------------------- 
# fortran compiler and linker 
#----------------------------------------------------------------------- 
FC=ifort -I$(MKL_ROOT)/include/fftw -I$(MKLROOT)/include/mic/lp64 -I$(MKLROOT)/include -mmic 
# fortran linker 
FCL=$(FC) -static 


#----------------------------------------------------------------------- 
# whereis CPP ?? (I need CPP, can't use gcc with proper options) 
# that's the location of gcc for SUSE 5.3 

# CPP_ = /usr/lib/gcc-lib/i486-linux/2.7.2/cpp -P -C 

# that's probably the right line for some Red Hat distribution: 

# CPP_ = /usr/lib/gcc-lib/i386-redhat-linux/2.7.2.3/cpp -P -C 

# SUSE X.X, maybe some Red Hat distributions: 

CPP_ = ./preprocess <$*.F | /usr/bin/cpp -P -C -traditional >$*$(SUFFIX) 

# this release should be fpp clean 
# we now recommend fpp as preprocessor 
# if this fails go back to cpp 
#CPP_=fpp -f_com=no -free -w0 $*.F $*$(SUFFIX) 

#----------------------------------------------------------------------- 
# possible options for CPP: 
# NGXhalf charge density reduced in X direction 
# wNGXhalf gamma point only reduced in X direction 
# avoidalloc avoid ALLOCATE if possible 
# PGF90 work around some PGF90 / IFC bugs 
# CACHE_SIZE 1000 for PII,PIII, 5000 for Athlon, 8000-12000 P4, PD 
# RPROMU_DGEMV use DGEMV instead of DGEMM in RPRO (depends on used BLAS) 
# RACCMU_DGEMV use DGEMV instead of DGEMM in RACC (depends on used BLAS) 
# tbdyn MD package of Tomas Bucko 
#----------------------------------------------------------------------- 

CPP = $(CPP_) -DHOST=\"LinuxIFC\" \ 
-DCACHE_SIZE=12000 -DPGF90 -Davoidalloc -DNGZhalf \ 
# -DRPROMU_DGEMV -DRACCMU_DGEMV 

#----------------------------------------------------------------------- 
# general fortran flags (there must be a trailing blank on this line) 
# byterecl is strictly required for ifc, since otherwise 
# the WAVECAR file becomes huge 
#----------------------------------------------------------------------- 

FFLAGS = -FR -names lowercase -assume byterecl -I$(MKLROOT)/include 

#----------------------------------------------------------------------- 
# optimization 
# we have tested whether higher optimisation improves performance 
# -axK SSE1 optimization, but also generate code executable on all mach. 
# xK improves performance somewhat on XP, and a is required in order 
# to run the code on older Athlons as well 
# -xW SSE2 optimization 
# -axW SSE2 optimization, but also generate code executable on all mach. 
# -tpp6 P3 optimization 
# -tpp7 P4 optimization 
#----------------------------------------------------------------------- 

# ifc.9.1, ifc.10.1 recommended 
#OFLAG=-O2 -ip 
OFLAG= -xHOST -O3 -ip -static 
OFLAG_HIGH = $(OFLAG) 
OBJ_HIGH = 
OBJ_NOOPT = 
DEBUG = -FR -O0 
INLINE = $(OFLAG) 

#----------------------------------------------------------------------- 
# the following lines specify the position of BLAS and LAPACK 
# we recommend to use mkl, that is simple and most likely 
# fastest in Intel based machines 
#----------------------------------------------------------------------- 

# mkl path for ifc 11 compiler 
#MKL_PATH=$(MKLROOT)/lib/em64t 

# mkl path for ifc 12 compiler 
MKL_PATH=$(MKLROOT)/lib/intel64 

MKL_FFTW_PATH=$(MKLROOT)/interfaces/fftw3xf/ 

# BLAS 
# setting -DRPROMU_DGEMV -DRACCMU_DGEMV in the CPP lines usually speeds up program execution 
# BLAS= -Wl,--start-group $(MKL_PATH)/libmkl_intel_lp64.a $(MKL_PATH)/libmkl_intel_thread.a $(MKL_PATH)/libmkl_core.a -Wl,--end-group -lguide 
# faster linking and available from at least version 11 
#BLAS= -lguide -mkl 
#BLAS = /home/paulfons/VASP/src/GotoBlas2/libgoto2_nehalemp-r1.13.a 
BLAS = $(MKLROOT)/lib/intel64/libmkl_blas95_lp64.a $(MKLROOT)/lib/intel64/libmkl_lapack95_lp64.a $(MKLROOT)/lib/intel64/libmkl_scalapack_lp64.a -Wl,--start-group $(MKLROOT)/lib/intel64/libmkl_cdft_core.a $(MKLROOT)/lib/intel64/libmkl_intel_lp64.a $(MKLROOT)/lib/intel64/libmkl_sequential.a $(MKLROOT)/lib/intel64/libmkl_core.a $(MKLROOT)/lib/intel64/libmkl_blacs_intelmpi_lp64.a -Wl,--end-group -lpthread -lm 
# LAPACK, use vasp.5.lib/lapack_double 

#LAPACK= ../vasp.5.lib/lapack_double.o 

# LAPACK from mkl, usually faster and contains scaLAPACK as well 

#LAPACK= $(MKL_PATH)/libmkl_intel_lp64.a 

# here a tricky version, link in libgoto and use mkl as a backup 
# also needs a special line for LAPACK 
# this is the best thing you can do on AMD based systems !!!!!! 

#BLAS = -Wl,--start-group /opt/libs/libgoto/libgoto.so $(MKL_PATH)/libmkl_intel_thread.a $(MKL_PATH)/libmkl_core.a -Wl,--end-group -liomp5 
#LAPACK= /opt/libs/libgoto/libgoto.so $(MKL_PATH)/libmkl_intel_lp64.a 

#----------------------------------------------------------------------- 

LIB = -L../vasp.5.lib -ldmy \ 
../vasp.5.lib/linpack_double.o $(LAPACK) \ 
$(BLAS) 

# options for linking, nothing is required (usually) 
#LINK = -parallel 
LINK = 

#----------------------------------------------------------------------- 
# fft libraries: 
# VASP.5.2 can use fftw.3.1.X (http://www.fftw.org) 
# since this version is faster on P4 machines, we recommend to use it 
#----------------------------------------------------------------------- 

FFT3D = fft3dfurth.o fft3dlib.o 

# alternatively: fftw.3.1.X is slightly faster and should be used if available 
#FFT3D = fftw3d.o fft3dlib.o /opt/libs/fftw-3.1.2/lib/libfftw3.a 

# you may also try to use the fftw wrapper to mkl (but the path might vary a lot) 
# it seems this is best for AMD based systems 
#FFT3D = fftw3d.o fft3dlib.o $(MKL_FFTW_PATH)/libfftw3xf_intel.a 
#INCS = -I$(MKLROOT)/include/fftw 

#=======================================================================
# MPI section, uncomment the following lines until 
# general rules and compile lines 
# presently we recommend OPENMPI, since it seems to offer better 
# performance than lam or mpich 

# !!! Please do not send me any queries on how to install MPI, I will 
# certainly not answer them !!!! 
#=======================================================================
#----------------------------------------------------------------------- 
# fortran linker for mpi 
#----------------------------------------------------------------------- 

#FC=mpif90 
FC=mpiifort 
FCL=$(FC) 

#----------------------------------------------------------------------- 
# additional options for CPP in parallel version (see also above): 
# NGZhalf charge density reduced in Z direction 
# wNGZhalf gamma point only reduced in Z direction 
# scaLAPACK use scaLAPACK (recommended if mkl is available) 
# avoidalloc avoid ALLOCATE if possible 
# PGF90 work around some PGF90 / IFC bugs 
# CACHE_SIZE 1000 for PII,PIII, 5000 for Athlon, 8000-12000 P4, PD 
# RPROMU_DGEMV use DGEMV instead of DGEMM in RPRO (depends on used BLAS) 
# RACCMU_DGEMV use DGEMV instead of DGEMM in RACC (depends on used BLAS) 
# tbdyn MD package of Tomas Bucko 
#----------------------------------------------------------------------- 

#----------------------------------------------------------------------- 

#CPP = $(CPP_) -DMPI -DHOST=\"LinuxIFC\" -DIFC \ 
# -DCACHE_SIZE=4000 -DPGF90 -Davoidalloc -DNGZhalf \ 
# -DMPI_BLOCK=262144 -Duse_collective -DscaLAPACK \ 
# -DRPROMU_DGEMV -DRACCMU_DGEMV 
#CPP = $(CPP_) -DMPI -DHOST=\"LinuxIFC\" -DIFC \ 
# -DCACHE_SIZE=4000 -DPGF90 -Davoidalloc \ 
# -DMPI_BLOCK=262144 -Duse_collective -DscaLAPACK \ 
# -DRPROMU_DGEMV -DRACCMU_DGEMV 
CPP = $(CPP_) -DMPI -DHOST=\"SiriusMKL_ifort13\" -DIFC \ 
-DCACHE_SIZE=4000 -DPGF90 -Davoidalloc -DwNGZhalf -DNGZhalf \ 
-DMPI_BLOCK=8000 -Duse_collective -DscaLAPACK -Dtbdyn \ 
-DRPROMU_DGEMV -DRACCMU_DGEMV 


# -DMPI_BLOCK=8000 -Duse_collective -DscaLAPACK 
#----------------------------------------------------------------------- 
# location of SCALAPACK 
# if you do not use SCALAPACK simply leave this section commented out 
#----------------------------------------------------------------------- 

# usually simplest link in mkl scaLAPACK 
#BLACS= -lmkl_blacs_openmpi_lp64 
#SCA= $(MKL_PATH)/libmkl_scalapack_lp64.a $(BLACS) 
#SCA= -lmkl_scalapack_lp64 -lmkl_core0 
#----------------------------------------------------------------------- 
# libraries for mpi? 
#----------------------------------------------------------------------- 

LIB = -L../vasp.5.lib -ldmy \ 
../vasp.5.lib/linpack_double.o \ 
$(SCA) $(LAPACK) $(BLAS) -L/opt/intel/composer_xe_2013/mkl/lib/intel64/ 

#----------------------------------------------------------------------- 
# parallel FFT 
#----------------------------------------------------------------------- 

# FFT: fftmpi.o with fft3dlib of Juergen Furthmueller 
#FFT3D = fftmpi.o fftmpi_map.o fft3dfurth.o fft3dlib.o 

# alternatively: fftw.3.1.X is slightly faster and should be used if available 
#FFT3D = fftmpiw.o fftmpi_map.o fftw3d.o fft3dlib.o /opt/local/fftw3/lib/libfftw3.a 

# you may also try to use the fftw wrapper to mkl (but the path might vary a lot) 
# it seems this is best for AMD based systems 
FFT3D = fftmpiw.o fftmpi_map.o fftw3d.o fft3dlib.o /opt/intel/composer_xe_2013/mkl/interfaces/fftw3xf/libfftw3xf_intel.a 
#INCS = -I$(MKLROOT)/include/fftw 

#----------------------------------------------------------------------- 
# general rules and compile lines 
#----------------------------------------------------------------------- 
BASIC= symmetry.o symlib.o lattlib.o random.o 


SOURCE= base.o mpi.o smart_allocate.o xml.o \ 
constant.o jacobi.o main_mpi.o scala.o \ 
asa.o lattice.o poscar.o ini.o mgrid.o xclib.o vdw_nl.o xclib_grad.o \ 
radial.o pseudo.o gridq.o ebs.o \ 
mkpoints.o wave.o wave_mpi.o wave_high.o spinsym.o \ 
$(BASIC) nonl.o nonlr.o nonl_high.o dfast.o choleski2.o \ 
mix.o hamil.o xcgrad.o xcspin.o potex1.o potex2.o \ 
constrmag.o cl_shift.o relativistic.o LDApU.o \ 
paw_base.o metagga.o egrad.o pawsym.o pawfock.o pawlhf.o rhfatm.o hyperfine.o paw.o \ 
mkpoints_full.o charge.o Lebedev-Laikov.o stockholder.o dipol.o pot.o \ 
dos.o elf.o tet.o tetweight.o hamil_rot.o \ 
chain.o dyna.o k-proj.o sphpro.o us.o core_rel.o \ 
aedens.o wavpre.o wavpre_noio.o broyden.o \ 
dynbr.o hamil_high.o rmm-diis.o reader.o writer.o tutor.o xml_writer.o \ 
brent.o stufak.o fileio.o opergrid.o stepver.o \ 
chgloc.o fast_aug.o fock_multipole.o fock.o mkpoints_change.o sym_grad.o \ 
mymath.o internals.o npt_dynamics.o dynconstr.o dimer_heyden.o dvvtrajectory.o vdwforcefield.o \ 
nmr.o pead.o subrot.o subrot_scf.o \ 
force.o pwlhf.o gw_model.o optreal.o steep.o davidson.o david_inner.o \ 
electron.o rot.o electron_all.o shm.o pardens.o paircorrection.o \ 
optics.o constr_cell_relax.o stm.o finite_diff.o elpol.o \ 
hamil_lr.o rmm-diis_lr.o subrot_cluster.o subrot_lr.o \ 
lr_helper.o hamil_lrf.o elinear_response.o ilinear_response.o \ 
linear_optics.o \ 
setlocalpp.o wannier.o electron_OEP.o electron_lhf.o twoelectron4o.o \ 
mlwf.o ratpol.o screened_2e.o wave_cacher.o chi_base.o wpot.o \ 
local_field.o ump2.o ump2kpar.o fcidump.o ump2no.o \ 
bse_te.o bse.o acfdt.o chi.o sydmat.o dmft.o \ 
rmm-diis_mlr.o linear_response_NMR.o wannier_interpol.o linear_response.o 

vasp: $(SOURCE) $(FFT3D) $(INC) main.o 
rm -f vasp 
$(FCL) -o vasp main.o $(SOURCE) $(FFT3D) $(LIB) $(LINK) 
makeparam: $(SOURCE) $(FFT3D) makeparam.o main.F $(INC) 
$(FCL) -o makeparam $(LINK) makeparam.o $(SOURCE) $(FFT3D) $(LIB) 
zgemmtest: zgemmtest.o base.o random.o $(INC) 
$(FCL) -o zgemmtest $(LINK) zgemmtest.o random.o base.o $(LIB) 
dgemmtest: dgemmtest.o base.o random.o $(INC) 
$(FCL) -o dgemmtest $(LINK) dgemmtest.o random.o base.o $(LIB) 
ffttest: base.o smart_allocate.o mpi.o mgrid.o random.o ffttest.o $(FFT3D) $(INC) 
$(FCL) -o ffttest $(LINK) ffttest.o mpi.o mgrid.o random.o smart_allocate.o base.o $(FFT3D) $(LIB) 
kpoints: $(SOURCE) $(FFT3D) makekpoints.o main.F $(INC) 
$(FCL) -o kpoints $(LINK) makekpoints.o $(SOURCE) $(FFT3D) $(LIB) 

clean:
-rm -f *.g *.f *.o *.L *.mod ; touch *.F 

main.o: main$(SUFFIX) 
$(FC) $(FFLAGS)$(DEBUG) $(INCS) -c main$(SUFFIX) 
xcgrad.o: xcgrad$(SUFFIX) 
$(FC) $(FFLAGS) $(INLINE) $(INCS) -c xcgrad$(SUFFIX) 
xcspin.o: xcspin$(SUFFIX) 
$(FC) $(FFLAGS) $(INLINE) $(INCS) -c xcspin$(SUFFIX) 

makeparam.o: makeparam$(SUFFIX) 
$(FC) $(FFLAGS)$(DEBUG) $(INCS) -c makeparam$(SUFFIX) 

makeparam$(SUFFIX): makeparam.F main.F 

# MIND: I do not have a full dependency list for the include 
# and MODULES: here are only the minimal basic dependencies 
# if one structure is changed then touch_dep must be called 
# with the corresponding name of the structure 

base.o: base.inc base.F 
mgrid.o: mgrid.inc mgrid.F 
constant.o: constant.inc constant.F 
lattice.o: lattice.inc lattice.F 
setex.o: setexm.inc setex.F 
pseudo.o: pseudo.inc pseudo.F 
mkpoints.o: mkpoints.inc mkpoints.F 
wave.o: wave.F 
nonl.o: nonl.inc nonl.F 
nonlr.o: nonlr.inc nonlr.F 

$(OBJ_HIGH): 
$(CPP) 
$(FC) $(FFLAGS) $(OFLAG_HIGH) $(INCS) -c $*$(SUFFIX) 
$(OBJ_NOOPT): 
$(CPP) 
$(FC) $(FFLAGS) $(INCS) -c $*$(SUFFIX) 

fft3dlib_f77.o: fft3dlib_f77.F 
$(CPP) 
$(F77) $(FFLAGS_F77) -c $*$(SUFFIX) 

.F.o: 
$(CPP) 
$(FC) $(FFLAGS) $(OFLAG) $(INCS) -c $*$(SUFFIX) 
.F$(SUFFIX): 
$(CPP) 
$(SUFFIX).o: 
$(FC) $(FFLAGS) $(OFLAG) $(INCS) -c $*$(SUFFIX) 

# special rules 
#----------------------------------------------------------------------- 
# these special rules have been tested for ifc.11 and ifc.12 only 

fft3dlib.o : fft3dlib.F 
$(CPP) 
$(FC) -FR -lowercase -O2 -c $*$(SUFFIX) 
fft3dfurth.o : fft3dfurth.F 
$(CPP) 
$(FC) -FR -lowercase -O1 -c $*$(SUFFIX) 
fftw3d.o : fftw3d.F 
$(CPP) 
$(FC) -FR -lowercase -O1 $(INCS) -c $*$(SUFFIX) 
fftmpi.o : fftmpi.F 
$(CPP) 
$(FC) -FR -lowercase -O1 -c $*$(SUFFIX) 
fftmpiw.o : fftmpiw.F 
$(CPP) 
$(FC) -FR -lowercase -O1 $(INCS) -c $*$(SUFFIX) 
wave_high.o : wave_high.F 
$(CPP) 
$(FC) -FR -lowercase -O1 -c $*$(SUFFIX) 
# the following rules are probably no longer required (-O3 seems to work) 
wave.o : wave.F 
$(CPP) 
$(FC) -FR -lowercase -O2 -c $*$(SUFFIX) 
paw.o : paw.F 
$(CPP) 
$(FC) -FR -lowercase -O2 -c $*$(SUFFIX) 
cl_shift.o : cl_shift.F 
$(CPP) 
$(FC) -FR -lowercase -O2 -c $*$(SUFFIX) 
us.o : us.F 
$(CPP) 
$(FC) -FR -lowercase -O2 -c $*$(SUFFIX) 
LDApU.o : LDApU.F 
$(CPP) 
$(FC) -FR -lowercase -O2 -c $*$(SUFFIX) 

5 Replies
James_T_Intel
Moderator

Hi Paul,

What version of the Intel® MPI Library are you using?  If you are not already on 4.1.0.030, please try that version.
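
If you are not sure which version is installed, running mpirun -V in the same environment used to launch vasp_gamma should print the library version string.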

Also, for future reference, please attach large files rather than pasting the text directly.

Sincerely,
James Tullos
Technical Consulting Engineer
Intel® Cluster Tools

Ariel_B_1
Beginner

Any news on this?

James_T_Intel
Moderator

I would make a small update to my recommendation, since we have released three updates since then.  Are you encountering the same problem?

pigguo
Beginner

I got the same problem with VASP compiled with Intel MPI 4.1.3.045.

Vijay_Amirtharaj_A

Hi,

I also faced the same problem.

You have to export this value before submitting the job. Not only VASP; Siesta will also give the same issue.

export I_MPI_COMPATIBILITY=4

For me, this resolved the error.
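
As far as I understand, this variable selects backward compatibility with the Intel MPI 4.0.x runtime, which does not enforce the buffer-aliasing check behind the Allgatherv error above. For example, in the job script, place the export immediately before the launch line from the original post:

export I_MPI_COMPATIBILITY=4
mpirun -np 16 vasp_gamma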

Thanks,

Vijay Amirtharaj A
