Jo_H_
Beginner
198 Views

MPI problems with parallel SIESTA

Hello,

I need to use the scientific software package SIESTA 3.2 (TranSIESTA, actually), but I'm having a hard time getting the code to run on our cluster. My arch.make below should give a good overview of the settings I used (I used the Math Kernel Library Link Line Advisor). The Intel compiler/MPI/MKL versions are the most recent available on this cluster.

SIESTA_ARCH=intel-mpi
#
.SUFFIXES: .f .F .o .a .f90 .F90
#
FC=mpiifort
  #Path is: /Applic.PALMA/software/impi/5.0.2.044-iccifort-2015.1.133-GCC-4.9.2/bin64
#
FC_ASIS=$(FC)
#
RANLIB=ranlib
#
SYS=nag
#
MKL_ROOT=/Applic.PALMA/software/imkl/11.2.1.133-iimpi-7.2.3-GCC-4.9.2/composerxe/mkl
#
FFLAGS=-g -check all -traceback -I${MKL_ROOT}/include/intel64/lp64 -I${MKL_ROOT}/include
FPPFLAGS_MPI=-DMPI -DFC_HAVE_FLUSH -DFC_HAVE_ABORT
FPPFLAGS= $(FPPFLAGS_MPI) $(FPPFLAGS_CDF)
#
MPI_INTERFACE=libmpi_f90.a
MPI_INCLUDE=/Applic.PALMA/software/impi/5.0.2.044-iccifort-2015.1.133-GCC-4.9.2/include64
#
COMP_LIBS=dc_lapack.a
#
MKL_LIB=-L${MKL_ROOT}/lib/intel64
#
BLAS_LIBS=-lmkl_blas95_lp64
#
LAPACK_LIBS=-lmkl_lapack95_lp64
#
BLACS_LIBS=-lmkl_blacs_lp64 -lmkl_blacs_intelmpi_lp64     # Using openMPI didn't work (yes I mpi-included it).
#
SCALAPACK_LIBS=-lmkl_scalapack_lp64
#
EXTRA_LIBS= -lmkl_intel_lp64 -lmkl_core -lm -lpthread -lmkl_sequential      # Intel thread compilation doesn't work.
#  
LIBS=$(MKL_LIB) $(SCALAPACK_LIBS) $(BLACS_LIBS) $(LAPACK_LIBS) $(BLAS_LIBS) $(NETCDF_LIBS) $(EXTRA_LIBS)
#
.F.o:
  $(FC) -c $(INCFLAGS) $(FFLAGS)  $(FPPFLAGS) $<
.f.o:
  $(FC) -c $(INCFLAGS) $(FFLAGS)   $<
.F90.o:
  $(FC) -c $(INCFLAGS) $(FFLAGS)  $(FPPFLAGS) $<
.f90.o:
  $(FC) -c $(INCFLAGS) $(FFLAGS)   $<

With these settings, the compilation works. At execution, the environment is set consistently with the locations in the arch.make (at least I think it is).

Now there are some warnings at execution, for example:

forrtl: warning (406): fort: (1): In call to ATOM_MAIN, an array temporary was created for argument #3

Image              PC                Routine            Line        Source
transiesta         0000000001A3B410  Unknown               Unknown  Unknown
transiesta         000000000080263A  initatom_                 105  initatom.f
transiesta         0000000000CBD134  m_siesta_init_mp_         147  siesta_init.F
transiesta         0000000000CF3393  MAIN__                     16  siesta.F
transiesta         000000000040ADFE  Unknown               Unknown  Unknown
libc.so.6          0000003E10A1D9F4  Unknown               Unknown  Unknown
transiesta         000000000040AC89  Unknown               Unknown  Unknown
forrtl: warning (406): fort: (1): In call to ATOM_MAIN, an array temporary was created for argument #4

Image              PC                Routine            Line        Source
transiesta         0000000001A3B410  Unknown               Unknown  Unknown
transiesta         000000000080278E  initatom_                 105  initatom.f
transiesta         0000000000CBD134  m_siesta_init_mp_         147  siesta_init.F
transiesta         0000000000CF3393  MAIN__                     16  siesta.F
transiesta         000000000040ADFE  Unknown               Unknown  Unknown
libc.so.6          0000003E10A1D9F4  Unknown               Unknown  Unknown
transiesta         000000000040AC89  Unknown               Unknown  Unknown


... and so on. I don't know whether these warnings cause further problems, but the code runs until the following errors occur:

Fatal error in PMPI_Comm_size: Invalid communicator, error stack:
PMPI_Comm_size(124): MPI_Comm_size(comm=0x5b, size=0x2364a2c) failed
PMPI_Comm_size(78).: Invalid communicator
Fatal error in PMPI_Comm_size: Invalid communicator, error stack:
PMPI_Comm_size(124): MPI_Comm_size(comm=0x5b, size=0x2364a2c) failed
PMPI_Comm_size(78).: Invalid communicator
Fatal error in PMPI_Comm_size: Invalid communicator, error stack:
PMPI_Comm_size(124): MPI_Comm_size(comm=0x5b, size=0x2364a2c) failed
PMPI_Comm_size(78).: Invalid communicator
Fatal error in PMPI_Comm_size: Invalid communicator, error stack:
PMPI_Comm_size(124): MPI_Comm_size(comm=0x5b, size=0x2364a2c) failed
PMPI_Comm_size(78).: Invalid communicator


(The program was run on 4 CPUs.) I tried to get it running with Intel MPI, SIESTA's own MPI implementation, and MPICH2, but I always get the same error. Since I have little to no experience with this, I was hoping I could get some advice here.

Thanks and regards!

2 Replies
TimP
Black Belt

The following forum is better suited to questions related to Intel MPI:

https://software.intel.com/en-us/forums/intel-clusters-and-hpc-technology

Array temporaries aren't necessarily a problem, although they may consume more stack and thus require adjustments in shell stack limits and OMP_STACKSIZE.  Those would have to be set up to occur with each new shell opened under MPI.  With such an old compiler, it's hard to say whether to recommend -heap-arrays.
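
For the arch.make in the question, the (406) warnings are printed because -check all also enables the arg_temp_created run-time check. A minimal sketch of the FFLAGS line with only that check switched off, keeping the other flags from the original post; the commented-out -heap-arrays line is an option to experiment with, not a recommendation:

```makefile
# Sketch only: same flags as in the original post, but with the
# array-temporary run-time check disabled so warning (406) is not printed.
FFLAGS=-g -check all,noarg_temp_created -traceback -I${MKL_ROOT}/include/intel64/lp64 -I${MKL_ROOT}/include
# Optional and untested here: allocate temporaries on the heap instead of the stack.
# FFLAGS+= -heap-arrays
```

This only silences the diagnostic; the temporaries are still created, so the stack-limit remarks above still apply.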

Intel MPI and OpenMPI have conflicting codings of MPI data types, so you can't mix the MKL library intended for one with the MPI environment of the other.
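
Applied to the arch.make in the question, this suggests linking exactly one BLACS library, namely the one built for the MPI in use. The BLACS_LIBS line there lists two, and the plain -lmkl_blacs_lp64 is, as far as I know, the MPICH variant (an assumption about this MKL version that is worth checking against the Link Line Advisor output). A sketch for an Intel MPI build:

```makefile
# Sketch, assuming the build uses Intel MPI (FC=mpiifort) with LP64 MKL.
# Link only the BLACS layer that was built for that MPI.
BLACS_LIBS=-lmkl_blacs_intelmpi_lp64
# If building against Open MPI instead, the matching layer would be:
# BLACS_LIBS=-lmkl_blacs_openmpi_lp64
```

Whether this alone resolves the "Invalid communicator" error is not confirmed in this thread.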

Jo_H_
Beginner

Hi Tim,

Thanks for your reply. Good to know that the warnings related to array temporaries aren't severe. I will post this thread in the suggested forum.

Thanks and regards!
