Intel® Moderncode for Parallel Architectures
Support for developing parallel programming applications on Intel® Architecture.

HPL and MKL 11.1 (064) Seg fault

i1ya

Hi everyone!

I've installed Intel C++/Fortran Compiler Professional Edition for Linux with MKL on my cluster, and I have a problem running HPL compiled with MKL.

Cluster node:

CPU: 2 x Intel Xeon CPU E5450

Network: Ethernet/Infiniband

Memory: 16 GB

OS: CentOS 5.4

MPI:

openmpi-1.4 (using Intel compilers).

HPL makefile:

#====

SHELL = /bin/sh
#
CD = cd
CP = cp
LN_S = ln -s
MKDIR = mkdir
RM = /bin/rm -f
TOUCH = touch

ARCH = Linux_PII_CBLAS

TOPdir = $(HOME)/bench/hpl
INCdir = $(TOPdir)/include
BINdir = $(TOPdir)/bin/$(ARCH)
LIBdir = $(TOPdir)/lib/$(ARCH)
#
HPLlib = $(LIBdir)/libhpl.a

MPdir = /opt/openmpi/intel/
MPinc = -I$(MPdir)/include
MPlib = -L $(MPdir)/lib/ -lmpi

LAdir = /opt/intel/Compiler/11.1/064/mkl/lib/em64t/
LAinc = -I /opt/intel/Compiler/11.1/064/mkl/include/
LAlib = $(LAdir)/libmkl_scalapack_ilp64.a $(LAdir)/libmkl_solver_ilp64.a -Wl,--start-group $(LAdir)/libmkl_intel_ilp64.a $(LAdir)/libmkl_intel_thread.a $(LAdir)/libmkl_core.a $(LAdir)/libmkl_blacs_openmpi_ilp64.a -Wl,--end-group -openmp -lpthread

F2CDEFS =

HPL_INCLUDES = -I$(INCdir) -I$(INCdir)/$(ARCH) $(LAinc) $(MPinc)
HPL_LIBS = $(HPLlib) $(LAlib) $(MPlib)

HPL_OPTS = -DHPL_CALL_CBLAS

HPL_DEFS = $(F2CDEFS) $(HPL_OPTS) $(HPL_INCLUDES)

CC = mpicc
CCNOOPT = $(HPL_DEFS)
CCFLAGS = $(HPL_DEFS)

LINKER = mpicc
LINKFLAGS = $(CCFLAGS)

ARCHIVER = ar
ARFLAGS = r
RANLIB = echo
#====

Compilation is OK. But when I run HPL I get:

[umt1:09761] *** Process received signal ***
[umt1:09761] Signal: Segmentation fault (11)
[umt1:09761] Signal code: (-6)
[umt1:09761] Failing at address: 0x2c4200002621
[umt1:09761] [ 0] /lib64/libpthread.so.0 [0x359200e4c0]
[umt1:09761] [ 1] /lib64/libpthread.so.0(raise+0x2d) [0x359200e38d]
[umt1:09761] [ 2] /opt/intel/Compiler/11.1/064/lib/intel64/libiomp5.so [0x2b613a32a4a2]
[umt1:09761] *** End of error message ***
--------------------------------------------------------------------------
mpirun noticed that process rank 0 with PID 9761 on node umt1 exited on signal 11 (Segmentation fault).
--------------------------------------------------------------------------

If I recompile HPL with ATLAS, it compiles and runs without any problems.

TimP

Questions about MKL will get expert attention sooner on the MKL forum.

2 questions:

Are you using 64-bit integer arguments consistently for the ilp64 link option?

Did you check whether you have set a sufficient stack limit?
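A rough way to check the first question is to look at the HPL makefile itself. The ilp64 MKL libraries expect 64-bit integer arguments, and Intel's documentation asks for C code to be compiled with -DMKL_ILP64 when linking them. HPL, as far as I know, passes plain C int to CBLAS, so the lp64 libraries are probably the safer match. The sketch below only scans a makefile for the two patterns; the function name check_ilp64 is made up for illustration, and the match is a plain text search that also hits commented-out lines.

```shell
#!/bin/sh
# Hypothetical helper: report whether a makefile links MKL ILP64
# libraries and whether it also defines MKL_ILP64 for the C compiler.
check_ilp64() {
    makefile="$1"
    if grep -q '_ilp64' "$makefile"; then
        # ILP64 libraries are named on some line of the makefile.
        if grep -q -e '-DMKL_ILP64' "$makefile"; then
            echo "ilp64 libraries with -DMKL_ILP64"
        else
            echo "ilp64 libraries without -DMKL_ILP64"
        fi
    else
        echo "no ilp64 libraries"
    fi
}

# Example call (the path is an assumption based on TOPdir and ARCH above):
# check_ilp64 "$HOME/bench/hpl/Make.Linux_PII_CBLAS"
```

If this reports "ilp64 libraries without -DMKL_ILP64", the integer widths on the two sides of the BLAS calls likely do not match, which is one plausible cause of a crash at run time.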

i1ya

About the first question: where can I see that?

About the second: I think so. Here is my ulimit output:

[u1330@umt100 ~]$ ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 135167
max locked memory (kbytes, -l) unlimited
max memory size (kbytes, -m) unlimited
open files (-n) 10000
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 10240
cpu time (seconds, -t) unlimited
max user processes (-u) 135167
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited

TimP
Quoting i1ya

stack size (kbytes, -s) 10240

This looks quite small.
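To try a larger stack, the soft limit can be raised in the shell that launches the job. This is a minimal sketch, assuming bash and assuming the hard limit is higher than the 10240 kB soft limit shown above; the two function names are made up for illustration.

```shell
#!/bin/bash
# Hypothetical helpers for inspecting and raising the stack limit.

# Print the current soft and hard stack limits (kbytes or "unlimited").
show_stack_limit() {
    echo "soft: $(ulimit -S -s)  hard: $(ulimit -H -s)"
}

# Raise the soft stack limit to the hard limit, then print the new soft limit.
raise_stack_limit() {
    ulimit -S -s "$(ulimit -H -s)" 2>/dev/null \
        || echo "could not raise stack limit" >&2
    ulimit -S -s
}
```

The limit applies only to the current shell and the processes it starts, so on a cluster it may also need to be set on the remote nodes. The stack of threads created by the Intel OpenMP runtime is sized separately, through the KMP_STACKSIZE (or OMP_STACKSIZE) environment variable, which Open MPI can forward with mpirun -x.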

http://software.intel.com/en-us/forums/intel-math-kernel-library/