Chansup_Byun
Beginner
92 Views

HPCC built with FFTW from MKL fails on a 64-bit Linux machine

Hi,

I built HPCC using the Intel MKL FFTW library, following the instructions in this article:
http://software.intel.com/en-us/articles/performance-tools-for-software-developers-use-of-intel-mkl-...
I used the following compiler version:

$ icc -V
Intel C Intel 64 Compiler XE for applications running on Intel 64, Version 12.0.0.084 Build 20101006
$ which mpiicc
/opt/intel/impi/4.0.1/bin64/mpiicc

The binary failed with the following error:

$ mpirun -machinefile ~/machinefile -n 2 hpcc
HPL WARNING from process # 0, on line 313 of function HPL_pdinfo:
>>> cannot open file hpccinf.txt <<<

rank 1 in job 1 compute-0-0-10gb_44496 caused collective abort of all ranks
exit status of rank 1: killed by signal 11

According to the output, it failed right after starting the StarFFT section.

$ tail hpccoutf.txt
Generation time: 0.565
Tuning: 1.438
Computing: 1.009
Inverse FFT: 1.359
max(|x-x0|): 1.715e+00
Gflop/s: 1.536
Current time (1303986275) is Thu Apr 28 06:24:35 2011

End of MPIFFT section.
Begin of StarFFT section.

The system log shows a segfault:

Apr 28 06:24:39 compute-0-0 kernel: hpcc[23811]: segfault at fffffffffffffff9 ip 00000000004c3666 sp 00007fff1b399180 error 4 in hpcc[400000+16f8000]
Apr 28 06:24:39 compute-0-0 kernel: hpcc[23810]: segfault at fffffffffffffff9 ip 00000000004c3666 sp 00007fffd2a14e00 error 4 in hpcc[400000+16f8000]
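
As a reading aid for the log lines above, here is a minimal sketch that decodes the "error 4" field using the standard x86 page-fault error-code bits and, if the binary is available, asks addr2line which function contains the faulting instruction pointer. The helper name decode_pf_error and the ./hpcc path are assumptions for illustration; the addr2line lookup only gives useful output because the build above uses -g.

```shell
# Decode an x86 page-fault error code as printed in kernel segfault lines.
# Bit 1 (value 1): protection violation if set, page not present if clear.
# Bit 2 (value 2): write access if set, read access if clear.
# Bit 3 (value 4): user-mode access if set, kernel-mode access if clear.
decode_pf_error() {
    code="$1"
    if [ $(( code & 4 )) -ne 0 ]; then mode="user-mode"; else mode="kernel-mode"; fi
    if [ $(( code & 2 )) -ne 0 ]; then access="write"; else access="read"; fi
    if [ $(( code & 1 )) -ne 0 ]; then cause="protection violation"; else cause="page not present"; fi
    echo "$mode $access, $cause"
}

# Error code taken from the log lines above.
decode_pf_error 4

# Optional: map the instruction pointer from the log to a function and source line.
# Assumes the hpcc binary is in the current directory (hypothetical path).
if [ -x ./hpcc ] && command -v addr2line >/dev/null 2>&1; then
    addr2line -f -e ./hpcc 0x4c3666
fi
```

For the value in the log, the decoder prints "user-mode read, page not present", i.e. the process tried to read an unmapped address.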

Below is my makefile entry to build hpcc using FFTW from MKL.

LAdir = /opt/intel/mkl/lib/intel64
LAinc =
LAlib = -Wl,--start-group $(LAdir)/libmkl_intel_lp64.a $(LAdir)/libmkl_sequential.a $(LAdir)/libmkl_core.a $(LAdir)/libmkl_blacs_intelmpi_lp64.a $(LAdir)/libfftw2x_cdft_DOUBLE_ilp64.a $(LAdir)/libfftw2xc_intel.a $(LAdir)/libmkl_cdft_core.a -Wl,--end-group /opt/intel/lib/intel64/libiomp5.a -lpthread -lm

F2CDEFS = -DAdd_ -DF77_INTEGER=int -DStringSunStyle
HPL_INCLUDES = -I$(INCdir) -I$(INCdir)/$(ARCH) $(LAinc) $(MPinc) -I/opt/intel/mkl/include/fftw
HPL_LIBS = $(HPLlib) $(LAlib) $(MPlib) -lm
HPL_OPTS = -DUSING_FFTW -DMKL_INT=long -DLONG_IS_64BITS -DRA_SANDIA_OPT2 -DHPCC_FFT_235
HPL_DEFS = -g $(F2CDEFS) $(HPL_OPTS) $(HPL_INCLUDES)
CC = mpiicc
CCNOOPT = $(HPL_DEFS)
CCFLAGS = $(HPL_DEFS) $(MKLINCDIR) -O3 -ansi-alias -ip -fno-alias
LINKER = mpiicc

If you have any suggestions, please let me know.

Thanks,
- CB



Vladimir_Petrov__Int
New Contributor III

Hi,

Judging by the line ">>> cannot open file hpccinf.txt <<<" I guess that you did not provide an input file to the benchmark. Please note that the example file has an underscore in its name - "_hpccinf.txt". Feel free to rename it and modify per your needs.
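
The renaming step can be sketched as a small shell helper. This is only an illustration: the function name prepare_hpcc_input is made up here, and it assumes the shipped _hpccinf.txt template sits in the directory passed as the first argument (default: the current directory, from which hpcc is launched).

```shell
# Copy the shipped template to the file name the benchmark tries to open,
# without overwriting an input file that already exists.
prepare_hpcc_input() {
    dir="${1:-.}"
    if [ -f "$dir/hpccinf.txt" ]; then
        echo "hpccinf.txt already present"
    elif [ -f "$dir/_hpccinf.txt" ]; then
        cp "$dir/_hpccinf.txt" "$dir/hpccinf.txt"
        echo "created hpccinf.txt from _hpccinf.txt"
    else
        echo "no _hpccinf.txt template found in $dir" >&2
        return 1
    fi
}
```

Calling prepare_hpcc_input with no argument in the run directory before mpirun would create the input file, which can then be edited as needed.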

Another thing I noticed: in the list of libraries, the FFTW wrapper libraries must come first:
LAlib = -Wl,--start-group $(LAdir)/libfftw2x_cdft_DOUBLE_ilp64.a $(LAdir)/libfftw2xc_intel.a $(LAdir)/libmkl_intel_lp64.a $(LAdir)/libmkl_sequential.a $(LAdir)/libmkl_core.a $(LAdir)/libmkl_blacs_intelmpi_lp64.a $(LAdir)/libmkl_cdft_core.a -Wl,--end-group /opt/intel/lib/intel64/libiomp5.a -lpthread -lm


Best regards,
-Vladimir
Chansup_Byun
Beginner

Hi Vladimir,

I used "hpccmemf.txt" instead of "hpccinf.txt" for input.
Anyway, I rebuilt the binary following your suggestion, but it still failed with the same error at the same spot.

$ mpirun -n 2 hpcc
rank 0 in job 1 compute-0-0_47995 caused collective abort of all ranks
exit status of rank 0: killed by signal 9

From the system log:
Apr 28 08:44:58 compute-0-0 kernel: hpcc[14131]: segfault at fffffffffffffff9 ip 00000000004c3666 sp 00007fffa4ed6000 error 4 in hpcc[400000+16f8000]

Any other suggestions?

Thanks,
- CB
Vladimir_Petrov__Int
New Contributor III

Hi,

Please make sure you run "make clean arch=XXX" (where XXX is your arch) before you rebuild the hpcc binary with the new library order.

Best regards,
-Vladimir