Compile options for using pardiso with large matrices

zysermanfcaglp_unlp_ · ‎06-17-2009

Hi, I have been using PARDISO (included in MKL) to solve complex sparse linear systems.

What I do is:
export OMP_NUM_THREADS=nn   # nn = number of processors to use in pardiso
export MKL_LIB=/opt/intel/mkl/10.1.0.015/lib/em64t

and compile as:

ifort -w -I /opt/intel/mkl/10.1.0.015/include/ gen_mod_compres_1_patch-paralelo.f -L${MKL_LIB} ${MKL_LIB}/libmkl_solver_lp64.a ${MKL_LIB}/libmkl_intel_lp64.a -Wl,--start-group ${MKL_LIB}/libmkl_intel_thread.a ${MKL_LIB}/libmkl_core.a -Wl,--end-group -L${MKL_LIB} -liomp5 -lpthread -lm -o test.out

Now I want to follow the same procedure for a larger matrix (say, a 1000x1000 coefficient matrix).
When I compile, I get the following:

gen_mod_compres_1_patch-paralelo.f(336): (col. 12) remark: LOOP WAS VECTORIZED.
gen_mod_compres_1_patch-paralelo.f(349): (col. 12) remark: PARTIAL LOOP WAS VECTORIZED.
gen_mod_compres_1_patch-paralelo.f(349): (col. 12) remark: PARTIAL LOOP WAS VECTORIZED.
gen_mod_compres_1_patch-paralelo.f(428): (col. 7) remark: PERMUTED LOOP WAS VECTORIZED.
gen_mod_compres_1_patch-paralelo.f(1729): (col. 7) remark: PARTIAL LOOP WAS VECTORIZED.
gen_mod_compres_1_patch-paralelo.f(1732): (col. 7) remark: LOOP WAS VECTORIZED.
gen_mod_compres_1_patch-paralelo.f(1745): (col. 13) remark: LOOP WAS VECTORIZED.
gen_mod_compres_1_patch-paralelo.f(1745): (col. 13) remark: LOOP WAS VECTORIZED.
gen_mod_compres_1_patch-paralelo.f(1838): (col. 26) remark: LOOP WAS VECTORIZED.
gen_mod_compres_1_patch-paralelo.f(1838): (col. 26) remark: LOOP WAS VECTORIZED.
gen_mod_compres_1_patch-paralelo.f(1847): (col. 26) remark: LOOP WAS VECTORIZED.
gen_mod_compres_1_patch-paralelo.f(1847): (col. 26) remark: LOOP WAS VECTORIZED.
gen_mod_compres_1_patch-paralelo.f(1857): (col. 26) remark: LOOP WAS VECTORIZED.
gen_mod_compres_1_patch-paralelo.f(1857): (col. 26) remark: LOOP WAS VECTORIZED.
gen_mod_compres_1_patch-paralelo.f(1957): (col. 7) remark: LOOP WAS VECTORIZED.
gen_mod_compres_1_patch-paralelo.f(1961): (col. 7) remark: LOOP WAS VECTORIZED.
gen_mod_compres_1_patch-paralelo.f(1894): (col. 10) remark: LOOP WAS VECTORIZED.
gen_mod_compres_1_patch-paralelo.f(1911): (col. 7) remark: LOOP WAS VECTORIZED.
gen_mod_compres_1_patch-paralelo.f(1689): (col. 7) remark: PERMUTED LOOP WAS VECTORIZED.
gen_mod_compres_1_patch-paralelo.f(1411): (col. 7) remark: PARTIAL LOOP WAS VECTORIZED.
gen_mod_compres_1_patch-paralelo.f(1411): (col. 7) remark: PARTIAL LOOP WAS VECTORIZED.
gen_mod_compres_1_patch-paralelo.f(1386): (col. 7) remark: LOOP WAS VECTORIZED.
/tmp/ifortnTonDl.o: In function `MAIN__':
gen_mod_compres_1_patch-paralelo.f:(.text+0x3c38): relocation truncated to fit: R_X86_64_32S against `.bss'
gen_mod_compres_1_patch-paralelo.f:(.text+0x3ca8): relocation truncated to fit: R_X86_64_32S against `.bss'
gen_mod_compres_1_patch-paralelo.f:(.text+0x3d18): relocation truncated to fit: R_X86_64_32S against `.bss'
gen_mod_compres_1_patch-paralelo.f:(.text+0x3d88): relocation truncated to fit: R_X86_64_PC32 against `.bss'
gen_mod_compres_1_patch-paralelo.f:(.text+0x47d9): relocation truncated to fit: R_X86_64_32S against `.bss'
gen_mod_compres_1_patch-paralelo.f:(.text+0x4849): relocation truncated to fit: R_X86_64_32S against `.bss'
gen_mod_compres_1_patch-paralelo.f:(.text+0x48b9): relocation truncated to fit: R_X86_64_32S against `.bss'
gen_mod_compres_1_patch-paralelo.f:(.text+0x4929): relocation truncated to fit: R_X86_64_PC32 against `.bss'
gen_mod_compres_1_patch-paralelo.f:(.text+0x4c7e): relocation truncated to fit: R_X86_64_32S against `.bss'
gen_mod_compres_1_patch-paralelo.f:(.text+0x4ccc): relocation truncated to fit: R_X86_64_32S against `.bss'
gen_mod_compres_1_patch-paralelo.f:(.text+0x4d1a): additional relocation overflows omitted from the output

Now, if I add
-mcmodel=large -i-dynamic
as options when I compile, I get:
gen_mod_compres_1_patch-paralelo.f(336): (col. 12) remark: LOOP WAS VECTORIZED.
gen_mod_compres_1_patch-paralelo.f(349): (col. 12) remark: PARTIAL LOOP WAS VECTORIZED.
gen_mod_compres_1_patch-paralelo.f(349): (col. 12) remark: PARTIAL LOOP WAS VECTORIZED.
gen_mod_compres_1_patch-paralelo.f(428): (col. 7) remark: PERMUTED LOOP WAS VECTORIZED.
gen_mod_compres_1_patch-paralelo.f(1729): (col. 7) remark: PARTIAL LOOP WAS VECTORIZED.
gen_mod_compres_1_patch-paralelo.f(1732): (col. 7) remark: LOOP WAS VECTORIZED.
gen_mod_compres_1_patch-paralelo.f(1745): (col. 13) remark: LOOP WAS VECTORIZED.
gen_mod_compres_1_patch-paralelo.f(1745): (col. 13) remark: LOOP WAS VECTORIZED.
gen_mod_compres_1_patch-paralelo.f(1838): (col. 26) remark: LOOP WAS VECTORIZED.
gen_mod_compres_1_patch-paralelo.f(1838): (col. 26) remark: LOOP WAS VECTORIZED.
gen_mod_compres_1_patch-paralelo.f(1847): (col. 26) remark: LOOP WAS VECTORIZED.
gen_mod_compres_1_patch-paralelo.f(1847): (col. 26) remark: LOOP WAS VECTORIZED.
gen_mod_compres_1_patch-paralelo.f(1857): (col. 26) remark: LOOP WAS VECTORIZED.
gen_mod_compres_1_patch-paralelo.f(1857): (col. 26) remark: LOOP WAS VECTORIZED.
gen_mod_compres_1_patch-paralelo.f(1957): (col. 7) remark: LOOP WAS VECTORIZED.
gen_mod_compres_1_patch-paralelo.f(1961): (col. 7) remark: LOOP WAS VECTORIZED.
gen_mod_compres_1_patch-paralelo.f(1894): (col. 10) remark: LOOP WAS VECTORIZED.
gen_mod_compres_1_patch-paralelo.f(1911): (col. 7) remark: LOOP WAS VECTORIZED.
gen_mod_compres_1_patch-paralelo.f(1689): (col. 7) remark: PERMUTED LOOP WAS VECTORIZED.
gen_mod_compres_1_patch-paralelo.f(1411): (col. 7) remark: PARTIAL LOOP WAS VECTORIZED.
gen_mod_compres_1_patch-paralelo.f(1411): (col. 7) remark: PARTIAL LOOP WAS VECTORIZED.
gen_mod_compres_1_patch-paralelo.f(1386): (col. 7) remark: LOOP WAS VECTORIZED.
/opt/intel/mkl/10.1.0.015/lib/em64t/libmkl_core.a(c_amuxy_res_lp64.o): In function `mkl_pds_lp64_c_amuxy_res_pardiso':
__tmp_lp64_c_amuxy_res.f:(.text+0x4b9): relocation truncated to fit: R_X86_64_PC32 against `.bss'
__tmp_lp64_c_amuxy_res.f:(.text+0xb1c): relocation truncated to fit: R_X86_64_PC32 against `.bss'
__tmp_lp64_c_amuxy_res.f:(.text+0x3ac7): relocation truncated to fit: R_X86_64_PC32 against `.bss'
/opt/intel/mkl/10.1.0.015/lib/em64t/libmkl_core.a(amuxy_res_lp64.o): In function `mkl_pds_lp64_amuxy_res_pardiso':
__tmp_lp64_amuxy_res.f:(.text+0x4cc): relocation truncated to fit: R_X86_64_PC32 against `.bss'
__tmp_lp64_amuxy_res.f:(.text+0xb26): relocation truncated to fit: R_X86_64_PC32 against `.bss'
__tmp_lp64_amuxy_res.f:(.text+0x2b27): relocation truncated to fit: R_X86_64_PC32 against `.bss'
/opt/intel/mkl/10.1.0.015/lib/em64t/libmkl_core.a(mkl_msg_support.o): In function `mkl_serv_mkl_get_msg':
__tmp_mkl_msg_support.c:(.text+0xa3): relocation truncated to fit: R_X86_64_PC32 against `get_msg_buf'
__tmp_mkl_msg_support.c:(.text+0xb2): relocation truncated to fit: R_X86_64_PC32 against `get_msg_buf'
/opt/intel/mkl/10.1.0.015/lib/em64t/libmkl_core.a(mkl_msg_support.o): In function `message_catalog_open':
__tmp_mkl_msg_support.c:(.text+0x155): relocation truncated to fit: R_X86_64_PC32 against `message_catalog'
__tmp_mkl_msg_support.c:(.text+0x1d9): relocation truncated to fit: R_X86_64_PC32 against `message_catalog'
/opt/intel/mkl/10.1.0.015/lib/em64t/libmkl_core.a(mkl_msg_support.o): In function `message_catalog_get_text':
__tmp_mkl_msg_support.c:(.text+0x1f5): additional relocation overflows omitted from the output

Of course, the executable file is not created.
What is going wrong, and what can I do to solve this problem?

zyserman · ‎06-18-2009

I managed to solve this problem myself. It was just a matter of linking against the ILP64 libraries instead of the LP64 ones; that is,
64-bit integers are needed.
By the way, I found this out by using the link line tool offered in this forum. Many thanks for it!
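
For anyone hitting the same error, here is a rough sketch of what the changed link line might look like. It is not the exact command used here. It takes the original command and swaps the two lp64 libraries for their ilp64 counterparts, assuming the MKL installation ships them under those names. The -i8 flag (64-bit default integers in ifort) is an assumption too: it is what is normally paired with the ilp64 interface so that the integer arguments passed to pardiso match, but check it against your own code. The script only prints the command (a dry run); run the printed line to actually build.

```shell
#!/bin/sh
# Dry run: build the ILP64 link line and print it instead of executing it.
MKL_ROOT=/opt/intel/mkl/10.1.0.015
MKL_LIB=${MKL_ROOT}/lib/em64t

# Assumption: -i8 makes default INTEGERs 64-bit to match the ILP64 interface.
LINK_LINE="ifort -w -i8 -I ${MKL_ROOT}/include/ gen_mod_compres_1_patch-paralelo.f \
${MKL_LIB}/libmkl_solver_ilp64.a ${MKL_LIB}/libmkl_intel_ilp64.a \
-Wl,--start-group ${MKL_LIB}/libmkl_intel_thread.a ${MKL_LIB}/libmkl_core.a -Wl,--end-group \
-L${MKL_LIB} -liomp5 -lpthread -lm -o test.out"

echo "$LINK_LINE"
```

If the source declares its integers explicitly (for example INTEGER*4), -i8 will not change them, and the arguments passed to pardiso would have to be widened by hand.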

Fabio Zyserman