Relocation error - please, help!

dmitry424 · 02-04-2011

Hi everybody,

I compile my code with the following bash script:

LIB_PATH=/opt/intel/Compiler/11.1/073/mkl/lib/em64t/

INCLUDE_PATH=/opt/intel/Compiler/11.1/073/mkl/include/

ifort -w -c $INCLUDE_PATH"mkl_dfti.f90" -o mkl_dfti.o

ifort -static mkl_dfti.o ./my_code.f90 -L$LIB_PATH -Wl,--start-group $LIB_PATH"libmkl_intel_lp64.a" $LIB_PATH"libmkl_intel_thread.a" $LIB_PATH"libmkl_core.a" -Wl,--end-group -liomp5 -o ./my_code.sh

and have this output:

/tmp/ifortSHuua2.o: In function `MAIN__':

./my_code.f90:(.text+0x86): relocation truncated to fit: R_X86_64_PC32 against `function_calculations$STRIDES_IN.0.3'

./my_code.f90:(.text+0x8e): relocation truncated to fit: R_X86_64_PC32 against `function_calculations$STRIDES_OUT.0.3'

./my_code.f90:(.text+0x108): relocation truncated to fit: R_X86_64_PC32 against `function_calculations$LENGTHS.0.3'

./my_code.f90:(.text+0x10f): relocation truncated to fit: R_X86_64_PC32 against `function_calculations$LENGTHS.0.3'

./my_code.f90:(.text+0x116): relocation truncated to fit: R_X86_64_PC32 against `function_calculations$LENGTHS.0.3'

./my_code.f90:(.text+0xd32): relocation truncated to fit: R_X86_64_32S against `function_calculations$var$104.0.3'

./my_code.f90:(.text+0xd49): relocation truncated to fit: R_X86_64_32S against `function_calculations$var$104.0.3'

./my_code.f90:(.text+0xd6a): relocation truncated to fit: R_X86_64_32S against `function_calculations$var$108.0.3'

./my_code.f90:(.text+0xd7d): relocation truncated to fit: R_X86_64_32S against `function_calculations$var$108.0.3'

./my_code.f90:(.text+0xdba): relocation truncated to fit: R_X86_64_32S against `function_calculations$var$100.0.3'

./my_code.f90:(.text+0xdec): additional relocation overflows omitted from the output

If I add -openmp, the build succeeds, but the program later crashes with a segmentation fault. Adding "-mcmodel=large -shared-intel" or "-mcmodel=medium -shared-intel" doesn't change the situation at all. When I change -static to -i_dynamic, I get:

./my_code.sh: error while loading shared libraries: libiomp5.so: cannot open shared object file: No such file or directory

and

export LD_LIBRARY_PATH=/opt/intel/Compiler/11.1/073/mkl/lib/em64t:; ./my_code.sh

doesn't help.
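A quick way to check why this export has no effect is to look for the missing file in each directory listed on LD_LIBRARY_PATH. The following is a minimal sketch, not part of the original script; the helper name find_lib is hypothetical.

```shell
#!/usr/bin/env bash
# find_lib NAME PATHLIST
# Print every directory in the colon-separated PATHLIST that contains NAME.
# (find_lib is a hypothetical helper name used only for illustration.)
find_lib() {
    local name=$1 pathlist=$2 dir
    local IFS=:                      # split the list on colons
    for dir in $pathlist; do
        [ -n "$dir" ] || continue    # skip empty entries such as "a::b"
        if [ -e "$dir/$name" ]; then
            printf '%s\n' "$dir"
        fi
    done
}

# Look for the library named in the loader error message.
find_lib libiomp5.so "${LD_LIBRARY_PATH:-}"
```

If this prints nothing, none of the listed directories contains libiomp5.so, and the directory that does hold it has to be added to LD_LIBRARY_PATH.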

In my code I use large 3D arrays and the MKL library (Intel fast Fourier transform). I also use EQUIVALENCE statements (between 3D and 1D arrays, for these large arrays), so, if I understand correctly, allocatable variables will not work. I have similar code without EQUIVALENCE statements; compiled with the same bash script, it works with ~20 GB of RAM without any problems.

Could you please tell me whether it is possible to solve the problem without rewriting my code (in particular, without removing EQUIVALENCE)?

Thank you very much in advance!

dmitry424 · 02-05-2011

P.S. Below is how my code initially looked (with 3D arrays). It is fast and readable, but doesn't work with npt>=256 (I need the equivalent 1D arrays Bx_1D, By_1D, Bz_1D to work with the 3D FFT).

program prog

Use MKL_DFTI

integer, parameter :: npt = 128

FUNCTION_RESULT = function_calculations(...);

contains

function function_calculations(...)

IMPLICIT NONE

real(DP) :: Bx(npt,npt,npt), By(npt,npt,npt), Bz(npt,npt,npt)

real(DP) :: Bx_1D(npt**3), By_1D(npt**3), Bz_1D(npt**3)

equivalence (Bx_1D, Bx);

equivalence (By_1D, By);

equivalence (Bz_1D, Bz);

CODE

end function

end program

I rewrote the above code to treat the 3D arrays as 1D arrays. It doesn't use EQUIVALENCE; it is slower with npt=128 and significantly less readable, but it works with npt=256, 512, etc.

program prog

Use MKL_DFTI

integer, parameter :: npt=128

FUNCTION_RESULT = function_calculations(npt, ...);

contains

function function_calculations(npt, ...)

IMPLICIT NONE

integer :: npt ...

real(DP) :: Bx(npt**3), By(npt**3), Bz(npt**3)

CODE

end function

end program

Interestingly, here I define "npt" twice: as a global parameter, and then again as an argument inside "function_calculations". I can remove the "npt" definition in "function_calculations" and call the function in the same style as in the 3D code, "FUNCTION_RESULT = function_calculations(...)", but this leads to the same relocation error I described in my initial post in this thread. That is, if I remove the "npt" definition inside "function_calculations", I cannot compile this code with npt>=256.

Could you please tell me whether it is possible to compile the initial 3D code correctly, without relocation errors or segmentation faults?

mecej4 · 02-05-2011

It is difficult, if not impossible, to help when you provide only fragments of source code and leave out compiler and linker invocation command lines. For example, we know nothing about what is in my_code.f90 and my_code.sh.

When running shell scripts, it is helpful while debugging to add the -x option.

dmitry424 · 02-05-2011

mecej4,

Thanks for your reply. The entire compilation script is in my initial post in this thread. "my_code.sh" is the output of the ifort compiler (i.e., my aim is to build the executable "my_code.sh" and then run it as "./my_code.sh").

Concerning my code, OK, let's start from this simple example:

program prog

integer, parameter :: DP = kind(0.0D0)

real(DP), parameter :: pi = 3.141592653589793238_DP

integer, parameter :: npt = 512

real(DP) :: x(npt), tspan, dx

integer :: jx, jy, jz

real(DP) :: Bx(npt,npt,npt), By(npt,npt,npt)

real(DP) :: Yx(npt,npt,npt), Yy(npt,npt,npt), Yz(npt,npt,npt)

real(DP) :: Vx(npt,npt,npt), Vy(npt,npt,npt), Vz(npt,npt,npt)

real(DP) :: Vx_1D(npt**3), Vy_1D(npt**3), Vz_1D(npt**3)

equivalence (Vx_1D, Vx);

equivalence (Vy_1D, Vy);

equivalence (Vz_1D, Vz);

tspan = 2*pi;

dx = tspan/npt;

do j = 1,npt

x(j) = (tspan/npt)*( j - 1 - floor(0.5_DP*npt) );

end do

do jz = 1,npt

do jy = 1,npt

do jx = 1,npt

Bx(jx, jy, jz) = sin(x(jz));

By(jx, jy, jz) = sin(x(jx));

end do

end do

end do

call Derivatives(Bx,dx,Vx,Vy,Vz);

call Derivatives(By,dx,Yx,Yy,Yz);

print *, "All done!"

contains

subroutine Derivatives(X,dx,Xx,Xy,Xz)

IMPLICIT NONE

real(DP) :: X(npt,npt,npt), Xx(npt,npt,npt), Xy(npt,npt,npt), Xz(npt,npt,npt), dx

Xx(2:npt-1,1:npt,1:npt) = (X(3:npt,1:npt,1:npt)-X(1:npt-2,1:npt,1:npt))/(2*dx);

Xx(1,1:npt,1:npt) = (X(2,1:npt,1:npt)-X(npt,1:npt,1:npt))/(2*dx);

Xx(npt,1:npt,1:npt) = (X(1,1:npt,1:npt)-X(npt-1,1:npt,1:npt))/(2*dx);

Xy(1:npt,2:npt-1,1:npt) = (X(1:npt,3:npt,1:npt)-X(1:npt,1:npt-2,1:npt))/(2*dx);

Xy(1:npt,1,1:npt) = (X(1:npt,2,1:npt)-X(1:npt,npt,1:npt))/(2*dx);

Xy(1:npt,npt,1:npt) = (X(1:npt,1,1:npt)-X(1:npt,npt-1,1:npt))/(2*dx);

Xz(1:npt,1:npt,2:npt-1) = (X(1:npt,1:npt,3:npt)-X(1:npt,1:npt,1:npt-2))/(2*dx);

Xz(1:npt,1:npt,1) = (X(1:npt,1:npt,2)-X(1:npt,1:npt,npt))/(2*dx);

Xz(1:npt,1:npt,npt) = (X(1:npt,1:npt,1)-X(1:npt,1:npt,npt-1))/(2*dx);

end subroutine

end program

Compilation with

LIB_PATH=/opt/intel/Compiler/11.1/073/mkl/lib/em64t/

INCLUDE_PATH=/opt/intel/Compiler/11.1/073/mkl/include/

ifort -i8 -w -c $INCLUDE_PATH"mkl_dfti.f90" -o mkl_dfti.o

ifort -i8 -static mkl_dfti.o ./my_code.f90 -L$LIB_PATH -Wl,--start-group $LIB_PATH"libmkl_intel_ilp64.a" $LIB_PATH"libmkl_intel_thread.a" $LIB_PATH"libmkl_core.a" -Wl,--end-group -o ./my_code.sh

gives

/tmp/ifort7fwQWw.o: In function `MAIN__':

./my_code.f90:(.text+0x15c): relocation truncated to fit: R_X86_64_32 against `.bss'

./my_code.f90:(.text+0x17b): relocation truncated to fit: R_X86_64_32 against `.bss'

./my_code.f90:(.text+0x180): relocation truncated to fit: R_X86_64_32 against `.bss'

./my_code.f90:(.text+0x186): relocation truncated to fit: R_X86_64_32 against `.bss'

/opt/intel/Compiler/11.1/073/lib/intel64/libifcore.a(for_init.o): In function `for__signal_handler':

for_init.c:(.text+0xec): relocation truncated to fit: R_X86_64_PC32 against `for__protect_handler_ops'

for_init.c:(.text+0x117): relocation truncated to fit: R_X86_64_PC32 against `for__protect_handler_ops'

for_init.c:(.text+0x131): relocation truncated to fit: R_X86_64_PC32 against symbol `for__l_excpt_info' defined in .bss section in /opt/intel/Compiler/11.1/073/lib/intel64/libifcore.a(for_init.o)

for_init.c:(.text+0x14b): relocation truncated to fit: R_X86_64_PC32 against symbol `for__l_fpe_mask' defined in .bss section in /opt/intel/Compiler/11.1/073/lib/intel64/libifcore.a(for_init.o)

for_init.c:(.text+0x3a7): relocation truncated to fit: R_X86_64_PC32 against symbol `for__l_excpt_info' defined in .bss section in /opt/intel/Compiler/11.1/073/lib/intel64/libifcore.a(for_init.o)

for_init.c:(.text+0x3cd): relocation truncated to fit: R_X86_64_PC32 against symbol `for__l_excpt_info' defined in .bss section in /opt/intel/Compiler/11.1/073/lib/intel64/libifcore.a(for_init.o)

for_init.c:(.text+0x3fc): additional relocation overflows omitted from the output

When I add "-mcmodel=medium -shared-intel" to the last "ifort" line, I get

/opt/intel/Compiler/11.1/073/lib/intel64/libifcore.a(for_init.o): In function `for__signal_handler':

for_init.c:(.text+0xec): relocation truncated to fit: R_X86_64_PC32 against `for__protect_handler_ops'

for_init.c:(.text+0x117): relocation truncated to fit: R_X86_64_PC32 against `for__protect_handler_ops'

for_init.c:(.text+0x131): relocation truncated to fit: R_X86_64_PC32 against symbol `for__l_excpt_info' defined in .bss section in /opt/intel/Compiler/11.1/073/lib/intel64/libifcore.a(for_init.o)

for_init.c:(.text+0x14b): relocation truncated to fit: R_X86_64_PC32 against symbol `for__l_fpe_mask' defined in .bss section in /opt/intel/Compiler/11.1/073/lib/intel64/libifcore.a(for_init.o)

for_init.c:(.text+0x3a7): relocation truncated to fit: R_X86_64_PC32 against symbol `for__l_excpt_info' defined in .bss section in /opt/intel/Compiler/11.1/073/lib/intel64/libifcore.a(for_init.o)

for_init.c:(.text+0x3cd): relocation truncated to fit: R_X86_64_PC32 against symbol `for__l_excpt_info' defined in .bss section in /opt/intel/Compiler/11.1/073/lib/intel64/libifcore.a(for_init.o)

for_init.c:(.text+0x3fc): relocation truncated to fit: R_X86_64_PC32 against symbol `for__l_excpt_info' defined in .bss section in /opt/intel/Compiler/11.1/073/lib/intel64/libifcore.a(for_init.o)

for_init.c:(.text+0x402): relocation truncated to fit: R_X86_64_PC32 against symbol `for__l_undcnt' defined in .bss section in /opt/intel/Compiler/11.1/073/lib/intel64/libifcore.a(for_init.o)

for_init.c:(.text+0x423): relocation truncated to fit: R_X86_64_PC32 against symbol `for__l_excpt_info' defined in .bss section in /opt/intel/Compiler/11.1/073/lib/intel64/libifcore.a(for_init.o)

for_init.c:(.text+0x45c): relocation truncated to fit: R_X86_64_PC32 against symbol `for__l_excpt_info' defined in .bss section in /opt/intel/Compiler/11.1/073/lib/intel64/libifcore.a(for_init.o)

for_init.c:(.text+0x47d): additional relocation overflows omitted from the output

When I additionally add "-openmp", compilation goes smoothly, but the compiled "my_code.sh" crashes with a segmentation fault. The "-i8" option changes nothing.

If I use only one call of the subroutine "Derivatives" (i.e., if I remove the line "call Derivatives(By,dx,Yx,Yy,Yz);" from my code), then with the options "-mcmodel=medium -shared-intel -openmp" the code compiles and "my_code.sh" runs smoothly even up to npt=2048. That is, in fact, a problem: one double-precision real 3D array of size 2048^3 must take 64 GiB of RAM, while I have only 8 GB. So something goes wrong here as well.
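The memory figures in this post can be checked with a little shell arithmetic. This is only a sketch: the helper name array_bytes is made up for illustration, and 8 bytes per element assumes that real(DP) with DP = kind(0.0D0) is an 8-byte double, as it is with ifort on x86-64.

```shell
#!/usr/bin/env bash
# array_bytes NPT
# Bytes needed by one NPT x NPT x NPT array of 8-byte reals.
# (array_bytes is a hypothetical helper name used only for illustration.)
array_bytes() {
    local npt=$1
    echo $(( npt * npt * npt * 8 ))
}

# Report the footprint of a single array for the sizes mentioned in the thread.
for npt in 128 256 512 2048; do
    bytes=$(array_bytes "$npt")
    echo "npt=$npt: $bytes bytes = $(( bytes / 1024 / 1024 )) MiB per array"
done
```

By this count the test program's eight distinct 512^3 arrays (the *_1D arrays share storage with Vx, Vy, Vz through EQUIVALENCE) need about 8 GiB of static data. That is well beyond the 2 GiB that the default small memory model on x86-64 can address, which would be consistent with the "relocation truncated to fit" messages.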

Link Line Advisor gives me

$MKLROOT/libmkl_solver_ilp64.a -Wl,--start-group $MKLROOT/libmkl_intel_ilp64.a $MKLROOT/libmkl_intel_thread.a $MKLROOT/libmkl_core.a -Wl,--end-group -openmp -lpthread

and this doesn't help.

mecej4 · 02-06-2011

For the code in #3, with npt=256, compiling with the command

$ ifort -mcmodel medium -shared-intel my_code.f90

creates an executable that runs to completion. For npt=512, the stack size is slightly over the 8GB of RAM on my machine, so the code does not run. However, raising the virtual memory limit with

$ ulimit -v 16474720

allows the program to run.
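Before raising the limit, it may help to compare the current one with what the program needs. This is a rough sketch that assumes bash, where ulimit -v reports KiB, and it counts only the eight 1 GiB arrays of the npt=512 test program. fits_limit is a hypothetical helper name, and the real requirement may be somewhat higher than this lower bound.

```shell
#!/usr/bin/env bash
# fits_limit LIMIT REQUIRED_KIB
# Succeed if LIMIT (a value printed by `ulimit -v`) allows REQUIRED_KIB KiB.
# (fits_limit is a hypothetical helper name used only for illustration.)
fits_limit() {
    local limit=$1 required=$2
    [ "$limit" = unlimited ] || [ "$limit" -ge "$required" ]
}

# Eight arrays of 512^3 doubles are 8 GiB, expressed here in KiB.
required_kib=$(( 8 * 1024 * 1024 ))

if fits_limit "$(ulimit -v)" "$required_kib"; then
    echo "virtual memory limit allows at least $required_kib KiB"
else
    echo "virtual memory limit is below $required_kib KiB; consider raising it"
fi
```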

For the test program that you gave in #3, I don't see why you need to link in the MKL.

Martyn Corden's comments in this thread may be useful.

dmitry424 · 02-06-2011

Thank you, mecej4!

I need MKL for the fast Fourier transform in my main code; of course, I don't need it for the test example above.

You pointed me in the right direction: I solved my problem (for the main code) by changing static linking to dynamic linking. It seems static linking is not possible in my case (though I don't understand why).

Step 1: I set the library path environment variable. Since the MKL libraries are in /opt/intel/Compiler/11.1/073/mkl/lib/em64t, while the ifort libraries are in /opt/intel/Compiler/11.1/073/lib/intel64, and I need them all, I had to use

export LD_LIBRARY_PATH=/opt/intel/Compiler/11.1/073/mkl/lib/em64t:/opt/intel/Compiler/11.1/073/lib/intel64
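Once the executable is linked dynamically, ldd can show whether this setting resolves every shared library before the program is run. This is a small sketch that assumes the executable is called my_code.sh, as elsewhere in this thread; missing_libs is a hypothetical helper that filters ldd output.

```shell
#!/usr/bin/env bash
# missing_libs
# Read `ldd` output on stdin and print the names of libraries
# that the dynamic loader reports as "not found".
# (missing_libs is a hypothetical helper name used only for illustration.)
missing_libs() {
    awk '/not found/ { print $1 }'
}

# Only inspect the executable if it exists in the current directory.
if [ -x ./my_code.sh ]; then
    ldd ./my_code.sh | missing_libs
fi
```

An empty result suggests that every shared library was found on the current search path.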

Step 2: I asked the Intel Math Kernel Library Link Line Advisor what my options are for dynamic linking. It gave me

-L$MKLROOT $MKLROOT/libmkl_solver_ilp64.a -Wl,--start-group -lmkl_intel_ilp64 -lmkl_intel_thread -lmkl_core -Wl,--end-group -openmp -lpthread

I don't know why, but the "-openmp" option consistently leads to a segmentation fault, so I had to replace it with "-liomp5". I now successfully compile my code with the following bash script:

MKLROOT=/opt/intel/Compiler/11.1/073/mkl/lib/em64t/

INCLUDE_PATH=/opt/intel/Compiler/11.1/073/mkl/include/

ifort -w -c $INCLUDE_PATH"mkl_dfti.f90" -o mkl_dfti.o

ifort -xHOST -O2 -mcmodel=medium -shared-intel mkl_dfti.o ./my_code.f90 -L$MKLROOT $MKLROOT"libmkl_solver_ilp64.a" -Wl,--start-group -lmkl_intel_ilp64 -lmkl_intel_thread -lmkl_core -Wl,--end-group -liomp5 -lpthread -o ./my_code.sh

Again, thank you very much!

[SOLVED] Relocation error - please, help!