performance degradation: Dynamic vs Static linking of MKL

Handsome_T · ‎02-15-2012

Hello,

We have a Fortran code that uses Pardiso. Everything was fine until we had to increase the size of the main matrices that are being solved. Unfortunately the old way of compiling/linking the code started issuing 'truncation' errors, similar to this:

relocation truncated to fit: R_X86_64_PC32 against symbol `message_catalog' defined in COMMON section

We were told that we needed to use dynamic linking and the '-shared-intel' option to fix the memory problem above.

Since then, performance has taken a huge hit. Before, all 8 cores on our machine were used, but now we see only 2 to 6 cores being used.

So my main question: is this expected and non-fixable, or are we doing something wrong?

Detail:

We are using Intel Compiler 11.0, which seems to include version 10.1 of MKL.

This is how we statically compiled our code:

ifort foo.f -O -tune host -L/path/to/mkl/em64t /path/to/em64t/libmkl_solver.a /path/to/mkl/em64t/libmkl_lapack.a /path/to/fce/lib/libguide.a -lmkl_em64t -lpthread

The new way (causing reduced performance) is:

ifort foo.f -mcmodel=large -shared-intel -O -tune host -lmkl_core -lguide -lmkl_intel_lp64 -lmkl -lmkl_solver -lmkl_lapack -lguide -lmkl_em64t -lpthread

where LD_LIBRARY_PATH includes /path/to/Compiler/11.0/081/mkl/lib/em64t

I should add that I've tried the 'Link Advisor' too, but the resulting executable 'seg faults'!

Any hint/input regarding this issue is greatly appreciated,

thanks in advance

TimP · ‎02-15-2012

libguide was obsoleted before ifort 11.0 was issued, so you should be using -liomp5 and MKL which came with the compiler. There should be no libguide or libmkl_lapack in that MKL. Perhaps you are mixing files from various older versions. Link advisor would not be relevant to an MKL which came much earlier.
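One way to check for mixed versions is to look at which MKL and OpenMP runtime libraries the dynamically linked executable actually resolves at load time. The following is only a minimal sketch: the helper name mkl_runtime_libs is hypothetical (it is not an Intel tool), and it simply filters the output of ldd.

```shell
# mkl_runtime_libs: hypothetical helper (not part of MKL or the compiler).
# Reads `ldd` output on stdin and keeps only the lines that mention
# MKL or one of the Intel OpenMP runtimes (libguide, libiomp5).
mkl_runtime_libs() {
    grep -E 'libmkl|libguide|libiomp5' || true
}

# Typical use (a.out stands in for the real executable):
#   ldd ./a.out | mkl_runtime_libs
```

If both libguide and libiomp5 show up, or the MKL paths point outside the compiler's own MKL directory, that would be consistent with files from different versions being mixed.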

Gennady_F_Intel · ‎02-15-2012

- As Tim already indicated, you need to replace -lguide with -liomp5, and

- I don't see -lmkl_intel_thread in this link line. Please add it and check again how it works.

Please refer to the Link Line Advisor again.
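As a rough illustration of the two checks above, a small shell function can scan a link line for a libguide reference and for a missing -lmkl_intel_thread. This is only a sketch; check_link_line is a hypothetical name and the function is no substitute for the advisor itself.

```shell
# check_link_line: hypothetical helper, not an Intel tool.
# Prints one warning per problem found in the link line passed as $1
# and prints nothing if neither problem is detected.
check_link_line() {
    line=" $1 "   # pad with spaces so whole options can be matched
    case "$line" in
        *" -lguide "*|*libguide.a*)
            echo "warning: libguide is referenced; this thread suggests -liomp5 instead" ;;
    esac
    case "$line" in
        *" -lmkl_intel_thread "*) ;;
        *) echo "warning: no -lmkl_intel_thread in the link line" ;;
    esac
}

# Example, using the options from the first post:
#   check_link_line "-mcmodel=large -shared-intel -O -tune host -lmkl_core -lguide -lmkl_intel_lp64 -lmkl -lmkl_solver -lmkl_lapack -lguide -lmkl_em64t -lpthread"
```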

--Gennady

Handsome_T · ‎02-19-2012

Thanks for your comments.

I tried recompiling using following options (still Intel Compiler 11.0):

ifort -mcmodel=large -shared-intel -O2 -tune host -lmkl -lmkl_em64t -liomp5 -lmkl_intel_thread foo.f

The resulting executable is still slower than the one I get using static linking!

I also managed to compile it on a machine with Intel Composer XE 2011, and its performance matched that of the static linking. The only flag I used for Composer was

-mkl=parallel

I really don't know what to do at this point.

thanks again,

Gennady_F_Intel · ‎02-19-2012

Well, Composer XE 2011 contains MKL version 10.3 (see the link here for getting this version), and setting -mkl=parallel allows you to link with the standard threaded Intel MKL; see the link to the Link Line Advisor at the top of the forum.

If you see real degradation with the 10.3 version too, then how large is that degradation?

What is the size of the input?

It would be best if you could give us a test case so that we can check the problem on our side.
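When measuring the degradation, it may help to run both builds with the same fixed thread count, so that the comparison is not affected by the runtime choosing fewer threads. A minimal sketch, assuming the MKL and OpenMP runtime in use honor the OMP_NUM_THREADS, MKL_NUM_THREADS and MKL_DYNAMIC environment variables; run_with_threads, foo_static and foo_dynamic are hypothetical names.

```shell
# run_with_threads: hypothetical wrapper.
# Usage: run_with_threads N command [args...]
# Runs the command with the OpenMP and MKL thread counts set to N and
# with MKL's dynamic thread adjustment switched off for that run only.
run_with_threads() {
    n="$1"
    shift
    OMP_NUM_THREADS="$n" MKL_NUM_THREADS="$n" MKL_DYNAMIC=FALSE "$@"
}

# Example (under bash, where `time` is a shell keyword):
#   time run_with_threads 8 ./foo_static
#   time run_with_threads 8 ./foo_dynamic
```

The two wall-clock times, together with the matrix size, would give a concrete figure for the level of degradation.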

--Gennady