Intel® oneAPI Math Kernel Library
Ask questions and share information with other developers who use Intel® Math Kernel Library.

Intel MKL 11.0, static link to mkl_core.lib, more than 300 MB?

Zhanghong_T_
Novice
795 Views

Dear all,

I had a chance to test the latest MKL 11.0 on Windows 7 64-bit + VS2010 + Intel Composer XE 2013. I need to call PARDISO. To my surprise, the link libraries set in my project could not be found, so I had to link mkl_core.lib. After waiting several minutes, I got the statically linked library, and it is more than 300 MB. So terrible! Why doesn't the build link only the needed objects? Or is my project setting incorrect?

In addition, it seems that with the latest MKL 11.0 + Intel Composer XE 2013, the generated executable is much slower than with MKL 10.x + Intel Composer XE 2012. What's wrong?

Thanks,

Zhanghong Tang

0 Kudos
16 Replies
mecej4
Honored Contributor III
Your post contains reports of many problems, but little information that can be used to troubleshoot them. How did you compile? What compiler options were in effect? Do you have a build log that you can post here? For instance, you wrote, "or I don't have a correct project setting?" How is it possible to answer that question without knowing what project settings you used, either explicitly or by inheritance from configuration files? Secondly, when you wrote "the generated executable file is much slower than MKL 10.x + Intel Composer XE 2012," what was the source code that you used to reach this conclusion?
Zhanghong_T_
Novice
Hi,

Thank you very much for your kind reply. The build log is as follows:

1>------ Build started: Project: solver, Configuration: Release x64 ------
2>------ Build started: Project: randgen, Configuration: Release x64 ------
1>Compiling with Intel(R) Visual Fortran Compiler XE 13.0.0.089 [Intel(R) 64]...
1>solver.f90
2>Build started 2012/10/26 23:51:42.
2>InitializeBuildStatus:
2> Creating "x64\Release\randgen.unsuccessfulbuild" because "AlwaysCreate" was specified.
2>ClCompile:
2> randomgen.c
2>C:\Program Files (x86)\MSBuild\Microsoft.Cpp\v4.0\Microsoft.CppBuild.targets(1151,5): warning MSB8012: TargetPath(D:\solver\x64\Release\randgen.lib) does not match the Library's OutputFile property value (D:\solver\lib64\randgen.lib). This may cause your project to build incorrectly. To correct this, please make sure that $(OutDir), $(TargetName) and $(TargetExt) property values match the value specified in %(Lib.OutputFile).
2>Lib:
2> randgen.vcxproj -> D:\solver\x64\Release\randgen.lib
2>FinalizeBuildStatus:
2> Deleting file "x64\Release\randgen.unsuccessfulbuild".
2> Touching "x64\Release\randgen.lastbuildstate".
2>
2>Build succeeded.
2>
2>Time Elapsed 00:00:01.69
1>Creating library...
1>randgen.lib(WELL1024a.obj) : MSIL .netmodule or module compiled with /GL found; restarting link with /LTCG; add /LTCG to the link command line to improve linker performance
1>
1>Build log written to "file://D:\solver\x64\Release\BuildLog.htm"
1>solver - 0 error(s), 0 warning(s)
========== Build: 2 succeeded, 0 failed, 0 up-to-date, 0 skipped ==========

The project settings are:

/nologo /O2 /fpp /I"C:\Program Files (x86)\Intel\Composer XE 2013\mkl\include" /DWIN64 /Qsave /module:"x64\Release/" /object:"x64\Release/" /Fd"x64\Release\vc100.pdb" /libs:static /threads /c

/OUT:"solver.lib" /LIBPATH:".\lib64" /LIBPATH:"C:\Program Files (x86)\Intel\Composer XE 2013\mkl\lib\intel64" /NOLOGO randgen.lib mkl_intel_lp64.lib mkl_sequential.lib mkl_core.lib

The source code just calls PARDISO to solve a large sparse matrix that comes from FEM. It is my impression, not real timing data, that 11.0 is slower :)

Thanks,
Zhanghong Tang
mecej4
Honored Contributor III
If you can provide the source code (preferably a small example that displays the problem), it would be easier to track down the problems.
Zhanghong_T_
Novice
Hi mecej4,

Thank you very much for your kind reply. Building the attached project on Windows 7 64-bit + VS2010 + MKL 11.0 + Intel Composer XE 2013 produces a statically linked library of about 300 MB.

Thanks,
Zhanghong Tang
mecej4
Honored Contributor III
Thanks for posting the Zip file with the project files. Your project seeks to create a static library from the MKL example pardiso_sym_f.f, with the PROGRAM statement replaced by SUBROUTINE and with no arguments. I do not understand why you want to do this:

(i) what do you propose to do with the library?
(ii) why do you want a static library?

Note that a self-contained static library containing the runtime libraries for Fortran and MKL will be huge -- essentially the sum of the sizes of all the Intel-provided libraries provided for a particular architecture.

For reference: I created a DLL from your source file using the command

ifort /Qmkl /fast /LD pardiso_sym_f.f

and the resulting DLL sizes were:

Ifort 12.1.5.344, Build 20120612, X64: 14,848 bytes
Ifort 13.0.1.119, Build 20121008, X64: 15,872 bytes
Zhanghong_T_
Novice
Hi mecej4,

Thank you very much for your kind reply. I posted this example only to show that the problem can be reproduced. I noticed that with 10.x, the size of the lib file is only about 8 MB. I am used to static libraries rather than dynamic ones, because a program linked statically is not affected by later changes to other components.

Thanks,
Zhanghong Tang
Zhanghong_T_
Novice
Dear all,

I noticed that update 1 of 11.0 is ready. Does this updated version solve the large size problem introduced by static linking?

Thanks,
Zhanghong Tang
Gennady_F_Intel
Moderator
No, the size of the static library would be the same as you have seen with version 11.0. When you create a static library from MKL’s static libraries, this is the expected result.
Zhanghong_T_
Novice
Hi Gennady,

Thank you very much for your kind reply. So you mean that this behavior will remain from 11.0 onward? The size of the static library is only about 8 MB if I link the code with the 10.x version.

Thanks,
Zhanghong Tang
junziyang
Beginner

I also found that MKL 11.0 with ivf2013 is much slower than MKL 10.x with ivf2011.

I used it to compile Fortran MEX files on Windows 8 with MATLAB 2012b x64, using the following option settings:

set COMPILER=ifort
set COMPFLAGS=/fpp /Qprec /I"%MATLAB%/extern/include" /c /nologo /free /fp:source /MT /assume:bscc
set OPTIMFLAGS=/O3 /DNDEBUG /QxHost /Qvec-report /Qftz
set DEBUGFLAGS=/Z7
set NAME_OBJECT=/Fo

In the MEX file, only the routine MATMUL is used.

For MKL 10.x with ivf2011, the execution time is ~348 s, while for MKL 11.0 with ivf2013, the execution time is ~518 s.
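
To put the two run times quoted above on a common scale, the relative slowdown can be worked out with a line of arithmetic. This is only a sketch: the helper name `relative_slowdown` is made up for illustration, and the inputs are the approximate figures from the post, not new measurements.

```python
def relative_slowdown(old_seconds: float, new_seconds: float) -> float:
    """Return the fractional increase in run time going from old to new."""
    if old_seconds <= 0:
        raise ValueError("old_seconds must be positive")
    return (new_seconds - old_seconds) / old_seconds


# Approximate times reported in the post:
# MKL 10.x with ivf2011 versus MKL 11.0 with ivf2013.
old_time = 348.0
new_time = 518.0

slowdown = relative_slowdown(old_time, new_time)
ratio = new_time / old_time
print(f"The newer build took {slowdown:.1%} longer ({ratio:.2f}x the run time)")
```

With these inputs the newer build takes roughly 49% longer, which is large enough that it should be reproducible in a small standalone test case.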

junziyang
Beginner

Besides, my laptop has an i3-M330 CPU with 8Gbit memory.

Both versions were tested on the same laptop and compiled with exactly the same options.

I've also tested the 32-bit version. It is slower than the 64-bit version, and ivf2013 is still much slower than ivf2011.

Gennady_F_Intel
Moderator

junziyang wrote:

I also found that MKL 11.0 with ivf2013 is much slower than MKL 10.x with ivf2011.

I used it to compile Fortran MEX files on Windows 8 with MATLAB 2012b x64, using the following option settings:

How can we check it on our side? Can you give us a C/C++ or F77/F90 example to check?

What is the problem size? Which routines are called? What is the CPU type?

junziyang
Beginner

Sorry. It's a Fortran 90 MEX file in a MATLAB project, so it's impossible to run and test it separately.

The matrix size is about 200x100x1000000. A 200x100 matrix is used in the calculation; the results are stored into sequential pages of a 200x100x1000000 matrix, which is then written to the hard disk.

The SAME program is compiled with the SAME options using different versions of the compiler and tested with the SAME parameters.

Most of the computation time is spent in MATMUL.
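
The array dimensions given above imply a storage footprint that is worth estimating when interpreting the timings. The sketch below assumes 8-byte double precision elements, which is an assumption at this point in the thread; the helper name `array_bytes` is hypothetical.

```python
def array_bytes(shape, bytes_per_element):
    """Return the storage needed for a dense array of the given shape."""
    total = bytes_per_element
    for extent in shape:
        total *= extent
    return total


shape = (200, 100, 1_000_000)   # dimensions quoted in the post
DOUBLE_BYTES = 8                # assumption: 8-byte double precision elements

page_bytes = array_bytes(shape[:2], DOUBLE_BYTES)   # one 200x100 page
full_bytes = array_bytes(shape, DOUBLE_BYTES)       # all 1,000,000 pages

print(f"one 200x100 page: {page_bytes / 1e3:.0f} kB")
print(f"full array:       {full_bytes / 1e9:.0f} GB")
```

Under this assumption a single page is about 160 kB and the full result is about 160 GB, so part of the measured run time may be disk output rather than MATMUL. Timing the multiplication separately from the file writes would help isolate any difference between the compiler or library versions.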

Gennady_F_Intel
Moderator

OK, is that double precision, complex double, or something else?

I see you used an Intel Core i3 processor. Did you link with the threaded or the sequential libraries: mkl_sequential.lib or mkl_intel_thread.lib?

junziyang
Beginner

Yes. I put all the *.lib files on the LINKFLAGS path.

It's double precision.

Sosunova_M_
Beginner

I’ve tested sgemm, dgemm, and cgemm performance on a Core i7 machine with RHEL Server 6.3 64-bit.

Sizes used in calculations: 200x100x1000000

I didn’t see any performance degradation from 10.3 to 11.0. I got the following results:

sgemm:
2 threads, 10.3: 0.371785 sec
2 threads, 11.0: 0.384604 sec
4 threads, 10.3: 0.198118 sec
4 threads, 11.0: 0.205439 sec

dgemm:
2 threads, 10.3: 0.747748 sec
2 threads, 11.0: 0.751148 sec
4 threads, 10.3: 0.395847 sec
4 threads, 11.0: 0.400287 sec

cgemm:
2 threads, 10.3: 1.56011 sec
2 threads, 11.0: 1.55974 sec
4 threads, 10.3: 0.885265 sec
4 threads, 11.0: 0.88249 sec
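
The timings above are easier to compare as percentage changes from 10.3 to 11.0. The following is a small sketch that transcribes the reported values (with a decimal point instead of a decimal comma); the helper name `percent_change` is made up for illustration.

```python
# Timings in seconds transcribed from the results above,
# keyed by (routine, threads) with values (MKL 10.3, MKL 11.0).
timings = {
    ("sgemm", 2): (0.371785, 0.384604),
    ("sgemm", 4): (0.198118, 0.205439),
    ("dgemm", 2): (0.747748, 0.751148),
    ("dgemm", 4): (0.395847, 0.400287),
    ("cgemm", 2): (1.56011, 1.55974),
    ("cgemm", 4): (0.885265, 0.88249),
}


def percent_change(old: float, new: float) -> float:
    """Percentage change in run time from old to new (positive = slower)."""
    return 100.0 * (new - old) / old


changes = {key: percent_change(*pair) for key, pair in timings.items()}

for (routine, threads), change in changes.items():
    print(f"{routine}, {threads} threads: {change:+.2f}%")

largest = max(abs(c) for c in changes.values())
print(f"largest change in either direction: {largest:.2f}%")
```

On these numbers every difference stays below 4% in either direction, and cgemm is marginally faster in 11.0, which is consistent with the statement that no degradation was observed in the GEMM routines themselves.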
