Is it possible to get multithreaded BLAS level 1 functions (such as ddot) with MKL? I tried MKL 10.2 in order to multithread "liblinear", but without success. Do I need the /Qopenmp flag? (I am working on win32 and win64 platforms with a Core 2 Duo or a mobile Core i7.)
If someone has a small example...
Regards,
Sébastien
You don't need the /Qopenmp flag to compile. Instead, link against the threading libraries. Please look here to find the recommended libraries.
--Gennady
In cblas_ddotx.c, the ddot call is made through cblas_ddot, whereas liblinear calls the ddot function directly. I don't think that should make any difference for multithreading, should it?
I link against mkl_core.lib, mkl_intel_c.lib and mkl_intel_thread.lib.
Sébastien
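To make the ddot versus cblas_ddot question concrete, here is a minimal sketch of how the two calling conventions relate. It does not use MKL at all: `ref_ddot` and `ref_cblas_ddot` are hypothetical stand-ins written for illustration, and the sketch assumes positive increments only. The point is simply that the Fortran-style routine takes its scalars by pointer, while the CBLAS-style one takes them by value and can forward to the same computation.

```c
#include <stddef.h>

/* Hypothetical stand-in for the Fortran-style interface:
   scalar arguments are passed by pointer.
   Assumption: *incx and *incy are positive. */
double ref_ddot(const int *n, const double *x, const int *incx,
                const double *y, const int *incy)
{
    const size_t step_x = (size_t)(*incx);
    const size_t step_y = (size_t)(*incy);
    double sum = 0.0;

    for (int i = 0; i < *n; ++i) {
        sum += x[(size_t)i * step_x] * y[(size_t)i * step_y];
    }
    return sum;
}

/* Hypothetical stand-in for the CBLAS-style interface:
   scalars are passed by value and forwarded to the routine above. */
double ref_cblas_ddot(int n, const double *x, int incx,
                      const double *y, int incy)
{
    return ref_ddot(&n, x, &incx, y, &incy);
}
```

With x = {1, 2, 3} and y = {4, 5, 6}, both functions return 32. In this sketch the two entry points share one kernel, so any difference in threading behaviour would have to come from somewhere else, such as the libraries being linked.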
Well... as indicated, I had to add the /Qopenmp flag, and now both cores of my C2D are used. However, the difference between the sequential and multithreaded versions is small.
With multithreading:
mex -D_DENSE_REP -DBLAS -f mexopts_intel10.bat -output train_dense.dll train_dense.c linear_model_matlab.c linear.cpp tron.cpp "C:\Program Files\Intel\Compiler\11.1\065\mkl\ia32\lib\mkl_core.lib" "C:\Program Files\Intel\Compiler\11.1\065\mkl\ia32\lib\mkl_intel_c.lib" "C:\Program Files\Intel\Compiler\11.1\065\mkl\ia32\lib\mkl_intel_thread.lib"
tic, model{t} = train_dense(ytopic' , X , options , 'col');,toc
Elapsed time is 5.696018 seconds.
and without:
mex -D_DENSE_REP -DBLAS -f mexopts_intel10.bat -output train_dense.dll train_dense.c linear_model_matlab.c linear.cpp tron.cpp "C:\Program Files\Intel\Compiler\11.1\065\mkl\ia32\lib\mkl_core.lib" "C:\Program Files\Intel\Compiler\11.1\065\mkl\ia32\lib\mkl_intel_c.lib" "C:\Program Files\Intel\Compiler\11.1\065\mkl\ia32\lib\mkl_sequential.lib"
tic, model{t} = train_dense(ytopic' , X , options , 'col');,toc
Elapsed time is 5.643861 seconds.
where X is a (10240 x 3588) double precision matrix.
Regards,
Sébastien
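To put the two timings above into numbers, here is a small sketch of the arithmetic. The helper names `speedup` and `dense_matrix_bytes` are made up for this illustration, and the code only restates figures already given in the post (the two elapsed times and the 10240 x 3588 size of X).

```c
#include <stddef.h>

/* Speedup of the threaded run relative to the sequential run.
   A value above 1 means the threaded build was faster. */
double speedup(double t_sequential, double t_threaded)
{
    return t_sequential / t_threaded;
}

/* Number of bytes occupied by a dense rows x cols matrix of doubles. */
size_t dense_matrix_bytes(size_t rows, size_t cols)
{
    return rows * cols * sizeof(double);
}
```

For the runs above, `speedup(5.643861, 5.696018)` comes out at about 0.99, so the threaded build was roughly 1% slower rather than faster. With 8-byte doubles, `dense_matrix_bytes(10240, 3588)` gives 293,928,960 bytes, or about 280 MiB for X.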
Yes, I don't get any error during linking without libiomp5md.lib. Is it important?
liblinear mostly uses BLAS 1 functions, with a vector dimension of 10240 in my example.
First test system
C2D T7500, XP SP3 32, Intel compiler 10.1.13
Second system
I7m 720, W7 64, Intel compiler 11.1.065.
On the latter, multithreading is not working, even after setting the MKL_NUM_THREADS and OMP_NUM_THREADS variables.
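One thing that may be worth checking is whether the two variables are actually visible inside the process that runs the code. The sketch below uses only the standard C library, not MKL. The helper names are hypothetical, and the upper limit of 1024 is an arbitrary sanity cap chosen for this example.

```c
#include <stdio.h>
#include <stdlib.h>

/* Parse a thread-count string.
   Returns -1 if the string is NULL, empty, or not a positive integer.
   The cap of 1024 is an arbitrary sanity limit for this sketch. */
int parse_thread_count(const char *value)
{
    char *end = NULL;
    long n;

    if (value == NULL || *value == '\0') {
        return -1;
    }
    n = strtol(value, &end, 10);
    if (*end != '\0' || n <= 0 || n > 1024) {
        return -1;
    }
    return (int)n;
}

/* Read a thread count from the environment of the current process. */
int thread_count_from_env(const char *name)
{
    return parse_thread_count(getenv(name));
}

/* Print what the current process sees for both variables
   (-1 means unset or not a valid positive integer). */
void report_thread_env(void)
{
    printf("MKL_NUM_THREADS = %d\n", thread_count_from_env("MKL_NUM_THREADS"));
    printf("OMP_NUM_THREADS = %d\n", thread_count_from_env("OMP_NUM_THREADS"));
}
```

Calling `report_thread_env()` from the same process as the BLAS calls shows whether the settings reached it. A MEX file runs inside the MATLAB process, so a variable set in a separate console window after MATLAB was started would not show up there.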
Your link to the guide for setting up the link libraries is very useful. However, I am confused by the item "select cluster library". What is the difference between BLACS and ScaLAPACK? To get the best speedup from the BLAS functions on multiple CPUs and multiple cores, which one should I select?
The environment of my program is:
Win7 x64 + VS2008 + IVF 11.1.065 + Intel MPI
Thanks,
Zhanghong Tang
In fact those MEX files are regular shared libraries (.DLL or .SO) with a different name.
Hence, I am not sure that the usual ways of enabling multiple threads will work in a shared library. I suppose it depends on the threads of the main program, or something like that.
Can anyone clarify the situation or give a link to appropriate reading?
Thank you.