sebydocky
Beginner
239 Views

BLAS 1 Multithread

Hello,

Is it possible to have multithreaded BLAS 1 functions (such as ddot) with MKL? I tried MKL 10.2 in order to multithread "liblinear", but without success. Do I need the /Qopenmp flag? (I am working on win32 and win64 platforms with a C2D or an i7 mobile.)

If someone has a small example...

Regards,

Sébastien
11 Replies
Gennady_F_Intel
Moderator

Sébastien,
- Yes, this routine is threaded. Please see the information about this in chapter 6 of the User's Guide.
- Yes, examples of this routine are available. You can find them in \blas\source\ ddotx.f
or

\examples\cblas\source\ cblas_ddotx.c

- You don't need the /Qopenmp flag to compile. Please link the threading libraries. Please look here to find the recommended libraries.

--Gennady

sebydocky
Beginner

Hello,

In cblas_ddotx.c, the ddot call is made through cblas_ddot. liblinear calls the ddot function directly. Should that make any difference for multithreading?

I link against mkl_core.lib, mkl_intel_c.lib, and mkl_intel_thread.lib.

Sébastien
sebydocky
Beginner


Well, as indicated, I had to add the /Qopenmp flag, and now both cores of my C2D are used. However, the difference between the sequential and multithreaded versions is small.

With multithreading:

mex -D_DENSE_REP -DBLAS -f mexopts_intel10.bat -output train_dense.dll train_dense.c linear_model_matlab.c linear.cpp tron.cpp "C:\Program Files\Intel\Compiler\11.1\065\mkl\ia32\lib\mkl_core.lib" "C:\Program Files\Intel\Compiler\11.1\065\mkl\ia32\lib\mkl_intel_c.lib" "C:\Program Files\Intel\Compiler\11.1\065\mkl\ia32\lib\mkl_intel_thread.lib"

tic, model{t} = train_dense(ytopic' , X , options , 'col');,toc

Elapsed time is 5.696018 seconds.


and without it:

mex -D_DENSE_REP -DBLAS -f mexopts_intel10.bat -output train_dense.dll train_dense.c linear_model_matlab.c linear.cpp tron.cpp "C:\Program Files\Intel\Compiler\11.1\065\mkl\ia32\lib\mkl_core.lib" "C:\Program Files\Intel\Compiler\11.1\065\mkl\ia32\lib\mkl_intel_c.lib" "C:\Program Files\Intel\Compiler\11.1\065\mkl\ia32\lib\mkl_sequential.lib"

tic, model{t} = train_dense(ytopic' , X , options , 'col');,toc

Elapsed time is 5.643861 seconds.


where X is a (10240 x 3588) double precision matrix.

Regards,

Sébastien

Gennady_F_Intel
Moderator

quote:"Should that make any difference for multithreading?"
There is no difference between ddot and cblas_ddot from a multithreading point of view.
They are simply different APIs for the same functionality.
--Gennady
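
To illustrate the point above, here is a minimal sketch in plain C of how a Fortran-style entry point and a CBLAS-style entry point can sit on top of one shared kernel. The names `ref_dot_kernel`, `ref_ddot`, and `ref_cblas_ddot` are hypothetical stand-ins, not MKL symbols, and the kernel is a naive loop, not MKL's implementation. Only the argument-passing convention differs between the two wrappers.

```c
/* Hypothetical shared kernel: naive strided dot product.
   Assumes non-negative strides for simplicity. */
static double ref_dot_kernel(int n, const double *x, int incx,
                             const double *y, int incy)
{
    double sum = 0.0;
    for (int i = 0; i < n; ++i)
        sum += x[i * incx] * y[i * incy];
    return sum;
}

/* Fortran-style interface: every argument is passed by reference. */
double ref_ddot(const int *n, const double *x, const int *incx,
                const double *y, const int *incy)
{
    return ref_dot_kernel(*n, x, *incx, y, *incy);
}

/* CBLAS-style interface: scalar arguments are passed by value. */
double ref_cblas_ddot(int n, const double *x, int incx,
                      const double *y, int incy)
{
    return ref_dot_kernel(n, x, incx, y, incy);
}
```

With `int n = 3, inc = 1;`, calling `ref_ddot(&n, x, &inc, y, &inc)` and `ref_cblas_ddot(n, x, 1, y, 1)` on the same two vectors returns the same value, since both reach the same loop.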
Gennady_F_Intel
Moderator

Sébastien,
I don't see the threading library (libiomp5md.lib) in the link line you are using.
I don't understand: you mentioned that X is a (10240 x 3588) double-precision matrix.
What is the actual number of elements in the vectors in your experiments?
Could you provide more details about the CPU you are working with?
--Gennady
sebydocky
Beginner

Gennady,

Yes, I don't get any error during linking without libiomp5md.lib. Is it important?

liblinear mainly uses BLAS 1 functions, with a vector dimension of 10240 in my example.


First test system
C2D T7500, XP SP3 32, Intel compiler 10.1.13

Second system

I7m 720, W7 64, Intel compiler 11.165.

On the latter, multithreading is not working, even after setting the MKL_NUM_THREADS and OMP_NUM_THREADS variables.
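
For the vector length quoted above (10240), a rough count of the work in one ddot call may help put the timings in context. The sketch below assumes one multiply and one add per element and two double-precision loads per element. It is a back-of-the-envelope estimate under those assumptions, not a measurement, and the helper names are hypothetical.

```c
#include <stddef.h>

/* Floating-point operations in one dot product of length n:
   one multiply and one add per element. */
size_t ddot_flops(size_t n)
{
    return 2 * n;
}

/* Bytes read by one dot product of length n:
   two double-precision vectors, sizeof(double) bytes per element each. */
size_t ddot_bytes_read(size_t n)
{
    return 2 * n * sizeof(double);
}
```

For n = 10240 and 8-byte doubles this gives 20480 floating-point operations and 163840 bytes (160 KiB) read per call. That is a small amount of work for a single core, so the cost of waking and synchronizing threads can plausibly be of the same order as the computation itself, which would be consistent with a small difference between the sequential and threaded builds.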
Zhanghong_T_
Novice

Dear Gennady,

Your link to the guide for setting up the link libraries is very useful. However, I am confused about the item "select cluster library". What is the difference between BLACS and ScaLAPACK? To let the BLAS functions have best speed up under multiple CPUs and multiple cores, which functions should I select?

The environment of my program is:
Win7 x64 + VS2008 + IVF 11.1.065 + Intel MPI

Thanks,
Zhanghong Tang
eliosh
Beginner

I also tried to enable multithreading in MATLAB MEX files; however, it does not seem to work.

In fact, those MEX files are regular shared libraries (.DLL or .SO) with a different name.
Hence, I am not sure that the usual ways of enabling multiple threads will work in a shared library. I suppose it is related to the threads of the main program or something like that.

Can anyone clarify the situation or give a link to appropriate reading?


Thank you.
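
One thing that can be checked from inside the shared library is which thread-count settings the host process actually passes on, since a shared library sees the environment of the process that loaded it. The sketch below is a small diagnostic in plain C; `parse_thread_count` and `report_thread_setting` are hypothetical helper names. It only reports what the environment contains. Whether the threading runtime honors those values depends on which libraries are actually linked and loaded.

```c
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>

/* Parse a thread-count setting such as the value of MKL_NUM_THREADS.
   Returns the count, or -1 if the value is missing or not a positive integer. */
int parse_thread_count(const char *value)
{
    if (value == NULL || *value == '\0')
        return -1;

    char *end = NULL;
    long n = strtol(value, &end, 10);
    if (*end != '\0' || n <= 0 || n > INT_MAX)
        return -1;
    return (int)n;
}

/* Print what the current process sees for one environment variable. */
void report_thread_setting(const char *name)
{
    int n = parse_thread_count(getenv(name));
    if (n < 0)
        printf("%s: not set (or not a positive integer)\n", name);
    else
        printf("%s = %d\n", name, n);
}
```

Calling `report_thread_setting("MKL_NUM_THREADS")` and `report_thread_setting("OMP_NUM_THREADS")` at the start of the MEX entry point would show what the library sees at run time. In a MEX file, `mexPrintf` would replace `printf` so that the output appears in the MATLAB command window.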
Gennady_F_Intel
Moderator

Sébastien,
I don't understand which MKL version you are using.
The fact is that we release MKL either standalone or bundled with Intel Fortran Compiler versions.
- If you are using the standalone version, look at the mklsupport.txt file (/doc/mklsupport.txt), where you will see something like: Package ID: w_mkl_p_10.2.5.035.
- With the version bundled with IVF, please see this KB, where you can find which version of MKL is bundled with that version of the compiler.
--Gennady
Gennady_F_Intel
Moderator

Hi Zhanghong Tang,
I think it would be better if you first read the description of the BLACS routines in the MKL Reference Manual. If you have further questions, we will try to help.
>> To let the BLAS functions have best speed up under multiple CPUs and multiple cores, which functions should I select?
Most BLAS level 1, 2, and 3 routines (dense storage scheme) are threaded and show very good scalability.
Which MKL routines are you going to use?
--Gennady
Gennady_F_Intel
Moderator


It may depend on factors such as:
- the MKL version used in your version of MATLAB (ask MathWorks about that),
- the functionality,
- the input size.