Intel® oneAPI Math Kernel Library
Ask questions and share information with other developers who use Intel® Math Kernel Library.

How to compile BLAS for the R statistical project using MKL on Windows?

stochos
Beginner
551 Views
Hi

It would be most appreciated if someone could provide detailed instructions, suitable for a novice, on linking against the MKL to build an optimised version of the BLAS for the open source R statistical project, preferably using Visual Studio or the default gcc toolchain for Windows. Most R users would have no idea where to start with this.

The aim is to replace the default rblas.dll with the optimised one compiled using MKL.

Although this may seem like a trivial question to the MKL researchers, I believe that this would be very useful to the huge and growing community of R users, and hence the request for detailed guidance.

The R project (source code and binary) is downloadable from http://cran.r-project.org/




Thanks

Stochos
4 Replies
mecej4
Honored Contributor III
These three objections have to be resolved before undertaking what you propose:

(i) R is covered by GPL2. MKL is proprietary, and exists only for systems built with x86, x86-64 and ia64 processors.

(ii) The creation of a shared library (DLL) for use with a software system that is used by thousands of users is not a task to be entrusted to novices.

(iii) R is an interpreted language oriented towards data manipulation, statistics and visualization. Therefore, it is doubtful that the non-optimized BLAS causes a noticeable slowdown for the majority of users, or that using MKL in its place would provide a significant improvement in performance.
stochos
Beginner

Thanks for the response.

I have rebuttals to your three objections:

(i) I have not suggested violating any laws. This request is posted for those users who are working in the Windows universe and have access to the MKL.

(ii) Given the above, the compiled DLL was not meant to be distributed on the R website, only to be used privately.

And why should novices not try anyway?

(iii) The performance improvement (in the Windows environment) has been significant, even when using only one core. This has been done by a company called Revolution Analytics, which sells services around R. The comparison below is between the standard distribution of R (available from http://cran.r-project.org/) and their build compiled against the Intel MKL.



Calculation                    Size        Command                  R 2.9.2  Revolution R (1 core)  Revolution R (4 cores)
Matrix Multiply (A'*A)         10000x5000  B <- crossprod(A)        243 sec  22 sec                 5.9 sec
Cholesky Factorization         5000x5000   C <- chol(B)             23 sec   3.8 sec                1.1 sec
Singular Value Decomposition   5000x5000   S <- svd(A, nu=0, nv=0)  62 sec   13 sec                 4.9 sec
Principal Components Analysis  10000x5000  P <- prcomp(A)           237 sec  41 sec                 15.6 sec

Source: http://blog.revolutionanalytics.com/2010/06/performance-benefits-of-multithreaded-r.html
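
To put those timings in terms of speedup, the ratios can be worked out in a few lines of R. This is only arithmetic on the figures in the table above; the data frame and its column names are made up here for illustration and do not come from the Revolution blog post.

```r
# Timings in seconds, copied from the table above.
# The column names are hypothetical labels chosen for this sketch.
timings <- data.frame(
  calculation = c("crossprod(A)", "chol(B)", "svd(A, nu=0, nv=0)", "prcomp(A)"),
  r_2_9_2     = c(243, 23, 62, 237),
  rev_1_core  = c(22, 3.8, 13, 41),
  rev_4_cores = c(5.9, 1.1, 4.9, 15.6)
)

# Speedup = time with standard R / time with the MKL-linked build.
timings$speedup_1_core  <- round(timings$r_2_9_2 / timings$rev_1_core, 1)
timings$speedup_4_cores <- round(timings$r_2_9_2 / timings$rev_4_cores, 1)

print(timings[, c("calculation", "speedup_1_core", "speedup_4_cores")])
```

On these figures the single-core speedup ranges from about 4.8x (SVD) to about 11x (matrix multiply), and the four-core speedup from about 12.7x to about 41x.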

Also, there are optimised builds of rblas.dll that do not use Intel's library, available from http://cran.r-project.org/bin/windows/contrib/ATLAS/

Installing the version of rblas.dll that matches your processor will result in a significant speed improvement. It appears that it would be faster still with Intel MKL.


So if there is anyone, perhaps from the Intel MKL team, who could provide the guidance required, please do!
mecej4
Honored Contributor III
It would be quite useful to have an additional column in your table for the ATLAS-based rblas.dll, in order to address the cost versus benefit of making an MKL-based rblas.dll. Do you have this information?
stochos
Beginner
Unfortunately not. Whilst it may not be directly comparable, Yu Sung-Su reports on his blog the following, using R:

------------------------------------------------------------------------------------------------------------------------

Here is an example to test:

require(Matrix)
set.seed(123)
X <- Matrix(rnorm(1e6), 1000)
print(system.time(for(i in 1:25) X%*%X))
print(system.time(for(i in 1:25) solve(X)))
print(system.time(for(i in 1:10) svd(X)))

Here is a test result on my machine (Intel Pentium M processor, 1.73GHz, with 1 GB RAM).

Default Rblas.dll

> print(system.time(for(i in 1:25) X%*%X))
   user  system elapsed
 114.19    0.38  121.04
> print(system.time(for(i in 1:25) solve(X)))
   user  system elapsed
  87.03    0.28   89.31
> print(system.time(for(i in 1:10) svd(X)))
   user  system elapsed
 232.29    1.44  242.64

New Rblas.dll

> print(system.time(for(i in 1:25) X%*%X))
   user  system elapsed
  37.18    0.36   37.89
> print(system.time(for(i in 1:25) solve(X)))
   user  system elapsed
  30.62    0.56   31.78
> print(system.time(for(i in 1:10) svd(X)))
   user  system elapsed
 102.89    2.17  107.17


Source: http://www.stat.columbia.edu/~cook/movabletype/archives/2008/06/a_trick_to_spee.html
--------------------------------------------------------------------------------------------------------------------

Hope this helps make the case. The question, then, is to what extent Intel MKL is better than the open source alternatives. Also, we are told that the MKL has been optimised for multithreading, so there might be a further speed improvement from that as well.
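
To answer that properly, the same timings would need to be collected once per rblas.dll (default, ATLAS, MKL-based) on the same machine. A minimal sketch of a harness for doing so is below. It is an assumption-laden illustration, not the test from the blog: `blas_benchmark` is a hypothetical helper name, it uses a base R matrix rather than the Matrix package, and the default size and repetition count are arbitrary.

```r
# Hypothetical helper: times matrix multiply, solve and svd on a random
# n x n matrix and returns the elapsed seconds for each operation.
blas_benchmark <- function(n = 1000, reps = 5, seed = 123) {
  set.seed(seed)
  X <- matrix(rnorm(n * n), n, n)

  # Run a zero-argument function `reps` times and return elapsed seconds.
  elapsed <- function(f) {
    unname(system.time(for (i in seq_len(reps)) f())["elapsed"])
  }

  c(
    multiply = elapsed(function() X %*% X),
    solve    = elapsed(function() solve(X)),
    svd      = elapsed(function() svd(X))
  )
}

# Small sizes here so the sketch finishes quickly; increase n and reps
# for a meaningful comparison.
result <- blas_benchmark(n = 200, reps = 2)
print(result)
```

Running this in a fresh R session after each swap of rblas.dll, and recording the three numbers each time, would give the additional ATLAS column asked about earlier in the thread, alongside any MKL-based figures.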