<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic How to compile blas for the R statistical project using MKL on  in Intel® oneAPI Math Kernel Library</title>
    <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/How-to-compile-blas-for-the-R-statistical-project-using-MKL-on/m-p/770430#M568</link>
    <description></description>
    <pubDate>Mon, 02 Aug 2010 09:33:06 GMT</pubDate>
    <dc:creator>stochos</dc:creator>
    <dc:date>2010-08-02T09:33:06Z</dc:date>
    <item>
      <title>How to compile blas for the R statistical project using MKL on Windows?</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/How-to-compile-blas-for-the-R-statistical-project-using-MKL-on/m-p/770428#M566</link>
      <description>Hi&lt;BR /&gt;
&lt;BR /&gt;
It would be most appreciated if someone could provide detailed instructions for
a novice on using (or 'linking') MKL to build an optimised version of the BLAS
for the open source R statistical project, preferably using Visual Studio or
the default gcc (for Windows). Most R users would have no idea where to start with this.&lt;BR /&gt;
&lt;BR /&gt;
The aim is to replace the default rblas.dll with the optimised one compiled
using MKL. &lt;BR /&gt;
&lt;BR /&gt;
Although this may seem like a trivial question to the MKL researchers, I
believe that this would be very useful to the huge and growing community of R
users, and hence the request for detailed guidance. &lt;BR /&gt;
&lt;BR /&gt;
The R project (source code and binary) is downloadable from &lt;A href="http://cran.r-project.org/" target="_blank"&gt;http://cran.r-project.org/&lt;/A&gt;&lt;BR /&gt;
&lt;BR /&gt;
Thanks&lt;BR /&gt;&lt;BR /&gt;Stochos&lt;BR /&gt;</description>
      <pubDate>Fri, 30 Jul 2010 15:18:56 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/How-to-compile-blas-for-the-R-statistical-project-using-MKL-on/m-p/770428#M566</guid>
      <dc:creator>stochos</dc:creator>
      <dc:date>2010-07-30T15:18:56Z</dc:date>
    </item>
    <item>
      <title>How to compile blas for the R statistical project using MKL on</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/How-to-compile-blas-for-the-R-statistical-project-using-MKL-on/m-p/770429#M567</link>
      <description>These three objections have to be resolved before undertaking what you propose:&lt;BR /&gt;&lt;BR /&gt; (i) R is covered by GPL2. MKL is proprietary, and exists only for systems built with x86, x86-64 and ia64 processors.&lt;BR /&gt;&lt;BR /&gt; (ii) The creation of a shared library (DLL) for use with a software system that is used by thousands of users is not a task to be entrusted to novices.&lt;BR /&gt;&lt;BR /&gt; (iii) R is an interpreted language oriented towards data manipulation, statistics and visualization. Therefore, it is doubtful that the use of a non-optimized BLAS causes a noticeable slowdown for the majority of users, or that using MKL in place of the non-optimized BLAS would provide a significant improvement in performance.</description>
      <pubDate>Sun, 01 Aug 2010 22:20:03 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/How-to-compile-blas-for-the-R-statistical-project-using-MKL-on/m-p/770429#M567</guid>
      <dc:creator>mecej4</dc:creator>
      <dc:date>2010-08-01T22:20:03Z</dc:date>
    </item>
    <item>
      <title>How to compile blas for the R statistical project using MKL on</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/How-to-compile-blas-for-the-R-statistical-project-using-MKL-on/m-p/770430#M568</link>
      <description>&lt;P&gt;Thanks for the response.&lt;BR /&gt;
&lt;BR /&gt;
I have a rebuttal to each of your three objections:&lt;BR /&gt;
&lt;BR /&gt;
(i) I have not suggested violating any laws. This request is posted for those
users who are working in the Windows universe and have access to the MKL. &lt;BR /&gt;
&lt;BR /&gt;
(ii) Given the above, the compiled DLL is not meant to be distributed
on the R website; it is only for private use. &lt;/P&gt;

&lt;P&gt;And why should novices not try anyway?&lt;BR /&gt;
&lt;BR /&gt;
(iii) The performance improvement (in the Windows environment) has been
significant, even whilst using one core. This has been done by Revolution Analytics,
a company that sells services around R. The comparison is between the
standard distribution of R (available from &lt;A href="http://cran.r-project.org/" target="_blank"&gt;http://cran.r-project.org/&lt;/A&gt;) and the compiled, Intel MKL-optimised version. &lt;/P&gt;&lt;BR /&gt;&lt;TABLE style="height: 84px;" class="tblGenFixed" id="tblMain_0" border="0" cellpadding="0" cellspacing="0" width="608"&gt;&lt;TBODY&gt;&lt;TR style="padding-left: 40pt;"&gt;&lt;TD class="hd"&gt;&lt;BR /&gt;&lt;/TD&gt;&lt;TD class="s0"&gt;&lt;B&gt;Calculation&lt;/B&gt;&lt;/TD&gt;&lt;TD class="s1"&gt;&lt;B&gt;Size&lt;/B&gt;&lt;/TD&gt;&lt;TD class="s1"&gt;&lt;B&gt;Command&lt;/B&gt;&lt;/TD&gt;&lt;TD class="s1"&gt;&lt;B&gt;R 2.9.2 &lt;BR /&gt;&lt;/B&gt;&lt;/TD&gt;&lt;TD class="s1"&gt;&lt;B&gt;Revolution
 R&lt;BR /&gt;(1 core)&lt;/B&gt;&lt;/TD&gt;&lt;TD class="s1"&gt;&lt;B&gt;Revolution R&lt;BR /&gt;(4 
cores)&lt;/B&gt;&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD class="hd"&gt;&lt;BR /&gt;&lt;/TD&gt;&lt;TD class="s2"&gt;Matrix Multiply&lt;BR /&gt;A'*A&lt;/TD&gt;&lt;TD class="s3"&gt;10000x5000&lt;/TD&gt;&lt;TD class="s3"&gt;B &amp;lt;- crossprod(A)&lt;/TD&gt;&lt;TD class="s3"&gt;243 sec&lt;/TD&gt;&lt;TD class="s3"&gt;22 sec&lt;/TD&gt;&lt;TD class="s3"&gt;5.9 sec&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD class="hd"&gt;&lt;BR /&gt;&lt;/TD&gt;&lt;TD class="s2"&gt;Cholesky Factorization&lt;/TD&gt;&lt;TD class="s3"&gt;5000x5000&lt;/TD&gt;&lt;TD class="s3"&gt;C &amp;lt;- chol(B)&lt;/TD&gt;&lt;TD class="s3"&gt;23 sec&lt;/TD&gt;&lt;TD class="s3"&gt;3.8 sec&lt;/TD&gt;&lt;TD class="s3"&gt;1.1 sec&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD class="hd"&gt;&lt;BR /&gt;&lt;/TD&gt;&lt;TD class="s2"&gt;Singular 
Value Decomposition&lt;/TD&gt;&lt;TD class="s3"&gt;5000x5000&lt;/TD&gt;&lt;TD class="s3"&gt;S 
&amp;lt;- svd (A,nu=0,nv=0)&lt;/TD&gt;&lt;TD class="s3"&gt;62 sec&lt;/TD&gt;&lt;TD class="s3"&gt;13 
sec&lt;/TD&gt;&lt;TD class="s3"&gt;4.9 sec&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD class="hd"&gt;&lt;BR /&gt;&lt;/TD&gt;&lt;TD class="s2"&gt;Principal Components 
Analysis&lt;/TD&gt;&lt;TD class="s3"&gt;10000x5000&lt;/TD&gt;&lt;TD class="s3"&gt;P &amp;lt;- 
prcomp(A)&lt;/TD&gt;&lt;TD class="s3"&gt;237 sec&lt;/TD&gt;&lt;TD class="s3"&gt;41 sec&lt;/TD&gt;&lt;TD class="s3"&gt;15.6 sec&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;BR /&gt;Source: &lt;A href="http://blog.revolutionanalytics.com/2010/06/performance-benefits-of-multithreaded-r.html" target="_blank"&gt;http://blog.revolutionanalytics.com/2010/06/performance-benefits-of-multithreaded-r.html&lt;/A&gt;&lt;BR /&gt;&lt;BR /&gt;Also, there are optimised, non-Intel builds of rblas.dll available from &lt;A href="http://cran.r-project.org/bin/windows/contrib/ATLAS/" target="_blank"&gt;http://cran.r-project.org/bin/windows/contrib/ATLAS/&lt;/A&gt;&lt;BR /&gt;&lt;BR /&gt;Installing a version of rblas.dll built for your processor should result in a significant speed improvement, and it appears that an Intel MKL build would be faster still.&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;So if there is anyone, perhaps from the Intel MKL team, who could provide the guidance required, please do!</description>
      <pubDate>Mon, 02 Aug 2010 09:33:06 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/How-to-compile-blas-for-the-R-statistical-project-using-MKL-on/m-p/770430#M568</guid>
      <dc:creator>stochos</dc:creator>
      <dc:date>2010-08-02T09:33:06Z</dc:date>
    </item>
    <item>
      <title>How to compile blas for the R statistical project using MKL on</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/How-to-compile-blas-for-the-R-statistical-project-using-MKL-on/m-p/770431#M569</link>
      <description>It would be quite useful to have an additional column in your table for the ATLAS-based rblas.dll, in order to address the cost-versus-benefit question of building an MKL-based rblas.dll. Do you have this information?&lt;BR /&gt;</description>
      <pubDate>Mon, 02 Aug 2010 12:53:23 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/How-to-compile-blas-for-the-R-statistical-project-using-MKL-on/m-p/770431#M569</guid>
      <dc:creator>mecej4</dc:creator>
      <dc:date>2010-08-02T12:53:23Z</dc:date>
    </item>
    <item>
      <title>How to compile blas for the R statistical project using MKL on</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/How-to-compile-blas-for-the-R-statistical-project-using-MKL-on/m-p/770432#M570</link>
      <description>Unfortunately not. Whilst it may not be directly comparable, Yu Sung-Su reports on his blog the following, using R:&lt;BR /&gt;&lt;BR /&gt;------------------------------------------------------------------------------------------------------------------------&lt;BR /&gt;&lt;P&gt;Here is an example to test:&lt;/P&gt;

&lt;PRE&gt;require(Matrix)&lt;BR /&gt;set.seed(123)&lt;BR /&gt;X &amp;lt;- Matrix(rnorm(1e6), 1000)&lt;BR /&gt;print(system.time(for(i in 1:25) X%*%X))&lt;BR /&gt;print(system.time(for(i in 1:25) solve(X)))&lt;BR /&gt;print(system.time(for(i in 1:10) svd(X)))&lt;BR /&gt;&lt;/PRE&gt;
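
&lt;P&gt;As a rough sketch that is not part of the quoted blog post, the same three operations can be wrapped in a function that returns the elapsed time of each, so that the figures from two Rblas.dll builds can be compared directly. The function name bench_blas, its arguments, and the reduced default matrix size are illustrative assumptions; it uses base R matrices rather than the Matrix package.&lt;/P&gt;

&lt;PRE&gt;# Hypothetical helper (an assumption, not from the quoted post): times the
# same three operations on a base R matrix and returns elapsed seconds.
bench_blas &amp;lt;- function(n = 200, reps = 3, seed = 123) {
  set.seed(seed)
  X &amp;lt;- matrix(rnorm(n * n), n, n)   # dense n-by-n test matrix
  c(
    matmul = system.time(for (i in seq_len(reps)) X %*% X)[["elapsed"]],
    solve  = system.time(for (i in seq_len(reps)) solve(X))[["elapsed"]],
    svd    = system.time(for (i in seq_len(reps)) svd(X))[["elapsed"]]
  )
}

times &amp;lt;- bench_blas()
print(times)
&lt;/PRE&gt;

&lt;P&gt;Running it once per Rblas.dll and dividing the two result vectors element by element would give a per-operation speed-up ratio. Larger values of n and reps bring it closer to the original test.&lt;/P&gt;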

&lt;P&gt;Here is a test result on my machine (Intel Pentium M processor,
1.73 GHz, with 1 GB RAM).&lt;/P&gt;

&lt;P&gt;&lt;B&gt;Default Rblas.dll&lt;/B&gt;&lt;/P&gt;

&lt;PRE&gt;&amp;gt; print(system.time(for(i in 1:25) X%*%X))&lt;BR /&gt;   user  system elapsed &lt;BR /&gt; 114.19    0.38  121.04 &lt;BR /&gt;&amp;gt; print(system.time(for(i in 1:25) solve(X)))&lt;BR /&gt;   user  system elapsed &lt;BR /&gt;  87.03    0.28   89.31 &lt;BR /&gt;&amp;gt; print(system.time(for(i in 1:10) svd(X)))&lt;BR /&gt;   user  system elapsed &lt;BR /&gt; 232.29    1.44  242.64 &lt;BR /&gt;&lt;/PRE&gt;
&lt;P&gt;&lt;B&gt;New Rblas.dll&lt;/B&gt;&lt;/P&gt;

&lt;PRE&gt;&amp;gt; print(system.time(for(i in 1:25) X%*%X))&lt;BR /&gt;   user  system elapsed &lt;BR /&gt;  37.18    0.36   37.89 &lt;BR /&gt;&amp;gt; print(system.time(for(i in 1:25) solve(X)))&lt;BR /&gt;   user  system elapsed &lt;BR /&gt;  30.62    0.56   31.78 &lt;BR /&gt;&amp;gt; print(system.time(for(i in 1:10) svd(X)))&lt;BR /&gt;   user  system elapsed &lt;BR /&gt; 102.89    2.17  107.17 &lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;Source: &lt;A href="http://www.stat.columbia.edu/~cook/movabletype/archives/2008/06/a_trick_to_spee.html" target="_blank"&gt;http://www.stat.columbia.edu/~cook/movabletype/archives/2008/06/a_trick_to_spee.html&lt;/A&gt;&lt;/PRE&gt;--------------------------------------------------------------------------------------------------------------------&lt;BR /&gt;&lt;BR /&gt;Hope this helps make the case. The question then is: to what extent is Intel MKL better than the open source alternatives? Also, we are told that the MKL has been optimised for multithreading, so there might be a further speed improvement from this as well.</description>
      <pubDate>Mon, 02 Aug 2010 13:14:49 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/How-to-compile-blas-for-the-R-statistical-project-using-MKL-on/m-p/770432#M570</guid>
      <dc:creator>stochos</dc:creator>
      <dc:date>2010-08-02T13:14:49Z</dc:date>
    </item>
  </channel>
</rss>

