<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Could you share the example in Intel® oneAPI Math Kernel Library</title>
    <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/SVD-speed-of-small-matrices-in-MKL-2018-0-124/m-p/1144708#M26608</link>
    <description>&lt;P&gt;Could you share an example of the code you use for this performance comparison?&lt;/P&gt;

&lt;P&gt;&lt;SPAN style="font-size: 1em;"&gt;How do you link, and on which OS?&lt;/SPAN&gt;&lt;/P&gt;

&lt;P&gt;&lt;SPAN style="font-size: 1em;"&gt;In any case, if the same routine from v.2018 runs slower than the one from v.2017, that is a problem.&lt;/SPAN&gt;&lt;/P&gt;</description>
    <pubDate>Fri, 20 Oct 2017 14:58:14 GMT</pubDate>
    <dc:creator>Gennady_F_Intel</dc:creator>
    <dc:date>2017-10-20T14:58:14Z</dc:date>
    <item>
      <title>SVD speed of 'small' matrices in MKL 2018_0_124</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/SVD-speed-of-small-matrices-in-MKL-2018-0-124/m-p/1144706#M26606</link>
      <description>&lt;P&gt;I'm using SVD for least-squares fitting, typically operating on spectral data (1000-2000 data points) and fitting very few parameters (2-5).&lt;/P&gt;

&lt;P&gt;For this, I generally use a direct implementation of the SVD routines from "Numerical Recipes" (NR, single-threaded).&lt;/P&gt;

&lt;P&gt;When I started needing SVDs in other areas (bigger matrices with a less extreme aspect ratio, typically ~ 10000 x 1000), I started using MKL LAPACKE, currently version 2017_4_210, and there the routines greatly outperform the NR routines.&lt;/P&gt;

&lt;P&gt;So I also started using them for the fitting described above. However, when applied to the "extreme" case of only very few parameters (typical matrix size 2048 x 3), the LAPACKE routines fall behind and the NR routines are simply faster.&lt;/P&gt;

&lt;P&gt;Just as a guideline: running the same (iterative) fit on a typical standard data set, my profiler tells me that I spend about 4 sec in the SVD routines with NR and about 7 sec with MKL.&lt;/P&gt;

&lt;P&gt;Now, when MKL 2018 was announced a month ago, I was quite excited to read in the&amp;nbsp;&lt;EM&gt;Release Notes &lt;/EM&gt;(&lt;A href="https://software.intel.com/en-us/articles/intel-math-kernel-library-intel-mkl-2018-release-notes" target="_blank" title="Release Notes"&gt;https://software.intel.com/en-us/articles/intel-math-kernel-library-intel-mkl-2018-release-notes&lt;/A&gt;):&lt;/P&gt;

&lt;P&gt;&lt;STRONG&gt;LAPACK:&lt;/STRONG&gt;&lt;/P&gt;

&lt;UL&gt;
	&lt;LI&gt;&lt;STRONG&gt;Added the following improvements and optimizations for small matrices (N&amp;lt;16):&lt;/STRONG&gt;&lt;/LI&gt;
	&lt;LI&gt;&lt;STRONG&gt;Added ?gesvd, ?geqr/?gemqr, ?gelq/?gemlq&amp;nbsp; optimizations for tall-and-skinny/short-and-wide matrices&lt;/STRONG&gt;&lt;/LI&gt;
&lt;/UL&gt;

&lt;P&gt;So I gave it a try, but was quite disappointed. Not only did the NR routines still outperform the MKL routines, but for reasons not clear to me, performance actually &lt;EM&gt;dropped&lt;/EM&gt; significantly in MKL 2018_0_124 compared to version 2017_4_210.&lt;/P&gt;

&lt;P&gt;The same data for guideline:&lt;BR /&gt;
	- NR routines: 4sec&lt;BR /&gt;
	- MKL 2017: 7sec&lt;BR /&gt;
	- MKL 2018: 14sec&lt;/P&gt;

&lt;P&gt;The only change I made between the two variants was to recompile and link against the newer version and use the corresponding new DLLs.&lt;BR /&gt;
	Did I miss something? Or did I misunderstand the release notes? Does anybody have other comparative data for SVDs on matrices of size (2048 x 3) that would help me figure out whether this is a problem of the library or of how I use it?&lt;/P&gt;

&lt;P&gt;I ran my tests with all 8 logical cores enabled on a 4-core, hyper-threaded i7-4712HQ.&lt;/P&gt;

&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Fri, 20 Oct 2017 07:59:16 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/SVD-speed-of-small-matrices-in-MKL-2018-0-124/m-p/1144706#M26606</guid>
      <dc:creator>MiauCat</dc:creator>
      <dc:date>2017-10-20T07:59:16Z</dc:date>
    </item>
    <item>
      <title>I'm also copying in here a</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/SVD-speed-of-small-matrices-in-MKL-2018-0-124/m-p/1144707#M26607</link>
      <description>&lt;P&gt;Here are a few measured timings for matrices of specific sizes.&lt;/P&gt;

&lt;P&gt;The data in the matrices is uniform-random between 0 and 1000.&lt;/P&gt;

&lt;P&gt;The arrays hold double-precision (8-byte) floating-point data.&lt;/P&gt;

&lt;P&gt;&amp;nbsp;&lt;/P&gt;

&lt;P&gt;&amp;nbsp;SPEED for [ 100 x 5 ]: Averaged over 10000 iterations&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp; &amp;nbsp; NR&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp; :&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 1.470 sec {&amp;nbsp;&amp;nbsp; 0.000147 sec/op }&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp; &amp;nbsp; MKL svd 2017:&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.598 sec {&amp;nbsp;&amp;nbsp; 5.98e-05 sec/op }&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp; &amp;nbsp; MKL svd 2018:&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.637 sec {&amp;nbsp;&amp;nbsp; 6.37e-05 sec/op }&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp; &amp;nbsp; MKL sdd 2017:&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.600 sec {&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 6e-05 sec/op }&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp; &amp;nbsp; MKL sdd 2018:&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.630 sec {&amp;nbsp;&amp;nbsp;&amp;nbsp; 6.3e-05 sec/op }&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp; &amp;nbsp;&lt;BR /&gt;
	&amp;nbsp;SPEED for [ 100 x 5 ]: Averaged over 10000 iterations&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp; &amp;nbsp; NR&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp; :&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 1.470 sec {&amp;nbsp;&amp;nbsp; 0.000147 sec/op }&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp; &amp;nbsp; MKL svd 2017:&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.597 sec {&amp;nbsp;&amp;nbsp; 5.97e-05 sec/op }&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp; &amp;nbsp; MKL svd 2018:&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.604 sec {&amp;nbsp;&amp;nbsp; 6.04e-05 sec/op }&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp; &amp;nbsp; MKL sdd 2017:&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.601 sec {&amp;nbsp;&amp;nbsp; 6.01e-05 sec/op }&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp; &amp;nbsp; MKL sdd 2018:&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.606 sec {&amp;nbsp;&amp;nbsp; 6.06e-05 sec/op }&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp; &amp;nbsp;&lt;BR /&gt;
	************************************************************&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp; &amp;nbsp;&lt;BR /&gt;
	&amp;nbsp;SPEED for [ 5 x 100 ]: Averaged over 10000 iterations&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp; &amp;nbsp; NR&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp; :&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.208 sec {&amp;nbsp;&amp;nbsp; 2.08e-05 sec/op }&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp; &amp;nbsp; MKL svd 2017:&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.516 sec {&amp;nbsp;&amp;nbsp; 5.16e-05 sec/op }&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp; &amp;nbsp; MKL svd 2018:&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.836 sec {&amp;nbsp;&amp;nbsp; 8.36e-05 sec/op }&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp; &amp;nbsp; MKL sdd 2017:&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.601 sec {&amp;nbsp;&amp;nbsp; 6.01e-05 sec/op }&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp; &amp;nbsp; MKL sdd 2018:&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.781 sec {&amp;nbsp;&amp;nbsp; 7.81e-05 sec/op }&lt;/P&gt;

&lt;P&gt;&amp;nbsp;SPEED for [ 5 x 100 ]: Averaged over 10000 iterations&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp; &amp;nbsp; NR&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp; :&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.215 sec {&amp;nbsp;&amp;nbsp; 2.15e-05 sec/op }&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp; &amp;nbsp; MKL svd 2017:&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.510 sec {&amp;nbsp;&amp;nbsp;&amp;nbsp; 5.1e-05 sec/op }&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp; &amp;nbsp; MKL svd 2018:&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.832 sec {&amp;nbsp;&amp;nbsp; 8.32e-05 sec/op }&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp; &amp;nbsp; MKL sdd 2017:&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.600 sec {&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 6e-05 sec/op }&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp; &amp;nbsp; MKL sdd 2018:&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.741 sec {&amp;nbsp;&amp;nbsp; 7.41e-05 sec/op }&amp;nbsp;&amp;nbsp; &amp;nbsp;&lt;BR /&gt;
	&amp;nbsp;&lt;BR /&gt;
	************************************************************&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp; &amp;nbsp;&lt;BR /&gt;
	&amp;nbsp;SPEED for [ 5 x 1000 ]: Averaged over 10000 iterations&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp; &amp;nbsp; NR&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp; :&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 1.860 sec {&amp;nbsp;&amp;nbsp; 0.000186 sec/op }&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp; &amp;nbsp; MKL svd 2017:&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 1.260 sec {&amp;nbsp;&amp;nbsp; 0.000126 sec/op }&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp; &amp;nbsp; MKL svd 2018:&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 2.720 sec {&amp;nbsp;&amp;nbsp; 0.000272 sec/op }&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp; &amp;nbsp; MKL sdd 2017:&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 2.540 sec {&amp;nbsp;&amp;nbsp; 0.000254 sec/op }&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp; &amp;nbsp; MKL sdd 2018:&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 3.650 sec {&amp;nbsp;&amp;nbsp; 0.000365 sec/op }&lt;/P&gt;

&lt;P&gt;&amp;nbsp;SPEED for [ 5 x 1000 ]: Averaged over 10000 iterations&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp; &amp;nbsp; NR&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp; :&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 1.940 sec {&amp;nbsp;&amp;nbsp; 0.000194 sec/op }&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp; &amp;nbsp; MKL svd 2017:&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 1.240 sec {&amp;nbsp;&amp;nbsp; 0.000124 sec/op }&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp; &amp;nbsp; MKL svd 2018:&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 2.690 sec {&amp;nbsp;&amp;nbsp; 0.000269 sec/op }&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp; &amp;nbsp; MKL sdd 2017:&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 2.520 sec {&amp;nbsp;&amp;nbsp; 0.000252 sec/op }&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp; &amp;nbsp; MKL sdd 2018:&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 3.620 sec {&amp;nbsp;&amp;nbsp; 0.000362 sec/op }&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp; &amp;nbsp;&lt;BR /&gt;
	************************************************************&lt;/P&gt;
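
&lt;P&gt;As a rough reading of the timings so far, the slowdown from MKL 2017 to MKL 2018 can be worked out directly from the totals. The snippet below is only a sketch in Python: the `timings` dictionary and the `slowdown` helper are illustrative names, and the values are the gesvd totals from the first run of each size above.&lt;/P&gt;

&lt;PRE&gt;
```python
# Wall-clock totals in seconds for 10000 iterations, copied from the first
# run of each size above (gesvd rows only). The dictionary layout and the
# helper name `slowdown` are illustrative, not part of the benchmark code.
timings = {
    "100 x 5": {"2017": 0.598, "2018": 0.637},
    "5 x 100": {"2017": 0.516, "2018": 0.836},
    "5 x 1000": {"2017": 1.260, "2018": 2.720},
}


def slowdown(row):
    """Return how many times longer MKL 2018 took than MKL 2017."""
    return row["2018"] / row["2017"]


for size, row in timings.items():
    print(f"[ {size} ] gesvd 2018 / 2017 = {slowdown(row):.2f}")
```
&lt;/PRE&gt;

&lt;P&gt;On these numbers the [ 100 x 5 ] case is nearly unchanged (about 7% slower), while [ 5 x 100 ] and [ 5 x 1000 ] take roughly 1.6 and 2.2 times as long with MKL 2018.&lt;/P&gt;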

&lt;P&gt;&amp;nbsp;SPEED for [ 3 x 1000 ]: Averaged over 10000 iterations&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp; &amp;nbsp; NR&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp; :&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.657 sec {&amp;nbsp;&amp;nbsp; 6.57e-05 sec/op }&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp; &amp;nbsp; MKL svd 2017:&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.740 sec {&amp;nbsp;&amp;nbsp;&amp;nbsp; 7.4e-05 sec/op }&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp; &amp;nbsp; MKL svd 2018:&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 2.090 sec {&amp;nbsp;&amp;nbsp; 0.000209 sec/op }&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp; &amp;nbsp; MKL sdd 2017:&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 1.630 sec {&amp;nbsp;&amp;nbsp; 0.000163 sec/op }&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp; &amp;nbsp; MKL sdd 2018:&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 2.910 sec {&amp;nbsp;&amp;nbsp; 0.000291 sec/op }&lt;/P&gt;

&lt;P&gt;&amp;nbsp;SPEED for [ 3 x 1000 ]: Averaged over 10000 iterations&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp; &amp;nbsp; NR&amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp;&amp;nbsp;&amp;nbsp; &amp;nbsp; :&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.669 sec {&amp;nbsp;&amp;nbsp; 6.69e-05 sec/op }&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp; &amp;nbsp; MKL svd 2017:&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.754 sec {&amp;nbsp;&amp;nbsp; 7.54e-05 sec/op }&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp; &amp;nbsp; MKL svd 2018:&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 2.070 sec {&amp;nbsp;&amp;nbsp; 0.000207 sec/op }&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp; &amp;nbsp; MKL sdd 2017:&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 1.690 sec {&amp;nbsp;&amp;nbsp; 0.000169 sec/op }&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp; &amp;nbsp; MKL sdd 2018:&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 2.870 sec {&amp;nbsp;&amp;nbsp; 0.000287 sec/op }&lt;/P&gt;</description>
      <pubDate>Fri, 20 Oct 2017 11:19:33 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/SVD-speed-of-small-matrices-in-MKL-2018-0-124/m-p/1144707#M26607</guid>
      <dc:creator>MiauCat</dc:creator>
      <dc:date>2017-10-20T11:19:33Z</dc:date>
    </item>
    <item>
      <title>Could you share the example</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/SVD-speed-of-small-matrices-in-MKL-2018-0-124/m-p/1144708#M26608</link>
      <description>&lt;P&gt;Could you share an example of the code you use for this performance comparison?&lt;/P&gt;

&lt;P&gt;&lt;SPAN style="font-size: 1em;"&gt;How do you link, and on which OS?&lt;/SPAN&gt;&lt;/P&gt;

&lt;P&gt;&lt;SPAN style="font-size: 1em;"&gt;In any case, if the same routine from v.2018 runs slower than the one from v.2017, that is a problem.&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Fri, 20 Oct 2017 14:58:14 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/SVD-speed-of-small-matrices-in-MKL-2018-0-124/m-p/1144708#M26608</guid>
      <dc:creator>Gennady_F_Intel</dc:creator>
      <dc:date>2017-10-20T14:58:14Z</dc:date>
    </item>
    <item>
      <title>Indeed, we introduced a</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/SVD-speed-of-small-matrices-in-MKL-2018-0-124/m-p/1144709#M26609</link>
      <description>&lt;P&gt;Indeed, we introduced a degradation in MKL 2018. We will try to fix the problem ASAP, and will let you know when the fix is available.&lt;/P&gt;

&lt;P&gt;&lt;SPAN style="font-size: 1em;"&gt;Regards,&lt;/SPAN&gt;&lt;/P&gt;

&lt;P&gt;&lt;SPAN style="font-size: 1em;"&gt;Konstantin&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Fri, 27 Oct 2017 23:15:42 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/SVD-speed-of-small-matrices-in-MKL-2018-0-124/m-p/1144709#M26609</guid>
      <dc:creator>Konstantin_A_Intel</dc:creator>
      <dc:date>2017-10-27T23:15:42Z</dc:date>
    </item>
    <item>
      <title>Hello,</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/SVD-speed-of-small-matrices-in-MKL-2018-0-124/m-p/1144710#M26610</link>
      <description>&lt;P&gt;I would like to know whether this problem has been fixed by now (version 2018 Update 3).&lt;/P&gt;

&lt;P&gt;Thank you.&lt;/P&gt;</description>
      <pubDate>Sun, 05 Aug 2018 19:42:00 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/SVD-speed-of-small-matrices-in-MKL-2018-0-124/m-p/1144710#M26610</guid>
      <dc:creator>jr___shishu</dc:creator>
      <dc:date>2018-08-05T19:42:00Z</dc:date>
    </item>
    <item>
      <title>yes, please try the latest</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/SVD-speed-of-small-matrices-in-MKL-2018-0-124/m-p/1144711#M26611</link>
      <description>&lt;P&gt;Yes, please try the latest update and let us know the result.&lt;/P&gt;</description>
      <pubDate>Tue, 07 Aug 2018 14:55:56 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/SVD-speed-of-small-matrices-in-MKL-2018-0-124/m-p/1144711#M26611</guid>
      <dc:creator>Gennady_F_Intel</dc:creator>
      <dc:date>2018-08-07T14:55:56Z</dc:date>
    </item>
    <item>
      <title>One thing I would suggest is</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/SVD-speed-of-small-matrices-in-MKL-2018-0-124/m-p/1144712#M26612</link>
      <description>&lt;P&gt;One thing I would suggest is that you be careful that the work arrays assigned to MKL are of sufficient size. If they are too small this can have a very significant effect on performance.&lt;/P&gt;</description>
      <pubDate>Fri, 10 Aug 2018 19:09:00 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/SVD-speed-of-small-matrices-in-MKL-2018-0-124/m-p/1144712#M26612</guid>
      <dc:creator>AndrewC</dc:creator>
      <dc:date>2018-08-10T19:09:00Z</dc:date>
    </item>
    <item>
      <title>I have done some comparison</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/SVD-speed-of-small-matrices-in-MKL-2018-0-124/m-p/1144713#M26613</link>
      <description>&lt;P&gt;I have now run some comparisons with version 2018_3_210, and I can confirm that the slowdown of version 2018_0_124 for small matrices has been fixed.&lt;/P&gt;

&lt;P&gt;2018_3_210 is roughly on par with 2017_4_210 for these matrices (5x100, 5x1000, 5x3000).&lt;/P&gt;

&lt;P&gt;However, the single-threaded Numerical Recipes routines still beat MKL by a wide margin at 5x100 and remain ahead at 5x1000 (MKL's svd only pulls ahead at 5x3000), so I'm still using them for some simple fitting.&lt;/P&gt;

&lt;P&gt;&amp;nbsp;&lt;/P&gt;

&lt;P&gt;&amp;nbsp;SPEED for [ 5 x 100 ]: Averaged over 10000 iterations&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp; &amp;nbsp; NR&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; :&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.224 sec {&amp;nbsp;&amp;nbsp; 2.24e-05 sec/op }&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp; &amp;nbsp; MKL svd 2018_3_210:&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 1.64 sec {&amp;nbsp;&amp;nbsp; 0.000164 sec/op }&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp; &amp;nbsp; MKL sdd 2018_3_210:&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 1.54 sec {&amp;nbsp;&amp;nbsp; 0.000154 sec/op }&lt;/P&gt;

&lt;P&gt;&amp;nbsp;SPEED for [ 5 x 1000 ]: Averaged over 10000 iterations&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp; &amp;nbsp; NR&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; :&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 1.89 sec {&amp;nbsp;&amp;nbsp; 0.000189 sec/op }&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp; &amp;nbsp; MKL svd 2018_3_210:&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 2.4 sec {&amp;nbsp;&amp;nbsp;&amp;nbsp; 0.00024 sec/op }&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp; &amp;nbsp; MKL sdd 2018_3_210:&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 3.34 sec {&amp;nbsp;&amp;nbsp; 0.000334 sec/op }&lt;/P&gt;

&lt;P&gt;&amp;nbsp;SPEED for [ 5 x 3000 ]: Averaged over 10000 iterations&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp; &amp;nbsp; NR&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; :&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 5.47 sec {&amp;nbsp;&amp;nbsp; 0.000547 sec/op }&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp; &amp;nbsp; MKL svd 2018_3_210:&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 4.29 sec {&amp;nbsp;&amp;nbsp; 0.000429 sec/op }&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp; &amp;nbsp; MKL sdd 2018_3_210:&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; 7.45 sec {&amp;nbsp;&amp;nbsp; 0.000745 sec/op }&lt;/P&gt;

&lt;P&gt;&amp;nbsp;&lt;/P&gt;

&lt;P&gt;Edit: I should add that the absolute numbers differ from those posted a year ago. I compared 2018_3_210, 2017_4_210 and 2018_0_124 completely anew on my new system. There, 2017_4_210 and 2018_3_210 are similar, whereas 2018_0_124 is still worse than both. The values above are for my new setup.&lt;/P&gt;

&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 06 Sep 2018 14:35:00 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/SVD-speed-of-small-matrices-in-MKL-2018-0-124/m-p/1144713#M26613</guid>
      <dc:creator>MiauCat</dc:creator>
      <dc:date>2018-09-06T14:35:00Z</dc:date>
    </item>
    <item>
      <title>Quote:vasci_ wrote:</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/SVD-speed-of-small-matrices-in-MKL-2018-0-124/m-p/1144714#M26614</link>
      <description>&lt;P&gt;&lt;/P&gt;&lt;BLOCKQUOTE&gt;vasci_ wrote:&lt;BR /&gt;&lt;P&gt;&lt;/P&gt;

&lt;P&gt;One thing I would suggest is that you be careful that the work arrays assigned to MKL are of sufficient size. If they are too small this can have a very significant effect on performance.&lt;/P&gt;

&lt;P&gt;&lt;/P&gt;&lt;/BLOCKQUOTE&gt;&lt;P&gt;&lt;/P&gt;

&lt;P&gt;Aren't the sizes of the work arrays fixed anyway? I'm a bit surprised here. Could you give an example of a good versus a bad call?&lt;/P&gt;

&lt;P&gt;Following &lt;A href="https://software.intel.com/en-us/mkl-developer-reference-c-gesvd"&gt;https://software.intel.com/en-us/mkl-developer-reference-c-gesvd&lt;/A&gt;&lt;/P&gt;

&lt;P&gt;for a 5x1000 matrix, wouldn't I just call&lt;/P&gt;

&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; LAPACKE_dgesvd( matrix_layout, jobu, jobvt, m_, n_, a, lda_, s, u, ldu_, vt, ldvt_, superb );&lt;/P&gt;

&lt;P&gt;with:&lt;/P&gt;

&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; matrix_layout = LAPACK_ROW_MAJOR&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp; jobu = 'O'&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp; jobvt = 'S'&lt;/P&gt;

&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; m_ = 1000&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp; n_ = 5&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp; lda_ = 5&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp; ldu_ = 5&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp; ldvt_ = 5&lt;/P&gt;

&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; a = array[5x1000]&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp; u = array[5x1000]&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp; s = array[5]&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp; vt = array[5x5]&lt;BR /&gt;
	&amp;nbsp;&amp;nbsp;&amp;nbsp; superb&amp;nbsp; == array[4]&lt;/P&gt;
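
&lt;P&gt;For reference, the sizes above can be cross-checked with a small sketch of the minimum array sizes and leading dimensions for a row-major LAPACKE_dgesvd call, as I read them from the developer reference. Note that the call shown above takes no work array argument at all. The helper `gesvd_min_sizes` is a hypothetical name, written in Python purely for the arithmetic, and it assumes jobu = 'S' and jobvt = 'S' by default (with jobu = 'O', u is not referenced).&lt;/P&gt;

&lt;PRE&gt;
```python
def gesvd_min_sizes(m, n, jobu="S", jobvt="S"):
    """Minimum element counts for a row-major LAPACKE_dgesvd call.

    Hypothetical helper: it only restates the size rules from the
    developer reference and does not call MKL.
    """
    k = min(m, n)
    # Column count of U: k for jobu='S', m for jobu='A', unused otherwise.
    u_cols = {"S": k, "A": m}.get(jobu, 0)
    # Row count of VT: k for jobvt='S', n for jobvt='A', unused otherwise.
    vt_rows = {"S": k, "A": n}.get(jobvt, 0)
    return {
        "lda": max(1, n),  # row-major: one row of A holds n values
        "a": m * n,
        "s": k,
        "ldu": max(1, u_cols),
        "u": m * u_cols,
        "ldvt": max(1, n),
        "vt": vt_rows * n,
        "superb": max(1, k - 1),
    }


print(gesvd_min_sizes(1000, 5))
```
&lt;/PRE&gt;

&lt;P&gt;For m_ = 1000 and n_ = 5 this reproduces the values listed above: leading dimensions of 5, 5000 elements each for a and u, 5 for s, 25 for vt and 4 for superb.&lt;/P&gt;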

&lt;P&gt;&amp;nbsp;&lt;/P&gt;

&lt;P&gt;Would making any of the arrays bigger make a difference here?&lt;/P&gt;

&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 06 Sep 2018 14:54:22 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/SVD-speed-of-small-matrices-in-MKL-2018-0-124/m-p/1144714#M26614</guid>
      <dc:creator>MiauCat</dc:creator>
      <dc:date>2018-09-06T14:54:22Z</dc:date>
    </item>
  </channel>
</rss>

