<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Regarding the speed of the program dsptrd Intel MKL in Intel® oneAPI Math Kernel Library</title>
    <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Regarding-the-speed-of-the-program-dsptrd-Intel-MKL/m-p/767407#M343</link>
    <description>&lt;DIV id="tiny_quote"&gt;&lt;DIV style="margin-left: 2px; margin-right: 2px;"&gt;Quoting &lt;A jquery1281677187609="63" rel="/en-us/services/profile/quick_profile.php?is_paid=&amp;amp;user_id=312233" href="https://community.intel.com/en-us/profile/312233/" class="basic"&gt;yuriisig&lt;/A&gt;&lt;/DIV&gt;&lt;DIV style="background-color: #e5e5e5; margin-left: 2px; margin-right: 2px; border: 1px inset; padding: 5px;"&gt;&lt;I&gt;part of the problem is only on the two processor cores&lt;BR /&gt;&lt;BR /&gt;&lt;/I&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;P&gt;If you take a large matrix (e.g., 40000 × 40000), the Task Manager shows that for more than 3/4 of the run time the &lt;STRONG&gt;&lt;EM&gt;dsptrd&lt;/EM&gt;&lt;/STRONG&gt; routine of Intel MKL computes on only one processor core. It turns out that my &lt;STRONG&gt;&lt;EM&gt;dsptrd&lt;/EM&gt;&lt;/STRONG&gt; is 115% faster.&lt;/P&gt;</description>
    <pubDate>Fri, 13 Aug 2010 05:35:40 GMT</pubDate>
    <dc:creator>yuriisig</dc:creator>
    <dc:date>2010-08-13T05:35:40Z</dc:date>
    <item>
      <title>Regarding the speed of the program dsptrd Intel MKL</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Regarding-the-speed-of-the-program-dsptrd-Intel-MKL/m-p/767401#M337</link>
      <description>&lt;P&gt;I guess my dear colleagues can explain such big differences in calculation speed (&amp;gt;96%). Hardware configuration: &lt;STRONG&gt;i7 860&lt;/STRONG&gt; processor (speed: &lt;STRONG&gt;2.80 GHz&lt;/STRONG&gt;), motherboard &lt;A href="http://www.intel.com/cd/products/services/emea/rus/motherboards/desktop/DP55KG/index.htm"&gt;&lt;B&gt;DP55KG&lt;/B&gt;&lt;/A&gt;, &lt;STRONG&gt;DDR3 1333 MHz&lt;/STRONG&gt; (&lt;STRONG&gt;8 GB&lt;/STRONG&gt;), &lt;SPAN style="font-family: serif; color: #800000;"&gt;OS Windows XP Professional x64 Edition SP2&lt;/SPAN&gt;, &lt;STRONG&gt;Intel MKL 10.3 Beta&lt;/STRONG&gt;, &lt;STRONG&gt;EM64T&lt;/STRONG&gt;, &lt;STRONG&gt;HT&lt;/STRONG&gt; off.&lt;BR /&gt;(updated &lt;STRONG&gt;&lt;EM&gt;04/09/2010&lt;/EM&gt;&lt;/STRONG&gt;)&lt;/P&gt;&lt;BLOCKQUOTE&gt;&lt;P&gt;//icl /O2 comparision_dsptrd.c /link mkl_intel_lp64.lib mkl_intel_thread.lib mkl_core.lib libiomp5md.lib sigal.lib&lt;/P&gt;&lt;P&gt;#include &amp;lt;time.h&amp;gt;&lt;BR /&gt;#include &amp;lt;stdio.h&amp;gt;&lt;BR /&gt;#include &amp;lt;stdlib.h&amp;gt;&lt;BR /&gt;#include &amp;lt;malloc.h&amp;gt;&lt;/P&gt;&lt;P&gt;#include &amp;lt;mkl_lapack.h&amp;gt;&lt;BR /&gt;#include &amp;lt;mkl_blas.h&amp;gt;&lt;BR /&gt;#include &amp;lt;mkl_types.h&amp;gt;&lt;/P&gt;&lt;P&gt;int main() { &lt;BR /&gt; int n;&lt;BR /&gt; double *ap;&lt;BR /&gt; double *d;&lt;BR /&gt; double *e;&lt;BR /&gt; double *tau;&lt;BR /&gt; int info;&lt;BR /&gt; &lt;BR /&gt; clock_t t_begin;&lt;BR /&gt; int j, k, i__;&lt;BR /&gt; &lt;BR /&gt; for (n = 43000; n &amp;gt;= 1000; n -= 1000) {&lt;BR /&gt; ap = (double*) malloc(n * (n + 1) / 2 * sizeof(double));&lt;BR /&gt; d = (double*) malloc(n * sizeof(double));&lt;BR /&gt; e = (double*) malloc((n - 1) * sizeof(double));&lt;BR /&gt; tau = (double*) malloc((n - 1) * sizeof(double));&lt;BR /&gt; if (!ap || !d || !e || !tau) {&lt;BR /&gt; printf("Not enough memory to allocate buffer\n");&lt;BR /&gt; exit(1);&lt;BR
/&gt; } &lt;BR /&gt; i__ = 0;&lt;BR /&gt; for (j = 0; j &amp;lt; n; j++) {&lt;BR /&gt; for (k = j; k &amp;lt; n; k++) {&lt;BR /&gt; ap[i__++] = (double)((k + 1) * 100 + (j + 1));&lt;BR /&gt; }&lt;BR /&gt; }&lt;BR /&gt; t_begin = clock();&lt;BR /&gt; dsptrd_("L", &amp;amp;n, ap, d, e, tau, &amp;amp;info);&lt;BR /&gt; printf("n=%5d The time was dsptrd Intel MKL: %8d ms. info=%d\n", n, (int)(clock() - t_begin), info);&lt;BR /&gt; i__ = 0;&lt;BR /&gt; for (j = 0; j &amp;lt; n; j++) {&lt;BR /&gt; for (k = j; k &amp;lt; n; k++) {&lt;BR /&gt; ap[i__++] = (double)((k + 1) * 100 + (j + 1));&lt;BR /&gt; }&lt;BR /&gt; }&lt;BR /&gt; t_begin = clock();&lt;BR /&gt; dsptrd_sig("L", &amp;amp;n, ap, d, e, tau, &amp;amp;info);&lt;BR /&gt; printf("n=%5d The time was my dsptrd: %8d ms. info=%d\n\n", n, (int)(clock() - t_begin), info);&lt;BR /&gt; free(tau);&lt;BR /&gt; free(e);&lt;BR /&gt; free(d);&lt;BR /&gt; free(ap);&lt;BR /&gt; }&lt;BR /&gt; return 0;&lt;BR /&gt; }&lt;/P&gt;&lt;/BLOCKQUOTE&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;n=43000 The time was dsytrd Intel MKL: 14413547 ms. info=0&lt;BR /&gt;n=43000 The time was my dsptrd : 7329828 ms. info=0&lt;BR /&gt;..........................................................&lt;BR /&gt;..........................................................&lt;BR /&gt;n=35000 The time was dsytrd Intel MKL: 7741563 ms. info=0&lt;BR /&gt;n=35000 The time was my dsptrd: 3962328 ms. info=0&lt;BR /&gt; &lt;BR /&gt;n=34000 The time was dsytrd Intel MKL: 7118641 ms. info=0&lt;BR /&gt;n=34000 The time was my dsptrd: 3633078 ms. info=0&lt;BR /&gt; &lt;BR /&gt;n=33000 The time was dsytrd Intel MKL: 6480797 ms. info=0&lt;BR /&gt;n=33000 The time was my dsptrd: 3323547 ms. info=0&lt;BR /&gt; &lt;BR /&gt;n=32000 The time was dsytrd Intel MKL: 5939719 ms. info=0&lt;BR /&gt;n=32000 The time was my dsptrd: 3030782 ms. info=0&lt;BR /&gt; &lt;BR /&gt;n=31000 The time was dsytrd Intel MKL: 5357406 ms. info=0&lt;BR /&gt;n=31000 The time was my dsptrd: 2755828 ms. 
info=0&lt;BR /&gt; &lt;BR /&gt;n=30000 The time was dsytrd Intel MKL: 4877547 ms. info=0&lt;BR /&gt;n=30000 The time was my dsptrd: 2498797 ms. info=0&lt;BR /&gt; &lt;BR /&gt;n=29000 The time was dsytrd Intel MKL: 4373578 ms. info=0&lt;BR /&gt;n=29000 The time was my dsptrd: 2257656 ms. info=0&lt;BR /&gt; &lt;BR /&gt;n=28000 The time was dsytrd Intel MKL: 3954922 ms. info=0&lt;BR /&gt;n=28000 The time was my dsptrd: 2032938 ms. info=0&lt;BR /&gt; &lt;BR /&gt;n=27000 The time was dsytrd Intel MKL: 3531782 ms. info=0&lt;BR /&gt;n=27000 The time was my dsptrd: 1823312 ms. info=0&lt;BR /&gt; &lt;BR /&gt;n=26000 The time was dsytrd Intel MKL: 3171781 ms. info=0&lt;BR /&gt;n=26000 The time was my dsptrd: 1628719 ms. info=0&lt;BR /&gt; &lt;BR /&gt;n=25000 The time was dsytrd Intel MKL: 2799281 ms. info=0&lt;BR /&gt;n=25000 The time was my dsptrd: 1448672 ms. info=0&lt;BR /&gt; &lt;BR /&gt;n=24000 The time was dsytrd Intel MKL: 2482109 ms. info=0&lt;BR /&gt;n=24000 The time was my dsptrd: 1282578 ms. info=0&lt;BR /&gt; &lt;BR /&gt;n=23000 The time was dsytrd Intel MKL: 2160562 ms. info=0&lt;BR /&gt;n=23000 The time was my dsptrd: 1129500 ms. info=0&lt;BR /&gt; &lt;BR /&gt;n=22000 The time was dsytrd Intel MKL: 1899421 ms. info=0&lt;BR /&gt;n=22000 The time was my dsptrd: 989203 ms. info=0&lt;BR /&gt; &lt;BR /&gt;n=21000 The time was dsytrd Intel MKL: 1645437 ms. info=0&lt;BR /&gt;n=21000 The time was my dsptrd: 861172 ms. info=0&lt;BR /&gt; &lt;BR /&gt;n=20000 The time was dsytrd Intel MKL: 1421594 ms. info=0&lt;BR /&gt;n=20000 The time was my dsptrd: 746547 ms. info=0&lt;BR /&gt; &lt;BR /&gt;n=19000 The time was dsytrd Intel MKL: 1209344 ms. info=0&lt;BR /&gt;n=19000 The time was my dsptrd: 638938 ms. info=0&lt;BR /&gt; &lt;BR /&gt;n=18000 The time was dsytrd Intel MKL: 1025391 ms. info=0&lt;BR /&gt;n=18000 The time was my dsptrd: 543937 ms. info=0&lt;BR /&gt; &lt;BR /&gt;n=17000 The time was dsytrd Intel MKL: 855171 ms. 
info=0&lt;BR /&gt;n=17000 The time was my dsptrd: 460234 ms. info=0&lt;BR /&gt; &lt;BR /&gt;n=16000 The time was dsytrd Intel MKL: 714203 ms. info=0&lt;BR /&gt;n=16000 The time was my dsptrd: 383219 ms. info=0&lt;BR /&gt; &lt;BR /&gt;n=15000 The time was dsytrd Intel MKL: 585125 ms. info=0&lt;BR /&gt;n=15000 The time was my dsptrd: 316203 ms. info=0&lt;BR /&gt; &lt;BR /&gt;n=14000 The time was dsytrd Intel MKL: 474891 ms. info=0&lt;BR /&gt;n=14000 The time was my dsptrd: 257609 ms. info=0&lt;BR /&gt; &lt;BR /&gt;n=13000 The time was dsytrd Intel MKL: 377844 ms. info=0&lt;BR /&gt;n=13000 The time was my dsptrd: 206703 ms. info=0&lt;BR /&gt; &lt;BR /&gt;n=12000 The time was dsytrd Intel MKL: 295094 ms. info=0&lt;BR /&gt;n=12000 The time was my dsptrd: 163015 ms. info=0&lt;BR /&gt; &lt;BR /&gt;n=11000 The time was dsytrd Intel MKL: 224157 ms. info=0&lt;BR /&gt;n=11000 The time was my dsptrd: 125969 ms. info=0&lt;BR /&gt; &lt;BR /&gt;n=10000 The time was dsytrd Intel MKL: 168735 ms. info=0&lt;BR /&gt;n=10000 The time was my dsptrd: 94969 ms. info=0&lt;BR /&gt; &lt;BR /&gt;n= 9000 The time was dsytrd Intel MKL: 122218 ms. info=0&lt;BR /&gt;n= 9000 The time was my dsptrd:69562 ms. info=0&lt;BR /&gt; &lt;BR /&gt;n= 8000 The time was dsytrd Intel MKL: 86718 ms. info=0&lt;BR /&gt;n= 8000 The time was my dsptrd:49156 ms. info=0&lt;BR /&gt; &lt;BR /&gt;n= 7000 The time was dsytrd Intel MKL: 58265 ms. info=0&lt;BR /&gt;n= 7000 The time was my dsptrd: 33125 ms. info=0&lt;BR /&gt; &lt;BR /&gt;n= 6000 The time was dsytrd Intel MKL: 36968 ms. info=0&lt;BR /&gt;n= 6000 The time was my dsptrd: 21015 ms. info=0&lt;BR /&gt; &lt;BR /&gt;n= 5000 The time was dsytrd Intel MKL: 22000 ms. info=0&lt;BR /&gt;n= 5000 The time was my dsptrd: 12265 ms. info=0&lt;BR /&gt; &lt;BR /&gt;n= 4000 The time was dsytrd Intel MKL: 11671 ms. info=0&lt;BR /&gt;n= 4000 The time was my dsptrd: 6343 ms. info=0&lt;BR /&gt; &lt;BR /&gt;n= 3000 The time was dsytrd Intel MKL: 5078 ms. 
info=0&lt;BR /&gt;n= 3000 The time was my dsptrd: 2672 ms. info=0&lt;BR /&gt; &lt;BR /&gt;n= 2000 The time was dsytrd Intel MKL: 1453 ms. info=0&lt;BR /&gt;n= 2000 The time was my dsptrd: 719 ms. info=0&lt;BR /&gt;&lt;BR /&gt;My web page (it is not currently available) and publications, which used my diagonalization, can be downloaded here: &lt;A href="http://depositfiles.com/files/fmy2ueaad"&gt;http://depositfiles.com/files/fmy2ueaad&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Sat, 07 Aug 2010 08:56:19 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Regarding-the-speed-of-the-program-dsptrd-Intel-MKL/m-p/767401#M337</guid>
      <dc:creator>yuriisig</dc:creator>
      <dc:date>2010-08-07T08:56:19Z</dc:date>
    </item>
    <item>
      <title>Regarding the speed of the program dsptrd Intel MKL</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Regarding-the-speed-of-the-program-dsptrd-Intel-MKL/m-p/767402#M338</link>
      <description>Unless you have provided the source code or a linkable object code of your version to the Intel developers, before doing which you needed to protect your intellectual property, I don't think that such challenges have any interest.&lt;BR /&gt;&lt;BR /&gt;From the point of view of a user, the issues are:&lt;BR /&gt;&lt;BR /&gt; (i) Does the replacement candidate meet the specifications of the current library routine? In other words, does it do all the tasks that the current routine can do and support all the options presently available?&lt;BR /&gt;&lt;BR /&gt; (ii) How does the candidate stand in regard to stability and accuracy? Is the algorithm, if different, known and has it been peer-reviewed?&lt;BR /&gt;&lt;BR /&gt; (iii) Speed.&lt;BR /&gt;&lt;BR /&gt;Your post addresses only issue (iii). &lt;BR /&gt;&lt;BR /&gt;As to the question about packed versus full storage: full storage is programmer-friendly. The additional programming needed to use packed storage is not justified for one-off calls to routines such as DSPTRD when the time consumed is not considered important.</description>
      <pubDate>Sat, 07 Aug 2010 16:04:50 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Regarding-the-speed-of-the-program-dsptrd-Intel-MKL/m-p/767402#M338</guid>
      <dc:creator>mecej4</dc:creator>
      <dc:date>2010-08-07T16:04:50Z</dc:date>
    </item>
    <item>
      <title>Regarding the speed of the program dsptrd Intel MKL</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Regarding-the-speed-of-the-program-dsptrd-Intel-MKL/m-p/767403#M339</link>
      <description>The guys from Intel are well aware of my designs and my publications on this topic. They believe that they can own up to this task, but they can not do it. That they can be forgiven, because problem is very complicated. In fact, my program is faster, because dsptrd Intel MKL part-time view on two core, and this time the processor speeds up to frequencies of 3.47 GHz.</description>
      <pubDate>Sat, 07 Aug 2010 17:29:33 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Regarding-the-speed-of-the-program-dsptrd-Intel-MKL/m-p/767403#M339</guid>
      <dc:creator>yuriisig</dc:creator>
      <dc:date>2010-08-07T17:29:33Z</dc:date>
    </item>
    <item>
      <title>Regarding the speed of the program dsptrd Intel MKL</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Regarding-the-speed-of-the-program-dsptrd-Intel-MKL/m-p/767404#M340</link>
      <description>&lt;I&gt;The guys from Intel are well aware of my designs and my publications on
this topic&lt;/I&gt;. &lt;BR /&gt;&lt;BR /&gt;I see.&lt;BR /&gt;&lt;BR /&gt;&lt;I&gt;They believe that they can own up to this task, but they
can not do it.&lt;/I&gt; &lt;BR /&gt;&lt;BR /&gt;You probably meant "They believe that they can do the task on their own, but ...". What you wrote means something quite different from this.&lt;BR /&gt;&lt;BR /&gt;&lt;I&gt;That they can be forgiven, because problem is very
complicated. In fact, my program is faster, because dsptrd Intel MKL
part-time view on two core, and this time the processor speeds up to
frequencies of 3.47 GHz.&lt;BR /&gt;&lt;BR /&gt;&lt;/I&gt;I find that last sentence undecipherable, involving as it does a &lt;I&gt;non sequitur&lt;/I&gt;. Nor do I understand the phrase "part-time view on two core". Perhaps an online translation tool would help.</description>
      <pubDate>Sat, 07 Aug 2010 17:46:37 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Regarding-the-speed-of-the-program-dsptrd-Intel-MKL/m-p/767404#M340</guid>
      <dc:creator>mecej4</dc:creator>
      <dc:date>2010-08-07T17:46:37Z</dc:date>
    </item>
    <item>
      <title>Regarding the speed of the program dsptrd Intel MKL</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Regarding-the-speed-of-the-program-dsptrd-Intel-MKL/m-p/767405#M341</link>
      <description>&lt;I&gt;&amp;gt;They believe ...&lt;BR /&gt;&lt;/I&gt;This task is very difficult for them&lt;BR /&gt;&lt;BR /&gt;&amp;gt;part-time view on two core&lt;BR /&gt;Part of the problem is only on the two processor cores&lt;BR /&gt;&lt;BR /&gt;</description>
      <pubDate>Sat, 07 Aug 2010 18:25:46 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Regarding-the-speed-of-the-program-dsptrd-Intel-MKL/m-p/767405#M341</guid>
      <dc:creator>yuriisig</dc:creator>
      <dc:date>2010-08-07T18:25:46Z</dc:date>
    </item>
    <item>
      <title>Comparison of the tridiagonalization functions dsptrd and dsytrd</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Regarding-the-speed-of-the-program-dsptrd-Intel-MKL/m-p/767406#M342</link>
      <description>Using the technology of the future, an algorithm for super-fast matrix tridiagonalization has been developed: &lt;A href="http://software.intel.com/en-us/forums/showthread.php?t=76595"&gt;http://software.intel.com/en-us/forums/showthread.php?t=76595&lt;/A&gt;. This algorithm is considerably faster than &lt;EM&gt;&lt;STRONG&gt;dsytrd&lt;/STRONG&gt;&lt;/EM&gt;, the fastest Intel MKL routine for matrices in full (square) storage (&amp;gt;24%; processor i7 860, XP x64, EM64T, Intel MKL 10.3 Beta, HT off):&lt;DIV id="dict"&gt;&lt;P&gt;n=12000 The time was dsytrd Intel MKL: 203860 ms. info=0&lt;BR /&gt;n=12000 The time was my dsptrd : 164391 ms. info=0&lt;/P&gt;&lt;P&gt;n=13000 The time was dsytrd Intel MKL: 259016 ms. info=0&lt;BR /&gt;n=13000 The time was my dsptrd: 208703 ms. info=0&lt;/P&gt;&lt;P&gt;n=14000 The time was dsytrd Intel MKL: 322344 ms. info=0&lt;BR /&gt;n=14000 The time was my dsptrd: 259828 ms. info=0&lt;/P&gt;&lt;P&gt;n=15000 The time was dsytrd Intel MKL: 396094 ms. info=0&lt;BR /&gt;n=15000 The time was my dsptrd: 318688 ms. info=0&lt;/P&gt;&lt;P&gt;n=16000 The time was dsytrd Intel MKL: 480797 ms. info=0&lt;BR /&gt;n=16000 The time was my dsptrd: 386532 ms. info=0&lt;/P&gt;&lt;P&gt;n=17000 The time was dsytrd Intel MKL: 574360 ms. info=0&lt;BR /&gt;n=17000 The time was my dsptrd: 462375 ms. info=0&lt;/P&gt;&lt;P&gt;n=18000 The time was dsytrd Intel MKL: 681281 ms. info=0&lt;BR /&gt;n=18000 The time was my dsptrd: 548657 ms. info=0&lt;/P&gt;&lt;P&gt;n=19000 The time was dsytrd Intel MKL: 801937 ms. info=0&lt;BR /&gt;n=19000 The time was my dsptrd: 644031 ms. info=0&lt;/P&gt;&lt;P&gt;n=20000 The time was dsytrd Intel MKL: 933172 ms. info=0&lt;BR /&gt;n=20000 The time was my dsptrd: 750235 ms. info=0&lt;/P&gt;&lt;P&gt;n=26000 The time was dsytrd Intel MKL: 2041297 ms. info=0&lt;BR /&gt;n=26000 The time was my dsptrd: 1640625 ms. 
info=0&lt;BR /&gt;&lt;BR /&gt;If these results are combined with &lt;A href="http://software.intel.com/en-us/forums/showthread.php?t=73653&amp;amp;o=d&amp;amp;s=lr"&gt;http://software.intel.com/en-us/forums/showthread.php?t=73653&amp;amp;o=d&amp;amp;s=lr&lt;/A&gt;, the gap in computation speed will be very large.&lt;/P&gt;&lt;/DIV&gt;</description>
      <pubDate>Wed, 11 Aug 2010 13:01:26 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Regarding-the-speed-of-the-program-dsptrd-Intel-MKL/m-p/767406#M342</guid>
      <dc:creator>yuriisig</dc:creator>
      <dc:date>2010-08-11T13:01:26Z</dc:date>
    </item>
    <item>
      <title>Regarding the speed of the program dsptrd Intel MKL</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Regarding-the-speed-of-the-program-dsptrd-Intel-MKL/m-p/767407#M343</link>
      <description>&lt;DIV id="tiny_quote"&gt;&lt;DIV style="margin-left: 2px; margin-right: 2px;"&gt;Quoting &lt;A jquery1281677187609="63" rel="/en-us/services/profile/quick_profile.php?is_paid=&amp;amp;user_id=312233" href="https://community.intel.com/en-us/profile/312233/" class="basic"&gt;yuriisig&lt;/A&gt;&lt;/DIV&gt;&lt;DIV style="background-color: #e5e5e5; margin-left: 2px; margin-right: 2px; border: 1px inset; padding: 5px;"&gt;&lt;I&gt;part of the problem is only on the two processor cores&lt;BR /&gt;&lt;BR /&gt;&lt;/I&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;P&gt;If you take a large matrix (e.g., 40000 × 40000), the Task Manager shows that for more than 3/4 of the run time the &lt;STRONG&gt;&lt;EM&gt;dsptrd&lt;/EM&gt;&lt;/STRONG&gt; routine of Intel MKL computes on only one processor core. It turns out that my &lt;STRONG&gt;&lt;EM&gt;dsptrd&lt;/EM&gt;&lt;/STRONG&gt; is 115% faster.&lt;/P&gt;</description>
      <pubDate>Fri, 13 Aug 2010 05:35:40 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Regarding-the-speed-of-the-program-dsptrd-Intel-MKL/m-p/767407#M343</guid>
      <dc:creator>yuriisig</dc:creator>
      <dc:date>2010-08-13T05:35:40Z</dc:date>
    </item>
    <item>
      <title>Here is an example for the</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Regarding-the-speed-of-the-program-dsptrd-Intel-MKL/m-p/767408#M344</link>
      <description>&lt;P&gt;Here is an example for a newer Intel processor (Intel(R) Core(TM) i7-5820K CPU @ 3.30 GHz) with six cores (&lt;SPAN&gt;parallel_studio_xe_2017_update4&lt;/SPAN&gt;):&lt;/P&gt;

&lt;P&gt;n=10000 The time was the Intel MKL dsptrd: 100.5 s.&lt;BR /&gt;
	n=10000 The time was the Intel MKL dsytrd: 41.5 s.&lt;/P&gt;

&lt;P&gt;But my dsptrd is faster than the Intel MKL dsytrd! That is, the Rectangular Full Packed (RFP) storage scheme proposed by Intel, which lets you work with the matrix using dgemm and dtrmm, is not optimal.&lt;/P&gt;

&lt;P&gt;Other issues also remain unresolved, for example: &lt;A href="https://software.intel.com/en-us/forums/intel-math-kernel-library/topic/290238"&gt;https://software.intel.com/en-us/forums/intel-math-kernel-library/topic/290238&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Sun, 02 Jul 2017 01:53:00 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Regarding-the-speed-of-the-program-dsptrd-Intel-MKL/m-p/767408#M344</guid>
      <dc:creator>yuriisig</dc:creator>
      <dc:date>2017-07-02T01:53:00Z</dc:date>
    </item>
    <item>
      <title>Yuri, accordingly perf</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Regarding-the-speed-of-the-program-dsptrd-Intel-MKL/m-p/767409#M345</link>
      <description>&lt;P&gt;Yuri, according to the performance results you provided, your implementation is significantly faster than MKL for some problem sizes. The problem is: how can we check these results? Could we get an evaluation version of your library?&lt;/P&gt;</description>
      <pubDate>Sun, 02 Jul 2017 02:46:16 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Regarding-the-speed-of-the-program-dsptrd-Intel-MKL/m-p/767409#M345</guid>
      <dc:creator>Gennady_F_Intel</dc:creator>
      <dc:date>2017-07-02T02:46:16Z</dc:date>
    </item>
    <item>
      <title>Genadiy, I already published</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Regarding-the-speed-of-the-program-dsptrd-Intel-MKL/m-p/767410#M346</link>
      <description>&lt;P&gt;&lt;SPAN class="short_text" id="result_box" lang="en"&gt;&lt;SPAN&gt;Genadiy, I already published my algorithms: &lt;/SPAN&gt;&lt;/SPAN&gt;&lt;A href="https://software.intel.com/en-us/forums/intel-math-kernel-library/topic/287728"&gt;https://software.intel.com/en-us/forums/intel-math-kernel-library/topic/287728&lt;/A&gt;&lt;/P&gt;

&lt;P&gt;&lt;SPAN id="result_box" lang="en"&gt;&lt;SPAN&gt;Checking my algorithms is very simple: send me a representative.&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Sun, 02 Jul 2017 03:11:12 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Regarding-the-speed-of-the-program-dsptrd-Intel-MKL/m-p/767410#M346</guid>
      <dc:creator>yuriisig</dc:creator>
      <dc:date>2017-07-02T03:11:12Z</dc:date>
    </item>
    <item>
      <title>Yurii, the two Depositfile</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Regarding-the-speed-of-the-program-dsptrd-Intel-MKL/m-p/767411#M347</link>
      <description>&lt;P&gt;Yurii, the two Depositfile links (in the forum post that you cited in #10) are dead: if either is chosen, after a 30-second wait we see the note:&lt;/P&gt;

&lt;BLOCKQUOTE&gt;
	&lt;P&gt;This file does not exist, the access to the following file is limits or it has been removed due to infringement of copyright.&lt;SPAN style="white-space:pre"&gt; &lt;/SPAN&gt;&lt;/P&gt;
&lt;/BLOCKQUOTE&gt;

&lt;P&gt;&lt;SPAN style="white-space: pre;"&gt;So, where does one go (in year 2017) to learn the main points of your algorithm?&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Sun, 02 Jul 2017 11:57:04 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Regarding-the-speed-of-the-program-dsptrd-Intel-MKL/m-p/767411#M347</guid>
      <dc:creator>mecej4</dc:creator>
      <dc:date>2017-07-02T11:57:04Z</dc:date>
    </item>
    <item>
      <title>According to the lead post in</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Regarding-the-speed-of-the-program-dsptrd-Intel-MKL/m-p/767412#M348</link>
      <description>&lt;P&gt;According to the lead post in this thread, you are judging by the total CPU time of all threads, while MKL should be optimized for minimum elapsed time, possibly varying the number of active cores. You would need to experiment with the number of threads and affinity to reduce total CPU time.&lt;/P&gt;</description>
      <pubDate>Sun, 02 Jul 2017 13:03:27 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Regarding-the-speed-of-the-program-dsptrd-Intel-MKL/m-p/767412#M348</guid>
      <dc:creator>TimP</dc:creator>
      <dc:date>2017-07-02T13:03:27Z</dc:date>
    </item>
    <item>
      <title>Tim,</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Regarding-the-speed-of-the-program-dsptrd-Intel-MKL/m-p/767413#M349</link>
      <description>&lt;P&gt;Tim,&lt;/P&gt;

&lt;P&gt;&lt;SPAN id="result_box" lang="en"&gt;&lt;SPAN&gt;First of all, I'm talking about the difference between my algorithms and Intel MKL algorithms: my algorithms are much faster.&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;

&lt;P&gt;&lt;SPAN lang="en"&gt;&lt;SPAN&gt;--Yurii&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Mon, 03 Jul 2017 17:04:19 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Regarding-the-speed-of-the-program-dsptrd-Intel-MKL/m-p/767413#M349</guid>
      <dc:creator>yuriisig</dc:creator>
      <dc:date>2017-07-03T17:04:19Z</dc:date>
    </item>
    <item>
      <title>mecej4,</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Regarding-the-speed-of-the-program-dsptrd-Intel-MKL/m-p/767414#M350</link>
      <description>&lt;P&gt;mecej4,&lt;/P&gt;

&lt;P&gt;&lt;SPAN id="result_box" lang="en"&gt;&lt;SPAN&gt;I published my algorithm only for the dormtr function: &lt;/SPAN&gt;&lt;/SPAN&gt;&lt;A href="https://software.intel.com/en-us/forums/intel-math-kernel-library/topic/287728"&gt;https://software.intel.com/en-us/forums/intel-math-kernel-library/topic/287728&lt;/A&gt;&lt;/P&gt;

&lt;P&gt;&lt;SPAN lang="en"&gt;&lt;SPAN&gt;--Yurii&lt;/SPAN&gt;&lt;/SPAN&gt;&lt;/P&gt;

</description>
      <pubDate>Mon, 03 Jul 2017 17:27:14 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Regarding-the-speed-of-the-program-dsptrd-Intel-MKL/m-p/767414#M350</guid>
      <dc:creator>yuriisig</dc:creator>
      <dc:date>2017-07-03T17:27:14Z</dc:date>
    </item>
  </channel>
</rss>

