<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Performance Evaluation of a Matrix Multiply: 2048 x 2048 \\ Data type 'float' \\ All matrix elements 1.0f in Intel® oneAPI Math Kernel Library</title>
    <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Performance-Evaluation-of-a-Matrix-Multiply-2048-x-2048-Data/m-p/773822#M851</link>
    <description>&lt;STRONG&gt;&lt;SPAN style="text-decoration: underline;"&gt;NOTE:&lt;/SPAN&gt;&lt;/STRONG&gt; I'm sorry, but I decided to post again because my previous post drifted away from the subject.&lt;BR /&gt;&lt;BR /&gt;&lt;STRONG&gt;Guys, if you have some time and could provide some performance numbers, obtained with any&lt;BR /&gt;version of MKL, I would really appreciate it! If you can't... sorry that my post took a couple of seconds&lt;BR /&gt;of your valuable time.&lt;/STRONG&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;STRONG&gt;&lt;SPAN style="text-decoration: underline;"&gt;THIS IS WHAT I NEED:&lt;/SPAN&gt;&lt;/STRONG&gt;&lt;BR /&gt;&lt;BR /&gt;I wonder if somebody who has &lt;STRONG&gt;MKL&lt;/STRONG&gt; could do a performance evaluation of a matrix multiplication function?&lt;BR /&gt;&lt;BR /&gt;&lt;STRONG&gt;Test-Case:&lt;/STRONG&gt;&lt;BR /&gt;&lt;BR /&gt; - &lt;STRONG&gt;Both matrices 2048 x 2048&lt;/STRONG&gt;&lt;BR /&gt; - &lt;STRONG&gt;Data type 'float'&lt;/STRONG&gt;&lt;BR /&gt; - &lt;STRONG&gt;All elements initialized to 1.0f&lt;/STRONG&gt;&lt;BR /&gt;&lt;BR /&gt;Please report the &lt;SPAN style="text-decoration: underline;"&gt;&lt;STRONG&gt;time ( &lt;/STRONG&gt;in secs&lt;STRONG&gt; ) to calculate&lt;/STRONG&gt;&lt;/SPAN&gt; the product of the two matrices and some details about your CPU,&lt;BR /&gt;frequency, memory in GBs, etc.&lt;BR /&gt;&lt;BR /&gt;I'm &lt;STRONG&gt;not&lt;/STRONG&gt; interested in the result of the multiplication. I'm interested in &lt;STRONG&gt;how long it takes to calculate&lt;/STRONG&gt; it on&lt;BR /&gt;different computers with different CPUs using Intel's &lt;STRONG&gt;MKL&lt;/STRONG&gt;.&lt;BR /&gt;&lt;BR /&gt;Thank you in advance.&lt;BR /&gt;&lt;BR /&gt;Best regards,&lt;BR /&gt;Sergey&lt;BR /&gt;</description>
    <pubDate>Sat, 17 Dec 2011 01:17:11 GMT</pubDate>
    <dc:creator>SergeyKostrov</dc:creator>
    <dc:date>2011-12-17T01:17:11Z</dc:date>
    <item>
      <title>Performance Evaluation of a Matrix Multiply: 2048 x 2048 \\ Data type 'float' \\ All matrix elements 1.0f</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Performance-Evaluation-of-a-Matrix-Multiply-2048-x-2048-Data/m-p/773822#M851</link>
      <description>&lt;STRONG&gt;&lt;SPAN style="text-decoration: underline;"&gt;NOTE:&lt;/SPAN&gt;&lt;/STRONG&gt; I'm sorry, but I decided to post again because my previous post drifted away from the subject.&lt;BR /&gt;&lt;BR /&gt;&lt;STRONG&gt;Guys, if you have some time and could provide some performance numbers, obtained with any&lt;BR /&gt;version of MKL, I would really appreciate it! If you can't... sorry that my post took a couple of seconds&lt;BR /&gt;of your valuable time.&lt;/STRONG&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;STRONG&gt;&lt;SPAN style="text-decoration: underline;"&gt;THIS IS WHAT I NEED:&lt;/SPAN&gt;&lt;/STRONG&gt;&lt;BR /&gt;&lt;BR /&gt;I wonder if somebody who has &lt;STRONG&gt;MKL&lt;/STRONG&gt; could do a performance evaluation of a matrix multiplication function?&lt;BR /&gt;&lt;BR /&gt;&lt;STRONG&gt;Test-Case:&lt;/STRONG&gt;&lt;BR /&gt;&lt;BR /&gt; - &lt;STRONG&gt;Both matrices 2048 x 2048&lt;/STRONG&gt;&lt;BR /&gt; - &lt;STRONG&gt;Data type 'float'&lt;/STRONG&gt;&lt;BR /&gt; - &lt;STRONG&gt;All elements initialized to 1.0f&lt;/STRONG&gt;&lt;BR /&gt;&lt;BR /&gt;Please report the &lt;SPAN style="text-decoration: underline;"&gt;&lt;STRONG&gt;time ( &lt;/STRONG&gt;in secs&lt;STRONG&gt; ) to calculate&lt;/STRONG&gt;&lt;/SPAN&gt; the product of the two matrices and some details about your CPU,&lt;BR /&gt;frequency, memory in GBs, etc.&lt;BR /&gt;&lt;BR /&gt;I'm &lt;STRONG&gt;not&lt;/STRONG&gt; interested in the result of the multiplication. I'm interested in &lt;STRONG&gt;how long it takes to calculate&lt;/STRONG&gt; it on&lt;BR /&gt;different computers with different CPUs using Intel's &lt;STRONG&gt;MKL&lt;/STRONG&gt;.&lt;BR /&gt;&lt;BR /&gt;Thank you in advance.&lt;BR /&gt;&lt;BR /&gt;Best regards,&lt;BR /&gt;Sergey&lt;BR /&gt;</description>
      <pubDate>Sat, 17 Dec 2011 01:17:11 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Performance-Evaluation-of-a-Matrix-Multiply-2048-x-2048-Data/m-p/773822#M851</guid>
      <dc:creator>SergeyKostrov</dc:creator>
      <dc:date>2011-12-17T01:17:11Z</dc:date>
    </item>
    <item>
      <title>Performance Evaluation of a Matrix Multiply: 2048 x 2048 \ Data</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Performance-Evaluation-of-a-Matrix-Multiply-2048-x-2048-Data/m-p/773823#M852</link>
      <description>Sergey,&lt;BR /&gt;&lt;BR /&gt;We post a number of benchmarks on our &lt;A href="http://software.intel.com/en-us/articles/intel-mkl/#details"&gt;website&lt;/A&gt;, but we don't expect that they will ever cover all customer questions. There are simply too many permutations.&lt;BR /&gt;&lt;BR /&gt;Even your question above leads to other questions... What OS? Are the matrices transposed or not? You say both matrices, so is the third matrix in SGEMM, "C", zeroed, with beta equal to 0?&lt;BR /&gt;&lt;BR /&gt;And then, naturally, full documentation and disclaimers are required whenever Intel posts a benchmark number.&lt;BR /&gt;&lt;BR /&gt;So you see, what seems like a simple request can become a slightly bigger one. We do our best here to provide some representative performance numbers that give an indication of the kinds of results you can get with Intel MKL, and for the other cases we provide a &lt;A href="http://software.intel.com/en-us/articles/intel-software-evaluation-center/"&gt;free evaluation&lt;/A&gt; copy of the fully functional version of Intel MKL so that you can try it on the case that is important to you.&lt;BR /&gt;&lt;BR /&gt;Todd&lt;BR /&gt;</description>
      <pubDate>Tue, 20 Dec 2011 17:34:46 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Performance-Evaluation-of-a-Matrix-Multiply-2048-x-2048-Data/m-p/773823#M852</guid>
      <dc:creator>Todd_R_Intel</dc:creator>
      <dc:date>2011-12-20T17:34:46Z</dc:date>
    </item>
    <item>
      <title>Performance Evaluation of a Matrix Multiply: 2048 x 2048 \ Data</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Performance-Evaluation-of-a-Matrix-Multiply-2048-x-2048-Data/m-p/773824#M853</link>
      <description>&lt;STRONG&gt;&amp;gt;&amp;gt;We post a number of benchmarks on our &lt;/STRONG&gt;&lt;A href="http://software.intel.com/en-us/articles/intel-mkl/#details"&gt;&lt;STRONG&gt;website&lt;/STRONG&gt;&lt;/A&gt;&lt;STRONG&gt; but we don't expect...&lt;/STRONG&gt;&lt;BR /&gt;&lt;BR /&gt; These benchmarks are in '&lt;STRONG&gt;Gflops&lt;/STRONG&gt;', not in '&lt;STRONG&gt;Seconds&lt;/STRONG&gt;'.&lt;BR /&gt;&lt;BR /&gt;&lt;STRONG&gt;&amp;gt;&amp;gt;Even your question above, leads to some other question... What OS?&lt;/STRONG&gt;&lt;BR /&gt;&lt;BR /&gt; Any OS. No special requirements; whatever is best for you. A computer with the latest or&lt;BR /&gt; an older ( 1 - 2 years old ) Intel CPU would be OK.&lt;BR /&gt;&lt;BR /&gt;&lt;STRONG&gt;&amp;gt;&amp;gt;Are matrices transposed or not?&lt;/STRONG&gt;&lt;BR /&gt;&lt;BR /&gt; No. All matrix elements are initialized to 1.0. Both matrices are square, 2048 by 2048, which means that&lt;BR /&gt; it doesn't matter whether you transpose a matrix or not. It will be the same.&lt;BR /&gt;&lt;BR /&gt;&lt;STRONG&gt;&amp;gt;&amp;gt;You say both matrices, so is the third matrix in SGEMM, "C" zeroed with beta equal to 0?&lt;/STRONG&gt;&lt;BR /&gt;&lt;BR /&gt; Here is C pseudo-code:&lt;BR /&gt;&lt;BR /&gt; ...&lt;BR /&gt; float fA[2048][2048]; // Matrix A&lt;BR /&gt; float fB[2048][2048]; // Matrix B&lt;BR /&gt; float fC[2048][2048]; // Matrix C&lt;BR /&gt;&lt;BR /&gt; for( int i=0; i&amp;lt;2048; i++ )&lt;BR /&gt; {&lt;BR /&gt; for( int j=0; j&amp;lt;2048; j++ )&lt;BR /&gt; {&lt;BR /&gt; fA[i][j]=1.0f;&lt;BR /&gt; fB[i][j]=1.0f;&lt;BR /&gt; fC[i][j]=0.0f;&lt;BR /&gt; }&lt;BR /&gt; }&lt;BR /&gt;&lt;BR /&gt; t1 = GetTime();&lt;BR /&gt; fC = &amp;lt; &lt;SPAN style="text-decoration: underline;"&gt;&lt;STRONG&gt;MKLMatrixMultiply&lt;/STRONG&gt;&lt;/SPAN&gt; &amp;gt;( fA, fB ); // Any MKL version&lt;BR /&gt; t2 = GetTime();&lt;BR /&gt;&lt;BR /&gt; Delta = t2 - t1; // Time to multiply ( in seconds, for example )&lt;BR /&gt; ...&lt;BR /&gt;&lt;BR /&gt;As you can see, &lt;SPAN style="text-decoration: underline;"&gt;I don't need anything really special&lt;/SPAN&gt;.&lt;BR /&gt;&lt;BR /&gt;Best regards,&lt;BR /&gt;Sergey&lt;BR /&gt;</description>
      <pubDate>Wed, 21 Dec 2011 00:53:14 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Performance-Evaluation-of-a-Matrix-Multiply-2048-x-2048-Data/m-p/773824#M853</guid>
      <dc:creator>SergeyKostrov</dc:creator>
      <dc:date>2011-12-21T00:53:14Z</dc:date>
    </item>
    <item>
      <title>Performance Evaluation of a Matrix Multiply: 2048 x 2048 \ Data</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Performance-Evaluation-of-a-Matrix-Multiply-2048-x-2048-Data/m-p/773825#M854</link>
      <description>&lt;P&gt;Hi Sergey,&lt;BR /&gt;&lt;BR /&gt;We report the performance numbers in flops (&lt;STRONG&gt;flop/sec&lt;/STRONG&gt;), which is the number of floating point operations (&lt;STRONG&gt;flop&lt;/STRONG&gt;) per second (&lt;STRONG&gt;sec&lt;/STRONG&gt;). You can find the time required for a routine if you know &lt;STRONG&gt;flop&lt;/STRONG&gt; and &lt;STRONG&gt;flop/sec&lt;/STRONG&gt;.&lt;BR /&gt;&lt;BR /&gt;For example, the number of floating point operations to compute SGEMM with M=N=K=2048, beta=0.0, alpha=1.0 is given as:&lt;BR /&gt;&lt;BR /&gt;&lt;EM&gt;2*M*N*K&lt;/EM&gt; = &lt;EM&gt;2*2048*2048*2048&lt;/EM&gt; = &lt;EM&gt;17179869184&lt;/EM&gt; flop ~= &lt;EM&gt;17.180&lt;/EM&gt; Giga-Flop (GFlop)&lt;BR /&gt;&lt;BR /&gt;Now, if SGEMM runs at 200 GFlop/sec (or GFlops), then the time for SGEMM will be:&lt;BR /&gt;&lt;BR /&gt;&lt;EM&gt;17.180 / 200&lt;/EM&gt; = &lt;EM&gt;0.0859&lt;/EM&gt; secs&lt;BR /&gt;&lt;BR /&gt;Double-precision GEMM (DGEMM) is shown on the performance charts, and as a rule of thumb, single-precision performance is about twice the double-precision performance. Therefore, you can multiply the DGEMM GFlops by two to get an estimate of the SGEMM GFlops.&lt;/P&gt;&lt;P&gt;Best wishes,&lt;/P&gt;&lt;P&gt;Efe&lt;/P&gt;</description>
      <pubDate>Wed, 21 Dec 2011 21:13:21 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Performance-Evaluation-of-a-Matrix-Multiply-2048-x-2048-Data/m-p/773825#M854</guid>
      <dc:creator>Murat_G_Intel</dc:creator>
      <dc:date>2011-12-21T21:13:21Z</dc:date>
    </item>
    <item>
      <title>Performance Evaluation of a Matrix Multiply: 2048 x 2048 \ Data</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Performance-Evaluation-of-a-Matrix-Multiply-2048-x-2048-Data/m-p/773826#M855</link>
      <description>Hi Efe,&lt;BR /&gt;&lt;BR /&gt;Even if it is some kind of "calculated performance", &lt;SPAN style="text-decoration: underline;"&gt;not&lt;/SPAN&gt; measured, it gives me a better idea about the performance of MKL.&lt;BR /&gt;&lt;BR /&gt;I have a question. What is the number '&lt;STRONG&gt;2&lt;/STRONG&gt;' in:&lt;BR /&gt;&lt;BR /&gt;&lt;EM&gt;&lt;STRONG&gt;&lt;SPAN style="text-decoration: underline;"&gt;2&lt;/SPAN&gt;&lt;/STRONG&gt;*M*N*K&lt;/EM&gt; = &lt;EM&gt;&lt;STRONG&gt;&lt;SPAN style="text-decoration: underline;"&gt;2&lt;/SPAN&gt;&lt;/STRONG&gt;*2048*2048*2048&lt;/EM&gt; = &lt;EM&gt;17179869184&lt;/EM&gt; flop ~= &lt;EM&gt;17.180&lt;/EM&gt; Giga-Flop (GFlop)&lt;BR /&gt;&lt;BR /&gt;Thank you for your time!&lt;BR /&gt;&lt;BR /&gt;Best regards,&lt;BR /&gt;Sergey&lt;BR /&gt;</description>
      <pubDate>Thu, 22 Dec 2011 00:43:21 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Performance-Evaluation-of-a-Matrix-Multiply-2048-x-2048-Data/m-p/773826#M855</guid>
      <dc:creator>SergeyKostrov</dc:creator>
      <dc:date>2011-12-22T00:43:21Z</dc:date>
    </item>
    <item>
      <title>Performance Evaluation of a Matrix Multiply: 2048 x 2048 \ Data</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Performance-Evaluation-of-a-Matrix-Multiply-2048-x-2048-Data/m-p/773827#M856</link>
      <description>&lt;DIV&gt;This is the number of multiplications and additions: each of the M*N*K terms costs one multiply and one add, hence the factor of 2.&lt;/DIV&gt;</description>
      <pubDate>Thu, 22 Dec 2011 05:38:06 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Performance-Evaluation-of-a-Matrix-Multiply-2048-x-2048-Data/m-p/773827#M856</guid>
      <dc:creator>Gennady_F_Intel</dc:creator>
      <dc:date>2011-12-22T05:38:06Z</dc:date>
    </item>
    <item>
      <title>Performance Evaluation of a Matrix Multiply: 2048 x 2048 \ Data</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Performance-Evaluation-of-a-Matrix-Multiply-2048-x-2048-Data/m-p/773828#M857</link>
      <description>&amp;gt;&amp;gt;...Now, if SGEMM runs at &lt;STRONG&gt;&lt;SPAN style="text-decoration: underline;"&gt;200 GFlop/sec&lt;/SPAN&gt;&lt;/STRONG&gt; (or GFlops )&lt;BR /&gt;&lt;BR /&gt; &lt;SPAN style="text-decoration: underline;"&gt;Question 1:&lt;/SPAN&gt;&lt;BR /&gt; Which modern Intel CPUs provide such performance?&lt;BR /&gt;&lt;BR /&gt; &lt;SPAN style="text-decoration: underline;"&gt;Question 2:&lt;/SPAN&gt;&lt;BR /&gt; I would also like to compare performance gains relative to some older Intel CPUs, for example&lt;BR /&gt; &lt;STRONG&gt;Pentium 4&lt;/STRONG&gt; or &lt;STRONG&gt;Atom N270&lt;/STRONG&gt;. So, how fast are they in terms of floating point operations per second?&lt;BR /&gt;&lt;BR /&gt;Best regards,&lt;BR /&gt;Sergey&lt;BR /&gt;</description>
      <pubDate>Thu, 22 Dec 2011 14:23:39 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Performance-Evaluation-of-a-Matrix-Multiply-2048-x-2048-Data/m-p/773828#M857</guid>
      <dc:creator>SergeyKostrov</dc:creator>
      <dc:date>2011-12-22T14:23:39Z</dc:date>
    </item>
    <item>
      <title>Performance Evaluation of a Matrix Multiply: 2048 x 2048 \ Data</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Performance-Evaluation-of-a-Matrix-Multiply-2048-x-2048-Data/m-p/773829#M858</link>
      <description>&lt;DIV id="tiny_quote"&gt;
                &lt;DIV style="margin-left: 2px; margin-right: 2px;"&gt;Quoting &lt;A rel="/en-us/services/profile/quick_profile.php?is_paid=&amp;amp;user_id=353541" class="basic" href="https://community.intel.com/en-us/profile/353541/"&gt;Sergey Kostrov&lt;/A&gt;&lt;/DIV&gt;
                &lt;DIV style="background-color: #e5e5e5; padding: 5px; border: 1px; border-style: inset; margin-left: 2px; margin-right: 2px;"&gt;&lt;I&gt;&amp;gt;&amp;gt;...Now, if SGEMM runs at &lt;B&gt;&lt;SPAN style="text-decoration: underline;"&gt;200 GFlop/sec&lt;/SPAN&gt;&lt;/B&gt; (or GFlops )&lt;BR /&gt;&lt;BR /&gt; &lt;SPAN style="text-decoration: underline;"&gt;Question 1:&lt;/SPAN&gt;&lt;BR /&gt; Which modern Intel CPUs provide such performance?&lt;BR /&gt;&lt;BR /&gt; &lt;SPAN style="text-decoration: underline;"&gt;Question 2:&lt;/SPAN&gt;&lt;BR /&gt; I would also like to compare performance gains relative to some older Intel CPUs, for example&lt;BR /&gt; &lt;B&gt;Pentium 4&lt;/B&gt; or &lt;B&gt;Atom N270&lt;/B&gt;. So, how fast are they in terms of floating point operations per second?&lt;BR /&gt;&lt;BR /&gt;&lt;/I&gt;&lt;/DIV&gt;&lt;/DIV&gt;An AVX CPU, even without FMA, would have a peak rating of 16 single-precision flop per core per clock cycle. So you are talking about, e.g., an 8-core CPU at 2 GHz.&lt;BR /&gt;Most of the recent new entries on the Top500 exceed 200 Gflops DGEMM per node (2 CPUs) and 80% "efficiency" (actual vs. peak rated performance), and that is sustained over more than 10000 cores.&lt;BR /&gt;The question about P4 and Atom has been covered many times over in public internet posts.</description>
      <pubDate>Thu, 22 Dec 2011 20:32:33 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Performance-Evaluation-of-a-Matrix-Multiply-2048-x-2048-Data/m-p/773829#M858</guid>
      <dc:creator>TimP</dc:creator>
      <dc:date>2011-12-22T20:32:33Z</dc:date>
    </item>
    <item>
      <title>Performance Evaluation of a Matrix Multiply: 2048 x 2048 \ Data</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Performance-Evaluation-of-a-Matrix-Multiply-2048-x-2048-Data/m-p/773830#M859</link>
      <description>&amp;gt;&amp;gt;...&lt;EM&gt;&lt;STRONG&gt;2&lt;/STRONG&gt;*M*N*K&lt;/EM&gt; = &lt;EM&gt;&lt;STRONG&gt;2&lt;/STRONG&gt;*2048*2048*2048&lt;/EM&gt;&lt;BR /&gt;&lt;BR /&gt;It looks like the famous &lt;STRONG&gt;T&lt;/STRONG&gt;=&lt;STRONG&gt;O&lt;/STRONG&gt;*(&lt;STRONG&gt;n&lt;/STRONG&gt;^3), where &lt;STRONG&gt;O&lt;/STRONG&gt; equals '&lt;STRONG&gt;2&lt;/STRONG&gt;'.&lt;BR /&gt;&lt;BR /&gt;I'm not convinced that a classic (&lt;SPAN style="text-decoration: underline;"&gt;single-thread&lt;/SPAN&gt;) algorithm for matrix multiplication is at the core of MKL's&lt;BR /&gt;&lt;STRONG&gt;SGEMM&lt;/STRONG&gt; or &lt;STRONG&gt;DGEMM&lt;/STRONG&gt; functions. I think Strassen or Strassen-Winograd algorithms would have to be used to boost the&lt;BR /&gt;speed of calculations.&lt;BR /&gt;</description>
      <pubDate>Fri, 23 Dec 2011 01:11:10 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Performance-Evaluation-of-a-Matrix-Multiply-2048-x-2048-Data/m-p/773830#M859</guid>
      <dc:creator>SergeyKostrov</dc:creator>
      <dc:date>2011-12-23T01:11:10Z</dc:date>
    </item>
    <item>
      <title>Performance Evaluation of a Matrix Multiply: 2048 x 2048 \ Data</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Performance-Evaluation-of-a-Matrix-Multiply-2048-x-2048-Data/m-p/773831#M860</link>
      <description>Merry Christmas and a Happy New Year!&lt;BR /&gt;&lt;BR /&gt;Thanks to everybody who responded to my posts.&lt;BR /&gt;&lt;BR /&gt;Best regards,&lt;BR /&gt;Sergey&lt;BR /&gt;</description>
      <pubDate>Sat, 24 Dec 2011 16:56:40 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Performance-Evaluation-of-a-Matrix-Multiply-2048-x-2048-Data/m-p/773831#M860</guid>
      <dc:creator>SergeyKostrov</dc:creator>
      <dc:date>2011-12-24T16:56:40Z</dc:date>
    </item>
  </channel>
</rss>

