<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Sparse blas Matrix-vector vs simple implementation in Intel® oneAPI Math Kernel Library</title>
    <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Sparse-blas-Matrix-vector-vs-simple-implementation/m-p/911927#M12206</link>
    <description>Hello,&lt;BR /&gt;I'm testing the performance of MKL 9.1 on an AMD Athlon 2200 processor.&lt;BR /&gt;I'm comparing sparse matrix-vector multiplication.&lt;BR /&gt;When I compare&lt;BR /&gt;&lt;BR /&gt;mkl_dcsrgemv&lt;BR /&gt;&lt;BR /&gt;against&lt;BR /&gt;my simple function shown below&lt;BR /&gt;&lt;BR /&gt;...&lt;BR /&gt; for (int i = 0; i &amp;lt; n; i++)&lt;BR /&gt; {&lt;BR /&gt; x[i] = 0;&lt;BR /&gt;&lt;BR /&gt; for (int j = ia[i]; j &amp;lt; ia[i+1]; j++)&lt;BR /&gt; {&lt;BR /&gt; int col = ja[j-1] - 1;&lt;BR /&gt; x[i] += v[col] * a[j-1];&lt;BR /&gt; }&lt;BR /&gt; }&lt;BR /&gt;...&lt;BR /&gt;&lt;BR /&gt;I'm not getting any significant speedup over my own code when it is compiled with gcc's optimization flag (-O2). I'm seeing only a ~2-3% speedup.&lt;BR /&gt;&lt;BR /&gt;Is this normal behavior, or should I expect a much larger speedup from the sparse BLAS routines?&lt;BR /&gt;&lt;BR /&gt;Thanks</description>
    <pubDate>Mon, 27 Aug 2007 19:10:10 GMT</pubDate>
    <dc:creator>tae</dc:creator>
    <dc:date>2007-08-27T19:10:10Z</dc:date>
    <item>
      <title>Sparse blas Matrix-vector vs simple implementation</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Sparse-blas-Matrix-vector-vs-simple-implementation/m-p/911927#M12206</link>
      <description>Hello,&lt;BR /&gt;I'm testing the performance of MKL 9.1 on an AMD Athlon 2200 processor.&lt;BR /&gt;I'm comparing sparse matrix-vector multiplication.&lt;BR /&gt;When I compare&lt;BR /&gt;&lt;BR /&gt;mkl_dcsrgemv&lt;BR /&gt;&lt;BR /&gt;against&lt;BR /&gt;my simple function shown below&lt;BR /&gt;&lt;BR /&gt;...&lt;BR /&gt; for (int i = 0; i &amp;lt; n; i++)&lt;BR /&gt; {&lt;BR /&gt; x[i] = 0;&lt;BR /&gt;&lt;BR /&gt; for (int j = ia[i]; j &amp;lt; ia[i+1]; j++)&lt;BR /&gt; {&lt;BR /&gt; int col = ja[j-1] - 1;&lt;BR /&gt; x[i] += v[col] * a[j-1];&lt;BR /&gt; }&lt;BR /&gt; }&lt;BR /&gt;...&lt;BR /&gt;&lt;BR /&gt;I'm not getting any significant speedup over my own code when it is compiled with gcc's optimization flag (-O2). I'm seeing only a ~2-3% speedup.&lt;BR /&gt;&lt;BR /&gt;Is this normal behavior, or should I expect a much larger speedup from the sparse BLAS routines?&lt;BR /&gt;&lt;BR /&gt;Thanks</description>
      <pubDate>Mon, 27 Aug 2007 19:10:10 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Sparse-blas-Matrix-vector-vs-simple-implementation/m-p/911927#M12206</guid>
      <dc:creator>tae</dc:creator>
      <dc:date>2007-08-27T19:10:10Z</dc:date>
    </item>
    <item>
      <title>Re: Sparse blas Matrix-vector vs simple implementation</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Sparse-blas-Matrix-vector-vs-simple-implementation/m-p/911928#M12207</link>
      <description>&lt;P&gt;Hello&lt;/P&gt;
&lt;P&gt;The performance of the routine you mention depends on the structure of the input sparse matrix, because the distribution of the nonzero elements determines the memory access pattern. So the performance depends greatly on the input matrix as well as on its dimension.&lt;/P&gt;
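&lt;P&gt;To make the point about nonzero distribution concrete, here is a minimal sketch in C that summarizes the structure of a CSR matrix. It assumes the one-based ia/ja arrays used in the original post; the csr_stats type and the csr_structure_stats function are hypothetical helpers, not part of MKL.&lt;/P&gt;
&lt;PRE&gt;
```c
/* Hypothetical helper (not an MKL routine): summarizes how the nonzeros
   of a CSR matrix with one-based ia/ja arrays are distributed. */
typedef struct {
    int max_row_nnz;    /* largest number of nonzeros in any row */
    int max_col_spread; /* largest (max column - min column) within a row */
} csr_stats;

csr_stats csr_structure_stats(int n, const int *ia, const int *ja)
{
    csr_stats s = {0, 0};
    for (int i = 0; i != n; ++i) {
        int first = ia[i] - 1;     /* zero-based start of row i */
        int last  = ia[i + 1] - 1; /* zero-based one-past-end of row i */
        int nnz   = last - first;
        if (nnz > s.max_row_nnz) s.max_row_nnz = nnz;
        if (nnz > 0) {
            int lo = ja[first], hi = ja[first];
            for (int k = first; k != last; ++k) {
                if (ja[k] > hi) hi = ja[k];
                if (lo > ja[k]) lo = ja[k];
            }
            if (hi - lo > s.max_col_spread) s.max_col_spread = hi - lo;
        }
    }
    return s;
}
```
&lt;/PRE&gt;
&lt;P&gt;A large column spread within a row suggests scattered reads from the input vector, which tends to reduce cache reuse. These two numbers are only a rough indicator and do not predict the timing of any particular routine.&lt;/P&gt;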
&lt;P&gt;The numbers you report are probably normal, but I would need to look at the input data to be sure.&lt;/P&gt;
&lt;P&gt;By the way, the routine is OpenMP-parallelized. Have you tested it in parallel mode by setting the OMP_NUM_THREADS environment variable?&lt;/P&gt;
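&lt;P&gt;For a like-for-like comparison in parallel mode, the hand-written loop can be threaded in the same spirit. The following is a minimal sketch, assuming the one-based ia/ja arrays and the x (output) and v (input) names from the original post; csr_gemv_omp is a hypothetical name, and this is not how MKL implements its routine.&lt;/P&gt;
&lt;PRE&gt;
```c
/* Sketch: x = A*v for a CSR matrix with one-based ia/ja arrays.
   The function name is hypothetical, not an MKL routine. */
void csr_gemv_omp(int n, const double *a, const int *ia, const int *ja,
                  const double *v, double *x)
{
    /* Each row writes only x[i], so rows can be split across threads.
       If OpenMP is not enabled, the pragma is ignored and the loop
       runs serially. */
    #pragma omp parallel for schedule(static)
    for (int i = 0; n > i; ++i) {
        double sum = 0.0;
        /* ia[i] .. ia[i+1]-1 are the one-based positions of row i */
        for (int j = ia[i]; ia[i + 1] > j; ++j) {
            int col = ja[j - 1] - 1; /* convert column index to zero-based */
            sum += v[col] * a[j - 1];
        }
        x[i] = sum;
    }
}
```
&lt;/PRE&gt;
&lt;P&gt;With gcc this would be built with -fopenmp, and the thread count chosen through OMP_NUM_THREADS. On a single-core processor, extra threads are unlikely to help either version.&lt;/P&gt;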
&lt;P&gt;All the best&lt;/P&gt;
&lt;P&gt;Sergey&lt;/P&gt;</description>
      <pubDate>Tue, 18 Sep 2007 09:33:04 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Sparse-blas-Matrix-vector-vs-simple-implementation/m-p/911928#M12207</guid>
      <dc:creator>Sergey_K_Intel1</dc:creator>
      <dc:date>2007-09-18T09:33:04Z</dc:date>
    </item>
  </channel>
</rss>

