<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Hello, in Intel® oneAPI Math Kernel Library</title>
    <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Sparse-Dense-Matrix-Multiplication/m-p/1173955#M28751</link>
    <description>&lt;P&gt;Hello,&lt;/P&gt;&lt;P&gt;As Gennady said, this is proprietary information. The general answer is yes: we apply various optimizations (including hardware-aware ones) at different levels (multi-threading, vectorization, and assembly) to achieve better performance, as we do throughout MKL.&lt;/P&gt;&lt;P&gt;You can find articles about different algorithms that can be used for sparse-dense matrix multiplication.&lt;/P&gt;&lt;P&gt;The sparsity of one of the inputs means that some optimization techniques used for dense gemm make much less sense, or none at all, while some additional ideas come into play. It also helps to consider whether the functionality is compute bound or memory bound.&lt;/P&gt;&lt;P&gt;Best,&lt;BR /&gt;Kirill&lt;/P&gt;</description>
    <pubDate>Sat, 02 May 2020 23:17:11 GMT</pubDate>
    <dc:creator>Kirill_V_Intel</dc:creator>
    <dc:date>2020-05-02T23:17:11Z</dc:date>
    <item>
      <title>Sparse-Dense Matrix Multiplication</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Sparse-Dense-Matrix-Multiplication/m-p/1173953#M28749</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;I am working with the &lt;EM&gt;mkl_sparse_s_mm&lt;/EM&gt; routine to perform:&lt;/P&gt;&lt;P&gt;C = A * B&lt;/P&gt;&lt;P&gt;where A is a sparse matrix and B and C are dense matrices. I would like some details about the algorithm for sparse-dense matrix multiplication implemented by &lt;EM&gt;mkl_sparse_s_mm&lt;/EM&gt;, &lt;EM&gt;i.e.&lt;/EM&gt;, whether it uses cache-aware strategies, specific micro-kernel implementations to fully leverage CPU registers, etc.&lt;/P&gt;&lt;P&gt;To give an example, high-performance dense matrix multiplication (&lt;EM&gt;cblas_gemm&lt;/EM&gt;) is usually implemented following the &lt;A href="https://www.google.com/url?sa=t&amp;amp;rct=j&amp;amp;q=&amp;amp;esrc=s&amp;amp;source=web&amp;amp;cd=1&amp;amp;cad=rja&amp;amp;uact=8&amp;amp;ved=2ahUKEwiR5sy_uovpAhW7RhUIHbOiAToQFjAAegQIBRAB&amp;amp;url=https%3A%2F%2Fwww.cs.utexas.edu%2Fusers%2Fpingali%2FCS378%2F2008sp%2Fpapers%2FgotoPaper.pdf&amp;amp;usg=AOvVaw3LlHP6NFon0yb9HTUyh99A"&gt;block pack algorithm&lt;/A&gt;.&lt;/P&gt;&lt;P&gt;Thank you&lt;/P&gt;&lt;P&gt;Cosimo Rulli&lt;/P&gt;</description>
      <pubDate>Tue, 28 Apr 2020 15:44:29 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Sparse-Dense-Matrix-Multiplication/m-p/1173953#M28749</guid>
      <dc:creator>Rulli__Cosimo</dc:creator>
      <dc:date>2020-04-28T15:44:29Z</dc:date>
    </item>
    <item>
      <title>it the propriety info, but we</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Sparse-Dense-Matrix-Multiplication/m-p/1173954#M28750</link>
      <description>&lt;P&gt;This is proprietary information, but we will ask the Sparse BLAS developers to look at this thread. They may share something, but I am not sure.&lt;/P&gt;</description>
      <pubDate>Wed, 29 Apr 2020 04:51:00 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Sparse-Dense-Matrix-Multiplication/m-p/1173954#M28750</guid>
      <dc:creator>Gennady_F_Intel</dc:creator>
      <dc:date>2020-04-29T04:51:00Z</dc:date>
    </item>
    <item>
      <title>Hello,</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Sparse-Dense-Matrix-Multiplication/m-p/1173955#M28751</link>
      <description>&lt;P&gt;Hello,&lt;/P&gt;&lt;P&gt;As Gennady said, this is proprietary information. The general answer is yes: we apply various optimizations (including hardware-aware ones) at different levels (multi-threading, vectorization, and assembly) to achieve better performance, as we do throughout MKL.&lt;/P&gt;&lt;P&gt;You can find articles about different algorithms that can be used for sparse-dense matrix multiplication.&lt;/P&gt;&lt;P&gt;The sparsity of one of the inputs means that some optimization techniques used for dense gemm make much less sense, or none at all, while some additional ideas come into play. It also helps to consider whether the functionality is compute bound or memory bound.&lt;/P&gt;&lt;P&gt;Best,&lt;BR /&gt;Kirill&lt;/P&gt;</description>
      <pubDate>Sat, 02 May 2020 23:17:11 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Sparse-Dense-Matrix-Multiplication/m-p/1173955#M28751</guid>
      <dc:creator>Kirill_V_Intel</dc:creator>
      <dc:date>2020-05-02T23:17:11Z</dc:date>
    </item>
  </channel>
</rss>

