<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Hi in Intel® oneAPI Math Kernel Library</title>
    <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Sparse-Sparse-Matrix-Multiplication/m-p/1017822#M19595</link>
    <description>&lt;P&gt;Hi&lt;/P&gt;

&lt;P&gt;Gennady&lt;/P&gt;

&lt;P&gt;I tested the code by multiplying a 1070 × 1000000 matrix (nzmax = 85534075) by a 1000000 × 1070 matrix (nzmax = 85534075) on the following machines (I do not know the CPU generation of any of them):&lt;/P&gt;

&lt;OL&gt;
	&lt;LI&gt;2.8 GHz Intel Core i7, 16GB RAM&lt;/LI&gt;
	&lt;LI&gt;3.4 GHz Intel Core i7, 16GB RAM&lt;/LI&gt;
	&lt;LI&gt;2.80 GHz Intel(R) Xeon, 96GB RAM&lt;/LI&gt;
	&lt;LI&gt;2.4 GHz Intel Core i7, 8GB RAM&lt;/LI&gt;
&lt;/OL&gt;

&lt;P&gt;You can test the code on your own matrices. I still have to refactor it.&lt;/P&gt;

&lt;P&gt;Vineet&lt;/P&gt;</description>
    <pubDate>Thu, 18 Dec 2014 17:55:09 GMT</pubDate>
    <dc:creator>Vineet_Y_</dc:creator>
    <dc:date>2014-12-18T17:55:09Z</dc:date>
    <item>
      <title>Sparse-Sparse Matrix Multiplication</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Sparse-Sparse-Matrix-Multiplication/m-p/1017818#M19591</link>
      <description>&lt;P&gt;Hi&lt;/P&gt;

&lt;P&gt;&lt;SPAN style="font-size: 1em; line-height: 1.5;"&gt;I use mkl_dcsrmultcsr in my research. However, it makes two passes to compute a sparse*sparse matrix product. For small problems this does not matter, but for large ones (e.g. matrices of size ½ billion by ½ billion) it is time-consuming, and it would be better if MKL could do this multiplication in a single pass in a parallel setup.&lt;/SPAN&gt;&lt;/P&gt;

&lt;P&gt;&lt;SPAN style="font-size: 1em; line-height: 1.5;"&gt;Most (if not all) sparse*sparse CSR matrix multiplication algorithms use Gustavson's algorithm (ACM, 1978), and there is no reason why it cannot be parallelized and run in a single pass. I understand that the performance of a single-pass parallel version depends on pre-allocating space for the non-zero values. I think a reasonable estimate can be given in most situations, and if the estimate turns out to be too small, the algorithm should be able to grow the buffer as required.&lt;/SPAN&gt;&lt;/P&gt;
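
&lt;P&gt;To make the single-pass idea concrete, here is a minimal serial Python sketch of a Gustavson-style row-by-row product with an output buffer that grows when the initial estimate is too small. It is only an illustration under assumptions: the function name, the 0-based CSR array names and the nnz_guess parameter are hypothetical, it is not how mkl_dcsrmultcsr is implemented, and it leaves out the parallel part.&lt;/P&gt;

```python
def csr_multiply_single_pass(n_rows, n_cols_b, a_ptr, a_idx, a_val,
                             b_ptr, b_idx, b_val, nnz_guess=4):
    """Compute C = A * B for 0-based CSR inputs in one pass over the rows of A."""
    # Output buffers start at the caller's estimate (hypothetical parameter).
    capacity = max(1, nnz_guess)
    c_idx = [0] * capacity
    c_val = [0.0] * capacity
    c_ptr = [0] * (n_rows + 1)

    acc = [0.0] * n_cols_b    # dense accumulator for the current output row
    marker = [-1] * n_cols_b  # marker[j] == i means column j was already hit in row i
    nnz = 0

    for i in range(n_rows):
        cols = []
        for p in range(a_ptr[i], a_ptr[i + 1]):
            k, a_ik = a_idx[p], a_val[p]
            # Scale row k of B by A[i, k] and add it into the accumulator.
            for q in range(b_ptr[k], b_ptr[k + 1]):
                j = b_idx[q]
                if marker[j] != i:
                    marker[j] = i
                    acc[j] = 0.0
                    cols.append(j)
                acc[j] += a_ik * b_val[q]

        cols.sort()  # emit the row with sorted column indices

        # Grow the buffers geometrically if the estimate was too small.
        needed = nnz + len(cols)
        if needed > len(c_idx):
            new_cap = max(needed, 2 * len(c_idx))
            c_idx.extend([0] * (new_cap - len(c_idx)))
            c_val.extend([0.0] * (new_cap - len(c_val)))

        for j in cols:
            c_idx[nnz] = j
            c_val[nnz] = acc[j]
            nnz += 1
        c_ptr[i + 1] = nnz

    # Trim unused capacity before returning.
    return c_ptr, c_idx[:nnz], c_val[:nnz]
```

&lt;P&gt;Doubling the capacity keeps the number of reallocations logarithmic in the final nnz, so a poor estimate costs some copying but never a second pass over the inputs.&lt;/P&gt;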

&lt;P&gt;&lt;SPAN style="font-size: 1em; line-height: 1.5;"&gt;Similarly, it would be useful to compute only the lower or upper triangular portion of the output matrix (of course, the output matrix has to be symmetric).&lt;/SPAN&gt;&lt;/P&gt;
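
&lt;P&gt;As a rough illustration of the triangular request, the Python sketch below keeps only the entries of row i whose column index is at least i, so roughly half of the accumulation and storage is skipped when the product is known to be symmetric. The function name and the 0-based CSR array names are hypothetical, and this is an assumption about how such an option could behave, not an existing MKL feature.&lt;/P&gt;

```python
def csr_multiply_upper(n, a_ptr, a_idx, a_val, b_ptr, b_idx, b_val):
    """Upper triangle (j >= i) of the n-by-n product C = A * B, 0-based CSR."""
    c_ptr, c_idx, c_val = [0], [], []
    for i in range(n):
        row = {}  # sparse accumulator for output row i
        for p in range(a_ptr[i], a_ptr[i + 1]):
            k = a_idx[p]
            for q in range(b_ptr[k], b_ptr[k + 1]):
                j = b_idx[q]
                if j >= i:  # skip the strictly lower triangle
                    row[j] = row.get(j, 0.0) + a_val[p] * b_val[q]
        for j in sorted(row):  # sorted column indices within the row
            c_idx.append(j)
            c_val.append(row[j])
        c_ptr.append(len(c_idx))
    return c_ptr, c_idx, c_val
```

&lt;P&gt;The caller is responsible for knowing that the product is symmetric (for example B equal to the transpose of A); the sketch does not check it.&lt;/P&gt;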

&lt;P&gt;&lt;SPAN style="font-size: 1em; line-height: 1.5;"&gt;Application domains: statistics, PDEs, inverse problems, weather prediction.&lt;/SPAN&gt;&lt;/P&gt;

&lt;P&gt;&lt;SPAN style="font-size: 1em; line-height: 1.5;"&gt;Thanks&lt;/SPAN&gt;&lt;/P&gt;

&lt;P&gt;Vineet&lt;/P&gt;</description>
      <pubDate>Tue, 09 Dec 2014 19:28:13 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Sparse-Sparse-Matrix-Multiplication/m-p/1017818#M19591</guid>
      <dc:creator>Vineet_Y_</dc:creator>
      <dc:date>2014-12-09T19:28:13Z</dc:date>
    </item>
    <item>
      <title>Vineet,</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Sparse-Sparse-Matrix-Multiplication/m-p/1017819#M19592</link>
      <description>&lt;P&gt;Vineet,&lt;/P&gt;

&lt;P&gt;The current API does not allow that, but we are considering such an option for the next API, which we are working on right now.&lt;/P&gt;

&lt;P&gt;--Gennady&lt;/P&gt;</description>
      <pubDate>Fri, 12 Dec 2014 08:22:45 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Sparse-Sparse-Matrix-Multiplication/m-p/1017819#M19592</guid>
      <dc:creator>Gennady_F_Intel</dc:creator>
      <dc:date>2014-12-12T08:22:45Z</dc:date>
    </item>
    <item>
      <title>Hi</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Sparse-Sparse-Matrix-Multiplication/m-p/1017820#M19593</link>
      <description>&lt;P&gt;Hi&lt;/P&gt;

&lt;P&gt;Gennady&lt;/P&gt;

&lt;P&gt;After spending a considerable amount of time last week, I came up with my own single-pass parallel solution, which is about 30 to 35% faster than mkl_dcsrmultcsr. It also works in a hybrid setup (MPI + OpenMP) and returns the output matrix in sorted form. Any help in optimizing the attached code (if possible; no obligation) would be greatly appreciated, since for large matrices it can save days of work.&lt;/P&gt;

&lt;P&gt;Vineet&lt;/P&gt;</description>
      <pubDate>Thu, 18 Dec 2014 00:07:51 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Sparse-Sparse-Matrix-Multiplication/m-p/1017820#M19593</guid>
      <dc:creator>Vineet_Y_</dc:creator>
      <dc:date>2014-12-18T00:07:51Z</dc:date>
    </item>
    <item>
      <title>30-35% of speedup -  thanks</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Sparse-Sparse-Matrix-Multiplication/m-p/1017821#M19594</link>
      <description>&lt;P&gt;30-35% of speedup - &amp;nbsp;thanks for sharing this. Are these any specific input matrixes? what is the typical size, nnz and type of this matrices? what is the CPU you are running this code?&amp;nbsp;&lt;/P&gt;

&lt;P&gt;&amp;nbsp;&lt;/P&gt;

&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 18 Dec 2014 05:08:47 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Sparse-Sparse-Matrix-Multiplication/m-p/1017821#M19594</guid>
      <dc:creator>Gennady_F_Intel</dc:creator>
      <dc:date>2014-12-18T05:08:47Z</dc:date>
    </item>
    <item>
      <title>Hi</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Sparse-Sparse-Matrix-Multiplication/m-p/1017822#M19595</link>
      <description>&lt;P&gt;Hi&lt;/P&gt;

&lt;P&gt;Gennady&lt;/P&gt;

&lt;P&gt;I tested the code by multiplying a 1070 × 1000000 matrix (nzmax = 85534075) by a 1000000 × 1070 matrix (nzmax = 85534075) on the following machines (I do not know the CPU generation of any of them):&lt;/P&gt;

&lt;OL&gt;
	&lt;LI&gt;2.8 GHz Intel Core i7, 16GB RAM&lt;/LI&gt;
	&lt;LI&gt;3.4 GHz Intel Core i7, 16GB RAM&lt;/LI&gt;
	&lt;LI&gt;2.80 GHz Intel(R) Xeon, 96GB RAM&lt;/LI&gt;
	&lt;LI&gt;2.4 GHz Intel Core i7, 8GB RAM&lt;/LI&gt;
&lt;/OL&gt;

&lt;P&gt;You can test the code on your own matrices. I still have to refactor it.&lt;/P&gt;

&lt;P&gt;Vineet&lt;/P&gt;</description>
      <pubDate>Thu, 18 Dec 2014 17:55:09 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Sparse-Sparse-Matrix-Multiplication/m-p/1017822#M19595</guid>
      <dc:creator>Vineet_Y_</dc:creator>
      <dc:date>2014-12-18T17:55:09Z</dc:date>
    </item>
    <item>
      <title>thanks Vineet. Your</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Sparse-Sparse-Matrix-Multiplication/m-p/1017823#M19596</link>
      <description>&lt;P&gt;thanks Vineet. Your suggestions looks reasonable to implement. I will bring this request to the MKL release board for consideration and may be for the implementation into one of the future versions.&amp;nbsp;&lt;/P&gt;

&lt;P&gt;regards, Gennady&lt;/P&gt;</description>
      <pubDate>Fri, 19 Dec 2014 11:24:33 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Sparse-Sparse-Matrix-Multiplication/m-p/1017823#M19596</guid>
      <dc:creator>Gennady_F_Intel</dc:creator>
      <dc:date>2014-12-19T11:24:33Z</dc:date>
    </item>
  </channel>
</rss>

