<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Two questions about 'iparm[1]=10' &amp; 'cluster_sparse_solver' speed in Intel® oneAPI Math Kernel Library</title>
    <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Two-questions-about-iparm-1-10-cluster-sparse-solver-speed/m-p/1062279#M21780</link>
    <description>&lt;P&gt;Hi dear Intel,&lt;BR /&gt;
	I'm a user of 'MKL 2017 Update 1' and 'MPICH 3.1.4'.&lt;BR /&gt;
	Recently I have been trying to solve a large SPD sparse system with on the order of 10^8 rows,&lt;BR /&gt;
	so I am struggling to reduce the processing time.&lt;/P&gt;

&lt;P&gt;The parameter value iparm[1]=10, newly introduced in MKL 2017, looks like it could help me.&lt;BR /&gt;
	However, I cannot find any example or documentation for this new parameter.&lt;/P&gt;

&lt;P&gt;I tried running the example code shipped with MKL with this new parameter applied, but the program stopped without any message.&lt;BR /&gt;
	Could you please show me a good example using 'iparm[1]=10'?&lt;/P&gt;

&lt;P&gt;Thank you very much in advance!!!&lt;/P&gt;

&lt;P&gt;Regards,&lt;BR /&gt;
	Yong-hee&lt;/P&gt;

&lt;P&gt;P.S. When solving a large SPD sparse matrix, 'cluster_sparse_solver_64' with MPI is much slower for me than OpenMP (which uses 'pardiso_64') with the same number of active cores.&lt;BR /&gt;
	Is this generally the case when solving a matrix with MPI?&lt;BR /&gt;
	And is there a way to make the solver faster with MPI than with OpenMP for very large matrices?&lt;/P&gt;</description>
    <pubDate>Fri, 06 Jan 2017 06:39:07 GMT</pubDate>
    <dc:creator>YONGHEE_L_</dc:creator>
    <dc:date>2017-01-06T06:39:07Z</dc:date>
    <item>
      <title>Two questions about 'iparm[1]=10' &amp; 'cluster_sparse_solver' speed</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Two-questions-about-iparm-1-10-cluster-sparse-solver-speed/m-p/1062279#M21780</link>
      <description>&lt;P&gt;Hi dear Intel,&lt;BR /&gt;
	I'm a user of 'MKL 2017 Update 1' and 'MPICH 3.1.4'.&lt;BR /&gt;
	Recently I have been trying to solve a large SPD sparse system with on the order of 10^8 rows,&lt;BR /&gt;
	so I am struggling to reduce the processing time.&lt;/P&gt;

&lt;P&gt;The parameter value iparm[1]=10, newly introduced in MKL 2017, looks like it could help me.&lt;BR /&gt;
	However, I cannot find any example or documentation for this new parameter.&lt;/P&gt;

&lt;P&gt;I tried running the example code shipped with MKL with this new parameter applied, but the program stopped without any message.&lt;BR /&gt;
	Could you please show me a good example using 'iparm[1]=10'?&lt;/P&gt;

&lt;P&gt;Thank you very much in advance!!!&lt;/P&gt;

&lt;P&gt;Regards,&lt;BR /&gt;
	Yong-hee&lt;/P&gt;

&lt;P&gt;P.S. When solving a large SPD sparse matrix, 'cluster_sparse_solver_64' with MPI is much slower for me than OpenMP (which uses 'pardiso_64') with the same number of active cores.&lt;BR /&gt;
	Is this generally the case when solving a matrix with MPI?&lt;BR /&gt;
	And is there a way to make the solver faster with MPI than with OpenMP for very large matrices?&lt;/P&gt;</description>
      <pubDate>Fri, 06 Jan 2017 06:39:07 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Two-questions-about-iparm-1-10-cluster-sparse-solver-speed/m-p/1062279#M21780</guid>
      <dc:creator>YONGHEE_L_</dc:creator>
      <dc:date>2017-01-06T06:39:07Z</dc:date>
    </item>
    <item>
      <title>Yong-hee, have you look at cl</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Two-questions-about-iparm-1-10-cluster-sparse-solver-speed/m-p/1062280#M21781</link>
      <description>&lt;P&gt;Yong-hee, have you looked at the cl_solver_unsym_distr_c.c example (in the mklroot\examples\cluster_sparse_solverc\source\ folder)? It shows the case where the initial data (matrix and rhs) are distributed between several MPI processes, and the final solution is distributed between the MPI processes in the same way as the initial data.&lt;/P&gt;</description>
      <pubDate>Sun, 08 Jan 2017 04:18:59 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Two-questions-about-iparm-1-10-cluster-sparse-solver-speed/m-p/1062280#M21781</guid>
      <dc:creator>Gennady_F_Intel</dc:creator>
      <dc:date>2017-01-08T04:18:59Z</dc:date>
    </item>
    <item>
      <title>At first, I saw the</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Two-questions-about-iparm-1-10-cluster-sparse-solver-speed/m-p/1062281#M21782</link>
      <description>&lt;P&gt;.&lt;/P&gt;</description>
      <pubDate>Tue, 10 Jan 2017 12:04:00 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Two-questions-about-iparm-1-10-cluster-sparse-solver-speed/m-p/1062281#M21782</guid>
      <dc:creator>YONGHEE_L_</dc:creator>
      <dc:date>2017-01-10T12:04:00Z</dc:date>
    </item>
    <item>
      <title>At first, I saw the</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Two-questions-about-iparm-1-10-cluster-sparse-solver-speed/m-p/1062282#M21783</link>
      <description>&lt;P&gt;At first, I read the article introducing 'iparm[1]=10' (&lt;A href="https://software.intel.com/en-us/articles/distributed-nested-dissection-algorithm-for-intel-mkl-parallel-direct-sparse-solver-for"&gt;https://software.intel.com/en-us/articles/distributed-nested-dissection-algorithm-for-intel-mkl-parallel-direct-sparse-solver-for&lt;/A&gt;).&lt;BR /&gt;
	I had misunderstood it to mean that 'iparm[1]=10' can separate the matrix without overlaps when iparm[40] and iparm[41] are specified on each node of the cluster.&lt;/P&gt;
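
&lt;P&gt;To illustrate what "separating the matrix without overlaps" means here, the following is a minimal sketch of assigning each MPI process a contiguous, non-overlapping block of rows. It is written in Python rather than the C of the MKL examples, it does not call MKL, and the helper name row_range and the 1-based inclusive indexing are assumptions made only for this illustration.&lt;/P&gt;

&lt;PRE&gt;
```python
def row_range(n_rows, n_ranks, rank):
    """Return the (first, last) 1-based inclusive row range owned by `rank`.

    Rows are split into contiguous, non-overlapping blocks. The first
    `n_rows % n_ranks` ranks receive one extra row each. If there are
    fewer rows than ranks, a rank may get an empty range, which shows
    up as first == last + 1.
    """
    base, extra = divmod(n_rows, n_ranks)
    # Rows owned by all lower ranks, plus one for 1-based indexing.
    first = rank * base + min(rank, extra) + 1
    # Rows owned by this rank and all lower ranks.
    last = (rank + 1) * base + min(rank + 1, extra)
    return first, last


if __name__ == "__main__":
    # Hypothetical sizes: 10 rows shared by 3 processes.
    for rank in range(3):
        first, last = row_range(10, 3, rank)
        print("rank", rank, "owns rows", first, "to", last)
```
&lt;/PRE&gt;

&lt;P&gt;As far as I understand the MKL documentation, iparm[40] and iparm[41] hold the beginning and end of each process's input domain when distributed input is selected, so each process would store its own first and last values there. Please check the indexing convention against the documentation before relying on this.&lt;/P&gt;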

&lt;P&gt;Now the code is working properly with these new parameters, and it shows very good results for memory usage, like the graph in the article introducing 'iparm[1]=10'.&lt;BR /&gt;
	The processing time, however, has increased considerably.&lt;BR /&gt;
	In particular, the reordering time is almost five times that obtained with 'cl_solver_unsym_distr_c.c' for a matrix with 36 million elements (140 s -&amp;gt; 680 s for reordering).&lt;/P&gt;

&lt;P&gt;Do you think I am missing something important that would improve my cluster code?&lt;BR /&gt;
	Thank you again in advance for your support. :)&lt;/P&gt;

&lt;P&gt;Regards,&lt;BR /&gt;
	Yong-hee&lt;/P&gt;</description>
      <pubDate>Tue, 10 Jan 2017 12:06:25 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Two-questions-about-iparm-1-10-cluster-sparse-solver-speed/m-p/1062282#M21783</guid>
      <dc:creator>YONGHEE_L_</dc:creator>
      <dc:date>2017-01-10T12:06:25Z</dc:date>
    </item>
  </channel>
</rss>

