<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic I used only elements from in Intel® oneAPI Math Kernel Library</title>
    <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Pardiso-Low-rank-Update-does-not-accelerate-the-decomposition/m-p/1134759#M25943</link>
    <description>&lt;P&gt;&lt;SPAN style="font-size: 13.008px;"&gt;I used only elements from the upper triangle of the matrix to build the perm vector. Now decomposition with low-rank update takes 0.146 seconds, versus 0.15 seconds without it (&lt;/SPAN&gt;a speedup of no more than 3%&lt;SPAN style="font-size: 13.008px;"&gt;). As I understand it, the complexity of decomposition is O(n^3) without low-rank update and O(n^2) with it, so I cannot understand why decomposition with low-rank update is still slow.&lt;BR /&gt;
	&lt;BR /&gt;
	Update: problem solved. The cause was an old version of the Intel MKL library.&lt;/SPAN&gt;&lt;/P&gt;</description>
    <pubDate>Wed, 06 Jun 2018 07:48:00 GMT</pubDate>
    <dc:creator>Limansky__Alexander</dc:creator>
    <dc:date>2018-06-06T07:48:00Z</dc:date>
    <item>
      <title>Pardiso Low rank Update does not accelerate the decomposition</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Pardiso-Low-rank-Update-does-not-accelerate-the-decomposition/m-p/1134756#M25940</link>
      <description>&lt;P&gt;Hello, I have a problem: the PARDISO low-rank update does not accelerate the decomposition. The usual PARDISO decomposition is faster than decomposition with low-rank update.&lt;BR /&gt;
	&lt;BR /&gt;
	I have a complex symmetric matrix with about 2 million nonzeros in the factor.&lt;BR /&gt;
	&lt;BR /&gt;
	Only 400 elements of the matrix were changed; the structure of the matrix was not changed.&lt;BR /&gt;
	&lt;BR /&gt;
	Initialization:&lt;/P&gt;

&lt;PRE class="brush:cpp;"&gt;	/* -------------------------------------------------------------------- */
	/* .. Setup Pardiso control parameters. */
	/* -------------------------------------------------------------------- */
	mtype = 6; /* Complex symmetric matrix */
	for (i = 0; i &amp;lt; 64; i++)
	{
		iparm[i] = 0;
	}
	iparm[0] = 1; // No solver default
	iparm[1] = 2; // Fill-in reducing ordering: nested dissection from METIS (3 selects the parallel OpenMP version)
	iparm[2] = 0; // Number of processors: taken from OMP_NUM_THREADS
	iparm[3] = 0; // No iterative-direct algorithm
	iparm[4] = 0; // No user fill-in reducing permutation
	iparm[5] = 0; // Write solution into x
	iparm[6] = 0; // Not in use
	iparm[7] = 0; // Max number of iterative refinement steps
	iparm[8] = 0; // Not in use
	iparm[9] = 8; // Perturb the pivot elements with 1E-8
	iparm[10] = 0; // Scaling (MPS) switched off
	iparm[11] = 0; // Not in use
	iparm[12] = 0; // Maximum weighted matching algorithm is switched off (default for symmetric). Try iparm[12] = 1 in case of inappropriate accuracy
	iparm[13] = 0; // Output: Number of perturbed pivots
	iparm[14] = 0; // Not in use
	iparm[15] = 0; // Not in use
	iparm[16] = 0; // Not in use
	iparm[17] = -1; // Output: Number of nonzeros in the factor LU
	iparm[18] = -1; // Output: Mflops for LU factorization
	iparm[19] = 0; // Output: Number of CG iterations
	iparm[20] = 1; // Apply 1x1 and 2x2 Bunch-Kaufman pivoting during the factorization process
	iparm[23] = 10; // PARDISO uses the improved two-level factorization algorithm
	iparm[24] = 2; // Parallel forward/backward solve control. Intel MKL PARDISO uses a parallel algorithm for the solve step.
	iparm[26] = 1; // Matrix checker on
	iparm[30] = 0; // No partial solution
	iparm[34] = 1; // Zero-based indexing

	maxfct = 1; // Maximum number of numerical factorizations.
	mnum = 1; // Which factorization to use.
	msglvl = 0; // Do not print statistical information
	error = 0; // Initialize error flag

	/* -------------------------------------------------------------------- */
	/* .. Initialize the internal solver memory pointer. This is only */
	/* necessary for the FIRST call of the PARDISO solver. */
	/* -------------------------------------------------------------------- */
	for (i = 0; i &amp;lt; 64; i++)
	{
		pt[i] = 0;
	}

	/* -------------------------------------------------------------------- */
	/* .. Reordering and Symbolic Factorization. This step also allocates */
	/* all memory that is necessary for the factorization. */
	/* -------------------------------------------------------------------- */

	phase = 11;
	PARDISO(pt, &amp;amp;maxfct, &amp;amp;mnum, &amp;amp;mtype, &amp;amp;phase,
		&amp;amp;nRows, complexValues, rowIndex, columns, &amp;amp;idum, &amp;amp;nRhs,
		iparm, &amp;amp;msglvl, &amp;amp;ddum, &amp;amp;ddum, &amp;amp;error);

	printf("\nReordering completed ...\n");
	printf("Number of nonzeros in factors = %d\n", iparm[17]);
	printf("Number of factorization MFLOPS = %d\n\n", iparm[18]);

	if (error != 0)
	{
		printf("\nERROR during symbolic factorization: %d", error);
		exit(1);
	}&lt;/PRE&gt;

&lt;P&gt;Then I decompose the original matrix:&lt;/P&gt;

&lt;PRE class="brush:cpp;"&gt;	// -------------------------------------------------------------------- 
	// .. Numerical factorization. 
	// -------------------------------------------------------------------- 

	phase = 22;
	PARDISO(pt, &amp;amp;maxfct, &amp;amp;mnum, &amp;amp;mtype, &amp;amp;phase,
		&amp;amp;nRows, complexValues, rowIndex, columns, &amp;amp;idum, &amp;amp;nRhs,
		iparm, &amp;amp;msglvl, &amp;amp;ddum, &amp;amp;ddum, &amp;amp;error);

	if (error != 0)
	{
		printf("\nERROR during numerical factorization: %d", error);
		exit(2);
	}&lt;/PRE&gt;

&lt;P&gt;After solving the system, I decompose again with the changed matrix elements:&lt;/P&gt;

&lt;PRE class="brush:cpp;"&gt;	// -------------------------------------------------------------------- 
	// .. Numerical factorization. 
	// -------------------------------------------------------------------- 

	phase = 22;
	iparm[38] = 1;

	PARDISO(pt, &amp;amp;maxfct, &amp;amp;mnum, &amp;amp;mtype, &amp;amp;phase,
		&amp;amp;nRows, complexValues, rowIndex, columns, perm, &amp;amp;nRhs,
		iparm, &amp;amp;msglvl, &amp;amp;ddum, &amp;amp;ddum, &amp;amp;error);

	if (error != 0)
	{
		printf("\nERROR during numerical factorization: %d", error);
		exit(2);
	}

	iparm[38] = 0;&lt;/PRE&gt;

&lt;P&gt;The perm vector contains the row and column indices of the changed elements over the whole matrix. But since my matrix is complex symmetric, should I build the perm vector only from the changed elements in the upper triangle?&lt;/P&gt;</description>
      <pubDate>Tue, 05 Jun 2018 07:57:58 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Pardiso-Low-rank-Update-does-not-accelerate-the-decomposition/m-p/1134756#M25940</guid>
      <dc:creator>Limansky__Alexander</dc:creator>
      <dc:date>2018-06-05T07:57:58Z</dc:date>
    </item>
    <item>
      <title>&gt;&gt; "The usual pardiso</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Pardiso-Low-rank-Update-does-not-accelerate-the-decomposition/m-p/1134757#M25941</link>
      <description>&lt;P&gt;&amp;gt;&amp;gt; "&lt;SPAN style="font-size: 12px;"&gt;The usual pardiso decomposition is faster than decomposition with low rank update"&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;

&lt;P&gt;&lt;SPAN style="font-size: 12px;"&gt;What execution time do you see for phase 22 in this case? What is the problem size (the size of this matrix)?&lt;/SPAN&gt;&lt;/P&gt;

&lt;P&gt;&amp;gt;&amp;gt;&amp;nbsp;&lt;SPAN style="font-size: 12px;"&gt;But I have complex symmetric matrix, should I build vector perm only with changed elements in upper triangle?&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;

&lt;P&gt;Yes, you should use only elements from the upper triangle of the matrix.&lt;/P&gt;
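
&lt;P&gt;A minimal sketch of that filtering, in case it helps. It assumes the changed entries were collected over the whole matrix (both mirror positions listed) and that perm is laid out as the count of changed entries followed by one (row, column) pair per entry, in the same index base as the matrix (iparm[34]); please check that layout against the reference for your MKL version. build_lowrank_perm is a hypothetical helper, not an MKL function, and plain int stands in for MKL_INT.&lt;/P&gt;

```c
/* Hypothetical helper, not part of Intel MKL: packs the changed entries of a
   symmetric matrix into a perm array for a low-rank update (iparm[38] = 1).
   Assumed layout, to be checked against the MKL reference for your version:
   perm[0] holds the number of changed entries, followed by one (row, column)
   pair per entry, in the same index base as the matrix (iparm[34]).
   perm must have room for 2 * nChanged + 1 integers. */
int build_lowrank_perm(const int *rows, const int *cols, int nChanged, int *perm)
{
	int k;
	int count = 0;

	for (k = 0; k != nChanged; ++k)
	{
		/* Keep only entries on or above the diagonal (upper triangle). */
		if (cols[k] >= rows[k])
		{
			perm[1 + 2 * count] = rows[k];
			perm[2 + 2 * count] = cols[k];
			++count;
		}
	}
	perm[0] = count;
	return count;
}
```

&lt;P&gt;With zero-based indexing, the changed positions (0,2), (2,0) and (1,1) would give perm = {2, 0, 2, 1, 1}.&lt;/P&gt;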

</description>
      <pubDate>Tue, 05 Jun 2018 18:48:12 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Pardiso-Low-rank-Update-does-not-accelerate-the-decomposition/m-p/1134757#M25941</guid>
      <dc:creator>Gennady_F_Intel</dc:creator>
      <dc:date>2018-06-05T18:48:12Z</dc:date>
    </item>
    <item>
      <title>&gt;&gt; "What is the the execution</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Pardiso-Low-rank-Update-does-not-accelerate-the-decomposition/m-p/1134758#M25942</link>
      <description>&lt;P&gt;&amp;gt;&amp;gt; "&lt;SPAN style="font-size: 12px;"&gt;What is the&amp;nbsp;the execution time for 22 phase do you see with this case? what is the problem size ( size of this matrix) ?&amp;nbsp;&lt;/SPAN&gt;"&lt;BR /&gt;
	The usual PARDISO decomposition takes about 0.15 s, and decomposition with low-rank update takes about 0.2 s. The matrix size is 64382.&lt;BR /&gt;
	&amp;gt;&amp;gt; "&lt;SPAN style="font-size: 12px;"&gt;yes, you should use only&amp;nbsp; elements from upper matrix.&lt;/SPAN&gt;"&lt;BR /&gt;
	Okay, I will try, thank you.&lt;/P&gt;</description>
      <pubDate>Tue, 05 Jun 2018 19:01:10 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Pardiso-Low-rank-Update-does-not-accelerate-the-decomposition/m-p/1134758#M25942</guid>
      <dc:creator>Limansky__Alexander</dc:creator>
      <dc:date>2018-06-05T19:01:10Z</dc:date>
    </item>
    <item>
      <title>I used only elements from</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Pardiso-Low-rank-Update-does-not-accelerate-the-decomposition/m-p/1134759#M25943</link>
      <description>&lt;P&gt;&lt;SPAN style="font-size: 13.008px;"&gt;I used only elements from the upper triangle of the matrix to build the perm vector. Now decomposition with low-rank update takes 0.146 seconds, versus 0.15 seconds without it (&lt;/SPAN&gt;a speedup of no more than 3%&lt;SPAN style="font-size: 13.008px;"&gt;). As I understand it, the complexity of decomposition is O(n^3) without low-rank update and O(n^2) with it, so I cannot understand why decomposition with low-rank update is still slow.&lt;BR /&gt;
	&lt;BR /&gt;
	Update: problem solved. The cause was an old version of the Intel MKL library.&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 06 Jun 2018 07:48:00 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Pardiso-Low-rank-Update-does-not-accelerate-the-decomposition/m-p/1134759#M25943</guid>
      <dc:creator>Limansky__Alexander</dc:creator>
      <dc:date>2018-06-06T07:48:00Z</dc:date>
    </item>
    <item>
      <title>then we have to have the</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Pardiso-Low-rank-Update-does-not-accelerate-the-decomposition/m-p/1134760#M25944</link>
      <description>&lt;P&gt;Then we need a reproducer for this case. Could you please share one? Thanks.&lt;/P&gt;</description>
      <pubDate>Thu, 07 Jun 2018 02:19:05 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Pardiso-Low-rank-Update-does-not-accelerate-the-decomposition/m-p/1134760#M25944</guid>
      <dc:creator>Gennady_F_Intel</dc:creator>
      <dc:date>2018-06-07T02:19:05Z</dc:date>
    </item>
  </channel>
</rss>

