- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Hello, I have a problem: Pardiso Low rank Update does not accelerate the decomposition. The usual pardiso decomposition is faster than decomposition with low rank update.
I have a complex symmetric matrix with number of nonzeros in factor about 2 millions.
Only 400 elements of matrix was changed, structure of matrix wasn't changed.
Initialization:
/* -------------------------------------------------------------------- */ /* .. Setup Pardiso control parameters. */ /* -------------------------------------------------------------------- */ mtype = 6; /* Complex symmetric matrix */ for (i = 0; i < 64; i++) { iparm = 0; } iparm[0] = 1; // No solver default */ //iparm[1] = 2; // Fill-in reordering from METIS iparm[1] = 2; //parallel(OpenMP) version of the nested dissection algorithm // Numbers of processors, value of OMP_NUM_THREADS iparm[2] = 0; iparm[3] = 0; // No iterative-direct algorithm iparm[4] = 0; // No user fill-in reducing permutation iparm[5] = 0; // Write solution into x iparm[6] = 0; // Not in use iparm[7] = 0; // Max numbers of iterative refinement steps iparm[8] = 0; // Not in use iparm[9] = 8; // Perturb the pivot elements with 1E-8 iparm[10] = 0; // Use nonsymmetric permutation and scaling MPS iparm[11] = 0; // Not in use iparm[12] = 0; // Maximum weighted matching algorithm is switched-off (default for symmetric). Try iparm[12] = 1 in case of inappropriate accuracy iparm[13] = 0; // Output: Number of perturbed pivots iparm[14] = 0; // Not in use iparm[15] = 0; // Not in use iparm[16] = 0; // Not in use iparm[17] = -1; // Output: Number of nonzeros in the factor LU iparm[18] = -1; // Output: Mflops for LU factorization iparm[19] = 0; // Output: Numbers of CG Iterations iparm[20] = 1; // Apply 1x1 and 2x2 Bunch and Kaufman pivoting during the factorization process iparm[23] = 10; // PARDISO uses new two - level factorization algorithm iparm[24] = 2; //Parallel forward/backward solve control. Intel MKL PARDISO uses a parallel algorithm for the solve step. iparm[26] = 1; iparm[30] = 0; // Partial solution iparm[34] = 1; //zero-based index maxfct = 1; // Maximum number of numerical factorizations. mnum = 1; // Which factorization to use. msglvl = 0; // Print statistical information in file error = 0; // Initialize error flag /* -------------------------------------------------------------------- */ /* .. Initialize the internal solver memory pointer. This is only */ /* necessary for the FIRST call of the PARDISO solver. */ /* -------------------------------------------------------------------- */ for (i = 0; i < 64; i++) { pt = 0; } /* -------------------------------------------------------------------- */ /* .. Reordering and Symbolic Factorization. This step also allocates */ /* all memory that is necessary for the factorization. */ /* -------------------------------------------------------------------- */ phase = 11; PARDISO(pt, &maxfct, &mnum, &mtype, &phase, &nRows, complexValues, rowIndex, columns, &idum, &nRhs, iparm, &msglvl, &ddum, &ddum, &error); printf("\nReordering completed ...\n"); printf("Number of nonzeros in factors = %d\n", iparm[17]); printf("Number of factorization MFLOPS = %d\n\n", iparm[18]); if (error != 0) { printf("\nERROR during symbolic factorization: %d", error); exit(1); }
Then, decompose with original matrix:
// -------------------------------------------------------------------- // .. Numerical factorization. // -------------------------------------------------------------------- phase = 22; PARDISO(pt, &maxfct, &mnum, &mtype, &phase, &nRows, complexValues, rowIndex, columns, &idum, &nRhs, iparm, &msglvl, &ddum, &ddum, &error); if (error != 0) { printf("\nERROR during numerical factorization: %d", error); exit(2); }
And after that after solving system, decompose with changed elements in matrix
// -------------------------------------------------------------------- // .. Numerical factorization. // -------------------------------------------------------------------- phase = 22; iparm[38] = 1; PARDISO(pt, &maxfct, &mnum, &mtype, &phase, &nRows, complexValues, rowIndex, columns, perm, &nRhs, iparm, &msglvl, &ddum, &ddum, &error); if (error != 0) { printf("\nERROR during numerical factorization: %d", error); exit(2); } iparm[38] = 0;
Vector perm contains row and columns indexes of changed elements in all matrix. But I have complex symmetric matrix, should I build vector perm only with changed elements in upper triangle?
Link Copied
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
>> "The usual pardiso decomposition is faster than decomposition with low rank update"
What is the the execution time for 22 phase do you see with this case? what is the problem size ( size of this matrix) ?
>> But I have complex symmetric matrix, should I build vector perm only with changed elements in upper triangle?
yes, you should use only elements from upper matrix.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
>> "What is the the execution time for 22 phase do you see with this case? what is the problem size ( size of this matrix) ? "
Execution time of usual pardiso decomposition about 0.15 sec and with low-rank update about 0.2 sec. Size of matrix is 64382.
>> "yes, you should use only elements from upper matrix."
Okay, I will try, thank you.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
I used only elements from upper triangle of matrix to build Perm vector. Now decomposition with low rank takes 0.146 seconds, when without low rank 0.15 seconds (the acceleration is not more than 3%). Although complexity of decomposition without low rank O (n ^ 3), and with low rank O (n ^ 2), so I cannot understand why decomposition with low rank update still slow.
Updated: problem solved, old version of Intel MKL library.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
then we have to have the reproducer of this case. could you please share. thanks.
- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page