<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic performance of MKL pardiso in Intel® oneAPI Math Kernel Library</title>
    <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/performance-of-MKL-pardiso/m-p/815338#M4257</link>
    <description>I see a significant reason for the slowness of the solver. You have about 3 million nonzeros in A, but roughly 90 times that number of nonzeros in the L and U factors of A. Solving N equations with Z nonzero entries may be estimated to take on the order of N*Z operations, but the fill-in is causing a drastic slowdown.&lt;BR /&gt;&lt;BR /&gt;This suggests it is worth examining whether the equations have a band structure and whether the bandwidth can be reduced by reordering.&lt;BR /&gt;&lt;BR /&gt;Some background on how the equations originated may help.</description>
    <pubDate>Sat, 15 Oct 2011 14:33:45 GMT</pubDate>
    <dc:creator>mecej4</dc:creator>
    <dc:date>2011-10-15T14:33:45Z</dc:date>
    <item>
      <title>performance of MKL pardiso</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/performance-of-MKL-pardiso/m-p/815331#M4250</link>
      <description>I have a large sparse system of nonlinear equations. I use the in-core PARDISO direct solver from MKL 10.3 (evaluation version) to solve a symmetric set (coefficient matrix) and a non-symmetric set (Jacobian matrix).&lt;BR /&gt;
&lt;BR /&gt;
My test model has coefficient and Jacobian matrices of size 370000x370000. In a single iteration, the two sets take around 36 and 65 sec respectively for phases 11, 22 and 33. The system configuration is a 4-core Xeon at 3.46 GHz with 20GB RAM.&lt;BR /&gt;
&lt;BR /&gt;
I would like to solve one iteration of a model of at least 1 million x 1 million in 60 sec, including both the symmetric and unsymmetric sets. When I ran Ansys on the same system with this size, it took barely 30 sec to solve 4 iterations (each iteration includes one symmetric and one unsymmetric set). Any suggestions for improving the performance will be greatly appreciated.&lt;BR /&gt;
&lt;BR /&gt;
Thanks,&lt;BR /&gt;
Ashish</description>
      <pubDate>Wed, 12 Oct 2011 09:36:41 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/performance-of-MKL-pardiso/m-p/815331#M4250</guid>
      <dc:creator>negi__ashish</dc:creator>
      <dc:date>2011-10-12T09:36:41Z</dc:date>
    </item>
    <item>
      <title>performance of MKL pardiso</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/performance-of-MKL-pardiso/m-p/815332#M4251</link>
      <description>&lt;P&gt;Hi Ashish,&lt;BR /&gt;&lt;BR /&gt;Please make sure you are running the threaded version of MKL. You should link with the libmkl_intel_thread library for this; please refer to the MKL Link Line Advisor:&lt;BR /&gt;&lt;A href="http://software.intel.com/en-us/articles/intel-mkl-link-line-advisor/"&gt;http://software.intel.com/en-us/articles/intel-mkl-link-line-advisor/&lt;/A&gt;&lt;BR /&gt;&lt;BR /&gt;You can also check how many threads PARDISO used by setting msglvl=1.&lt;BR /&gt;&lt;BR /&gt;Regards,&lt;BR /&gt;Konstantin&lt;/P&gt;</description>
      <pubDate>Wed, 12 Oct 2011 09:42:25 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/performance-of-MKL-pardiso/m-p/815332#M4251</guid>
      <dc:creator>Konstantin_A_Intel</dc:creator>
      <dc:date>2011-10-12T09:42:25Z</dc:date>
    </item>
    <item>
      <title>performance of MKL pardiso</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/performance-of-MKL-pardiso/m-p/815333#M4252</link>
      <description>Hi,&lt;DIV&gt;Could you set msglvl to 1 and post the PARDISO output in this topic? It would help us give you some advice.&lt;/DIV&gt;&lt;DIV&gt;With best regards,&lt;/DIV&gt;&lt;DIV&gt;Alexander Kalinkin&lt;/DIV&gt;</description>
      <pubDate>Wed, 12 Oct 2011 09:43:20 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/performance-of-MKL-pardiso/m-p/815333#M4252</guid>
      <dc:creator>Alexander_K_Intel2</dc:creator>
      <dc:date>2011-10-12T09:43:20Z</dc:date>
    </item>
    <item>
      <title>performance of MKL pardiso</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/performance-of-MKL-pardiso/m-p/815334#M4253</link>
      <description>Hello,&lt;BR /&gt;&lt;BR /&gt;I am using threaded MKL and my OS is Windows XP. I have the following libs included in my project:&lt;BR /&gt;mkl_solver_lp64.lib&lt;BR /&gt;mkl_intel_lp64.lib&lt;BR /&gt;mkl_intel_thread.lib&lt;BR /&gt;mkl_core.lib&lt;BR /&gt;libiomp5md.lib&lt;BR /&gt;&lt;BR /&gt;Here is the output for the unsymmetric Jacobian:&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;B&gt;================ PARDISO: solving a real struct. sym. system ================&lt;BR /&gt;The local (internal) PARDISO version is : 103000116&lt;BR /&gt;1-based array indexing is turned ON&lt;BR /&gt;PARDISO double precision computation is turned ON&lt;BR /&gt;METIS algorithm at reorder step is turned ON&lt;BR /&gt;Scaling is turned ON&lt;BR /&gt;Matching is turned ON&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;Summary: ( reordering phase )&lt;BR /&gt;================&lt;BR /&gt;&lt;BR /&gt;Times:&lt;BR /&gt;======&lt;BR /&gt;Time spent in calculations of symmetric matrix portrait(fulladj): 0.009621 s&lt;BR /&gt;Time spent in reordering of the initial matrix(reorder) : 2.682449 s&lt;BR /&gt;Time spent in symbolic factorization(symbfct) : 0.701643 s&lt;BR /&gt;Time spent in data preparations for factorization(parlist) : 0.033879 s&lt;BR /&gt;Time spent in allocation of internal data structures(malloc) : 0.066773 s&lt;BR /&gt;Time spent in additional calculations : 0.336471 s&lt;BR /&gt;Total time spent : 3.830835 s&lt;BR /&gt;&lt;BR /&gt;Statistics:&lt;BR /&gt;===========&lt;BR /&gt;&amp;lt; Parallel Direct Factorization with #processors: &amp;gt; 6&lt;BR /&gt;&amp;lt; Numerical Factorization with BLAS3 and O(n) synchronization &amp;gt;&lt;BR /&gt;&lt;BR /&gt;&amp;lt; Linear system Ax = b&amp;gt;&lt;BR /&gt; #equations: 372745&lt;BR /&gt; #non-zeros in A: 5495763&lt;BR /&gt; non-zeros in A (): 0.003956&lt;BR /&gt;&lt;BR /&gt; #right-hand sides: 1&lt;BR /&gt;&lt;BR /&gt;&amp;lt; Factors L and U &amp;gt;&lt;BR /&gt; #columns for each panel: 72&lt;BR /&gt; #independent subgraphs: 0&lt;BR 
/&gt;&amp;lt; Preprocessing with state of the art partitioning metis&amp;gt;&lt;BR /&gt; #supernodes: 154074&lt;BR /&gt; size of largest supernode: 3960&lt;BR /&gt; number of nonzeros in L 268393909&lt;BR /&gt; number of nonzeros in U 260128570&lt;BR /&gt; number of nonzeros in L+U 528522479&lt;BR /&gt;=== PARDISO is running in In-Core mode, because iparam(60)=0 ===&lt;BR /&gt;Percentage of computed non-zeros for LL^T factorization&lt;BR /&gt;0 %&lt;BR /&gt;1 %&lt;BR /&gt;2 %&lt;BR /&gt;3 %&lt;BR /&gt;4 %&lt;BR /&gt;5 %&lt;BR /&gt;6 %&lt;BR /&gt;7 %&lt;BR /&gt;8 %&lt;BR /&gt;9 %&lt;BR /&gt;10 %&lt;BR /&gt;11 %&lt;BR /&gt;12 %&lt;BR /&gt;13 %&lt;BR /&gt;14 %&lt;BR /&gt;15 %&lt;BR /&gt;16 %&lt;BR /&gt;17 %&lt;BR /&gt;18 %&lt;BR /&gt;19 %&lt;BR /&gt;20 %&lt;BR /&gt;21 %&lt;BR /&gt;22 %&lt;BR /&gt;23 %&lt;BR /&gt;24 %&lt;BR /&gt;25 %&lt;BR /&gt;26 %&lt;BR /&gt;27 %&lt;BR /&gt;28 %&lt;BR /&gt;29 %&lt;BR /&gt;30 %&lt;BR /&gt;31 %&lt;BR /&gt;32 %&lt;BR /&gt;33 %&lt;BR /&gt;34 %&lt;BR /&gt;35 %&lt;BR /&gt;36 %&lt;BR /&gt;37 %&lt;BR /&gt;38 %&lt;BR /&gt;39 %&lt;BR /&gt;40 %&lt;BR /&gt;41 %&lt;BR /&gt;42 %&lt;BR /&gt;43 %&lt;BR /&gt;44 %&lt;BR /&gt;45 %&lt;BR /&gt;46 %&lt;BR /&gt;47 %&lt;BR /&gt;48 %&lt;BR /&gt;49 %&lt;BR /&gt;50 %&lt;BR /&gt;51 %&lt;BR /&gt;52 %&lt;BR /&gt;53 %&lt;BR /&gt;54 %&lt;BR /&gt;55 %&lt;BR /&gt;56 %&lt;BR /&gt;57 %&lt;BR /&gt;58 %&lt;BR /&gt;59 %&lt;BR /&gt;60 %&lt;BR /&gt;61 %&lt;BR /&gt;62 %&lt;BR /&gt;63 %&lt;BR /&gt;64 %&lt;BR /&gt;65 %&lt;BR /&gt;66 %&lt;BR /&gt;67 %&lt;BR /&gt;68 %&lt;BR /&gt;69 %&lt;BR /&gt;70 %&lt;BR /&gt;71 %&lt;BR /&gt;73 %&lt;BR /&gt;74 %&lt;BR /&gt;75 %&lt;BR /&gt;76 %&lt;BR /&gt;77 %&lt;BR /&gt;78 %&lt;BR /&gt;79 %&lt;BR /&gt;80 %&lt;BR /&gt;81 %&lt;BR /&gt;82 %&lt;BR /&gt;83 %&lt;BR /&gt;84 %&lt;BR /&gt;85 %&lt;BR /&gt;86 %&lt;BR /&gt;87 %&lt;BR /&gt;88 %&lt;BR /&gt;89 %&lt;BR /&gt;90 %&lt;BR /&gt;91 %&lt;BR /&gt;92 %&lt;BR /&gt;93 %&lt;BR /&gt;94 %&lt;BR /&gt;95 %&lt;BR /&gt;96 %&lt;BR /&gt;97 %&lt;BR /&gt;98 %&lt;BR 
/&gt;99 %&lt;BR /&gt;100 %&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;================ PARDISO: solving a real struct. sym. system ================&lt;BR /&gt;Single-level factorization algorithm is turned ON&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;Summary: ( factorization phase )&lt;BR /&gt;================&lt;BR /&gt;&lt;BR /&gt;Times:&lt;BR /&gt;======&lt;BR /&gt;Time spent in copying matrix to internal data structure(A to LU): 0.000000 s&lt;BR /&gt;Time spent in factorization step(numfct) : 51.146111 s&lt;BR /&gt;Time spent in allocation of internal data structures(malloc) : 0.000339 s&lt;BR /&gt;Time spent in additional calculations : 0.000001 s&lt;BR /&gt;Total time spent : 51.146451 s&lt;BR /&gt;&lt;BR /&gt;Statistics:&lt;BR /&gt;===========&lt;BR /&gt;&amp;lt; Parallel Direct Factorization with #processors: &amp;gt; 6&lt;BR /&gt;&amp;lt; Numerical Factorization with BLAS3 and O(n) synchronization &amp;gt;&lt;BR /&gt;&lt;BR /&gt;&amp;lt; Linear system Ax = b&amp;gt;&lt;BR /&gt; #equations: 372745&lt;BR /&gt; #non-zeros in A: 5495763&lt;BR /&gt; non-zeros in A (): 0.003956&lt;BR /&gt;&lt;BR /&gt; #right-hand sides: 1&lt;BR /&gt;&lt;BR /&gt;&amp;lt; Factors L and U &amp;gt;&lt;BR /&gt; #columns for each panel: 72&lt;BR /&gt; #independent subgraphs: 0&lt;BR /&gt;&amp;lt; Preprocessing with state of the art partitioning metis&amp;gt;&lt;BR /&gt; #supernodes: 154074&lt;BR /&gt; size of largest supernode: 3960&lt;BR /&gt; number of nonzeros in L 268393909&lt;BR /&gt; number of nonzeros in U 260128570&lt;BR /&gt; number of nonzeros in L+U 528522479&lt;BR /&gt; gflop for the numerical factorization: 1708.338114&lt;BR /&gt;&lt;BR /&gt; gflop/s for the numerical factorization: 33.401134&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;================ PARDISO: solving a real struct. sym. 
system ================&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;Summary: ( solution phase )&lt;BR /&gt;================&lt;BR /&gt;&lt;BR /&gt;Times:&lt;BR /&gt;======&lt;BR /&gt;Time spent in direct solver at solve step (solve) : 0.316364 s&lt;BR /&gt;Time spent in additional calculations : 0.631197 s&lt;BR /&gt;Total time spent : 0.947561 s&lt;BR /&gt;&lt;BR /&gt;Statistics:&lt;BR /&gt;===========&lt;BR /&gt;&amp;lt; Parallel Direct Factorization with #processors: &amp;gt; 6&lt;BR /&gt;&amp;lt; Numerical Factorization with BLAS3 and O(n) synchronization &amp;gt;&lt;BR /&gt;&lt;BR /&gt;&amp;lt; Linear system Ax = b&amp;gt;&lt;BR /&gt; #equations: 372745&lt;BR /&gt; #non-zeros in A: 5495763&lt;BR /&gt; non-zeros in A (): 0.003956&lt;BR /&gt;&lt;BR /&gt; #right-hand sides: 1&lt;BR /&gt;&lt;BR /&gt;&amp;lt; Factors L and U &amp;gt;&lt;BR /&gt; #columns for each panel: 72&lt;BR /&gt; #independent subgraphs: 0&lt;BR /&gt;&amp;lt; Preprocessing with state of the art partitioning metis&amp;gt;&lt;BR /&gt; #supernodes: 154074&lt;BR /&gt; size of largest supernode: 3960&lt;BR /&gt; number of nonzeros in L 268393909&lt;BR /&gt; number of nonzeros in U 260128570&lt;BR /&gt; number of nonzeros in L+U 528522479&lt;BR /&gt; gflop for the numerical factorization: 1708.338114&lt;BR /&gt;&lt;BR /&gt; gflop/s for the numerical factorization: 33.401134&lt;/B&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;I could not print the same for the symmetric coefficient matrix, as the output length exceeded the limit of the Windows command window. Here is the partial output. 
I will try some other way to print it.&lt;BR /&gt;&lt;BR /&gt;&lt;B&gt;================&lt;BR /&gt;&lt;BR /&gt;Times:&lt;BR /&gt;======&lt;BR /&gt;Time spent in copying matrix to internal data structure(A to LU): 0.000000 s&lt;BR /&gt;Time spent in factorization step(numfct) : 23.408378 s&lt;BR /&gt;Time spent in allocation of internal data structures(malloc) : 0.000514 s&lt;BR /&gt;Time spent in additional calculations : 0.000001 s&lt;BR /&gt;Total time spent : 23.408893 s&lt;BR /&gt;&lt;BR /&gt;Statistics:&lt;BR /&gt;===========&lt;BR /&gt;&amp;lt; Parallel Direct Factorization with #processors: &amp;gt; 6&lt;BR /&gt;&amp;lt; Numerical Factorization with BLAS3 and O(n) synchronization &amp;gt;&lt;BR /&gt;&lt;BR /&gt;&amp;lt; Linear system Ax = b&amp;gt;&lt;BR /&gt; #equations: 372745&lt;BR /&gt; #non-zeros in A: 2934254&lt;BR /&gt; non-zeros in A (): 0.002112&lt;BR /&gt;&lt;BR /&gt; #right-hand sides: 1&lt;BR /&gt;&lt;BR /&gt;&amp;lt; Factors L and U &amp;gt;&lt;BR /&gt; #columns for each panel: 96&lt;BR /&gt; #independent subgraphs: 0&lt;BR /&gt;&amp;lt; Preprocessing with state of the art partitioning metis&amp;gt;&lt;BR /&gt; #supernodes: 153764&lt;BR /&gt; size of largest supernode: 3960&lt;BR /&gt; number of nonzeros in L 269145349&lt;BR /&gt; number of nonzeros in U 1&lt;BR /&gt; number of nonzeros in L+U 269145350&lt;BR /&gt; gflop for the numerical factorization: 869.989754&lt;BR /&gt;&lt;BR /&gt; gflop/s for the numerical factorization: 37.165742&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;================ PARDISO: solving a symm. posit. def. 
system ================&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;Summary: ( solution phase )&lt;BR /&gt;================&lt;BR /&gt;&lt;BR /&gt;Times:&lt;BR /&gt;======&lt;BR /&gt;Time spent in direct solver at solve step (solve) : 0.298444 s&lt;BR /&gt;Time spent in additional calculations : 0.619958 s&lt;BR /&gt;Total time spent : 0.918402 s&lt;BR /&gt;&lt;BR /&gt;Statistics:&lt;BR /&gt;===========&lt;BR /&gt;&amp;lt; Parallel Direct Factorization with #processors: &amp;gt; 6&lt;BR /&gt;&amp;lt; Numerical Factorization with BLAS3 and O(n) synchronization &amp;gt;&lt;BR /&gt;&lt;BR /&gt;&amp;lt; Linear system Ax = b&amp;gt;&lt;BR /&gt; #equations: 372745&lt;BR /&gt; #non-zeros in A: 2934254&lt;BR /&gt; non-zeros in A (): 0.002112&lt;BR /&gt;&lt;BR /&gt; #right-hand sides: 1&lt;BR /&gt;&lt;BR /&gt;&amp;lt; Factors L and U &amp;gt;&lt;BR /&gt; #columns for each panel: 96&lt;BR /&gt; #independent subgraphs: 0&lt;BR /&gt;&amp;lt; Preprocessing with state of the art partitioning metis&amp;gt;&lt;BR /&gt; #supernodes: 153764&lt;BR /&gt; size of largest supernode: 3960&lt;BR /&gt; number of nonzeros in L 269145349&lt;BR /&gt; number of nonzeros in U 1&lt;BR /&gt; number of nonzeros in L+U 269145350&lt;BR /&gt; gflop for the numerical factorization: 869.989754&lt;BR /&gt;&lt;BR /&gt; gflop/s for the numerical factorization: 37.165742&lt;/B&gt;</description>
      <pubDate>Wed, 12 Oct 2011 11:08:49 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/performance-of-MKL-pardiso/m-p/815334#M4253</guid>
      <dc:creator>negi__ashish</dc:creator>
      <dc:date>2011-10-12T11:08:49Z</dc:date>
    </item>
    <item>
      <title>performance of MKL pardiso</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/performance-of-MKL-pardiso/m-p/815335#M4254</link>
      <description>Hi,&lt;BR /&gt;&lt;BR /&gt;I am copying the part of the code that sets the input and control flags for both the symmetric and unsymmetric solvers. This may be helpful.&lt;BR /&gt;&lt;BR /&gt;Thanks,&lt;BR /&gt;Ashish&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;INPUT AND CONTROL FLAG FOR SYMMETRIC&lt;BR /&gt;&lt;BR /&gt;&lt;B&gt;MKL_INT l_nMTYPE = 2; /* Real symmetric positive definite matrix */&lt;BR /&gt; /* RHS and solution vectors. */&lt;BR /&gt; MKL_INT l_nNRHS = 1; /* Number of right hand sides. */&lt;BR /&gt; /* Internal solver memory pointer l_pMemPt, */&lt;BR /&gt; /* 32-bit: int l_pMemPt[64]; 64-bit: long int l_pMemPt[64] */&lt;BR /&gt; /* or void *l_pMemPt[64] should be OK on both architectures */&lt;BR /&gt;#ifdef WIN64&lt;BR /&gt; long int l_pMemPt[64];&lt;BR /&gt;#else&lt;BR /&gt; int l_pMemPt[64];&lt;BR /&gt;#endif&lt;BR /&gt; /* Pardiso control parameters. */&lt;BR /&gt; MKL_INT l_nIPARM[64];&lt;BR /&gt; MKL_INT l_nMAXFCT, l_nMNUM, l_nPHASE, l_nERROR, l_nMSGLVL;&lt;BR /&gt; /* Auxiliary variables. */&lt;BR /&gt; double l_dDDUM; /* Double dummy */&lt;BR /&gt; MKL_INT l_nIDUM; /* Integer dummy. */&lt;BR /&gt;/* -------------------------------------------------------------------- */&lt;BR /&gt;/* .. Setup Pardiso control parameters. 
*/&lt;BR /&gt;/* -------------------------------------------------------------------- */&lt;BR /&gt; for(int i = 0; i &amp;lt; 64; i++) &lt;BR /&gt;  l_nIPARM[i] = 0;&lt;BR /&gt; l_nIPARM[0] = 1; /* No solver default */&lt;BR /&gt; l_nIPARM[1] = 2; /* Fill-in reordering from METIS */&lt;BR /&gt; /* Numbers of processors, value of OMP_NUM_THREADS */&lt;BR /&gt; l_nIPARM[2] = 0; //use all the available processors&lt;BR /&gt; l_nIPARM[3] = 0; /* No iterative-direct algorithm */&lt;BR /&gt; l_nIPARM[4] = 0; /* No user fill-in reducing permutation */&lt;BR /&gt; l_nIPARM[5] = 0; /* Write solution into x */&lt;BR /&gt; l_nIPARM[6] = 0; /* Not in use */&lt;BR /&gt; l_nIPARM[7] = 2; /* Max numbers of iterative refinement steps */&lt;BR /&gt; l_nIPARM[8] = 0; /* Not in use */&lt;BR /&gt; l_nIPARM[9] = 13; /* Perturb the pivot elements with 1E-13 */&lt;BR /&gt; l_nIPARM[10] = 1; /* Use nonsymmetric permutation and scaling MPS */&lt;BR /&gt; l_nIPARM[11] = 0; /* Not in use */&lt;BR /&gt; l_nIPARM[12] = 0; /* Maximum weighted matching algorithm is switched-off (default for symmetric). Try l_nIPARM[12] = 1 in case of inappropriate accuracy */&lt;BR /&gt; l_nIPARM[13] = 0; /* Output: Number of perturbed pivots */&lt;BR /&gt; l_nIPARM[14] = 0; /* Not in use */&lt;BR /&gt; l_nIPARM[15] = 0; /* Not in use */&lt;BR /&gt; l_nIPARM[16] = 0; /* Not in use */&lt;BR /&gt; l_nIPARM[17] = 0; /* Output: Number of nonzeros in the factor LU */&lt;BR /&gt; l_nIPARM[18] = 0; /* Output: Mflops for LU factorization */&lt;BR /&gt; l_nIPARM[19] = 0; /* Output: Numbers of CG Iterations */&lt;BR /&gt;&lt;BR /&gt; l_nMAXFCT = 1; /* Maximum number of numerical factorizations. */&lt;BR /&gt; l_nMNUM = 1; /* Which factorization to use. */&lt;BR /&gt; l_nMSGLVL = 0; /* Do not print statistical information */&lt;BR /&gt; l_nERROR = 0; /* Initialize l_nERROR flag */&lt;BR /&gt;/* -------------------------------------------------------------------- */&lt;BR /&gt;/* .. 
Initialize the internal solver memory pointer. This is only */&lt;BR /&gt;/* necessary for the FIRST call of the PARDISO solver. */&lt;BR /&gt;/* -------------------------------------------------------------------- */&lt;BR /&gt; for (int i = 0; i &amp;lt; 64; i++) &lt;BR /&gt;  l_pMemPt[i] = 0;&lt;BR /&gt;/* -------------------------------------------------------------------- */&lt;BR /&gt;/* .. Reordering and Symbolic Factorization. This step also allocates */&lt;BR /&gt;/* all memory that is necessary for the factorization. */&lt;BR /&gt;/* -------------------------------------------------------------------- */&lt;BR /&gt; l_nPHASE = 11;&lt;BR /&gt; PARDISO (l_pMemPt, &amp;amp;l_nMAXFCT, &amp;amp;l_nMNUM, &amp;amp;l_nMTYPE, &amp;amp;l_nPHASE,&lt;BR /&gt;  &amp;amp;m_nNodeCnt, l_pAX, l_pRowIdx, l_pCol, &amp;amp;l_nIDUM, &amp;amp;l_nNRHS,&lt;BR /&gt;  l_nIPARM, &amp;amp;l_nMSGLVL, &amp;amp;l_dDDUM, &amp;amp;l_dDDUM, &amp;amp;l_nERROR);&lt;/B&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;INPUT AND CONTROL FLAG FOR UNSYMMETRIC&lt;BR /&gt;&lt;B&gt;/* -------------------------------------------------------------------- */&lt;BR /&gt;/* .. Initialize the internal solver memory pointer. This is only */&lt;BR /&gt;/* necessary for the FIRST call of the PARDISO solver. */&lt;BR /&gt;/* -------------------------------------------------------------------- */&lt;BR /&gt; m_pMemPt = new MEMPTR[64];&lt;BR /&gt; for (int i = 0; i &amp;lt; 64; i++) &lt;BR /&gt;  m_pMemPt[i] = 0;&lt;BR /&gt;&lt;BR /&gt; m_nMTYPE = 1; /* Real structurally symmetric matrix */&lt;BR /&gt; /* RHS and solution vectors. */&lt;BR /&gt; m_nNRHS = 1; /* Number of right hand sides. */&lt;BR /&gt; /* Pardiso control parameters. */&lt;BR /&gt; m_nIPARM = new MKL_INT[64];&lt;BR /&gt;&lt;BR /&gt; /* -------------------------------------------------------------------- */&lt;BR /&gt;/* .. Pardiso control parameters. 
*/&lt;BR /&gt;/* -------------------------------------------------------------------- */&lt;BR /&gt; for(int i = 0; i &amp;lt; 64; i++) &lt;BR /&gt;  m_nIPARM[i] = 0;&lt;BR /&gt; m_nIPARM[0] = 1; /* No solver default */&lt;BR /&gt; m_nIPARM[1] = 2; /* Fill-in reordering: the solver uses the nested dissection algorithm from METIS */&lt;BR /&gt; /* Numbers of processors, value of MKL_NUM_THREADS. If the variable MKL_NUM_THREADS is not defined, then the solver uses all available processors */&lt;BR /&gt; m_nIPARM[2] = 0; //currently is not used&lt;BR /&gt; m_nIPARM[3] = 0; /* No iterative-direct algorithm */&lt;BR /&gt; m_nIPARM[4] = 0; /* No user fill-in reducing permutation */&lt;BR /&gt; m_nIPARM[5] = 0; /* Write solution into x */&lt;BR /&gt; m_nIPARM[6] = 0; /* Not in use */&lt;BR /&gt; m_nIPARM[7] = 2; /* Max numbers of iterative refinement steps */&lt;BR /&gt; m_nIPARM[8] = 0; /* This parameter is reserved for future use. Its value must be set to 0 */&lt;BR /&gt; m_nIPARM[9] = 13; /* Perturb the pivot elements with 1E-13 */&lt;BR /&gt; m_nIPARM[10] = 1; /* Use nonsymmetric permutation and scaling MPS */&lt;BR /&gt; m_nIPARM[11] = 0; /* PARDISO solves a linear system Ax = b (default value). */&lt;BR /&gt; m_nIPARM[12] = 1; /* Maximum weighted matching algorithm is switched on */&lt;BR /&gt; m_nIPARM[13] = 0; /* Output: Number of perturbed pivots */&lt;BR /&gt; m_nIPARM[14] = 0; /* Not in use */&lt;BR /&gt; m_nIPARM[15] = 0; /* Not in use */&lt;BR /&gt; m_nIPARM[16] = 0; /* Not in use */&lt;BR /&gt; m_nIPARM[17] = 0; /* Output: Number of nonzeros in the factor LU */&lt;BR /&gt; m_nIPARM[18] = 0; /* Output: Mflops for LU factorization */&lt;BR /&gt; m_nIPARM[19] = 0; /* Output: Numbers of CG Iterations */&lt;BR /&gt;&lt;BR /&gt; m_nMAXFCT = 1; /* Maximum number of numerical factorizations. */&lt;BR /&gt; m_nMNUM = 1; /* Which factorization to use. 
*/&lt;BR /&gt; m_nMSGLVL = 0; /* Do not print statistical information */&lt;BR /&gt; m_nERROR = 0; /* Initialize m_nERROR flag */&lt;BR /&gt;&lt;BR /&gt; m_nPHASE = 11;&lt;BR /&gt;  PARDISO (m_pMemPt, &amp;amp;m_nMAXFCT, &amp;amp;m_nMNUM, &amp;amp;m_nMTYPE, &amp;amp;m_nPHASE,&lt;BR /&gt;   &amp;amp;m_nXSize, m_dAX, m_nRowIdx, m_nCol, &amp;amp;m_nIDUM, &amp;amp;m_nNRHS,&lt;BR /&gt;   m_nIPARM, &amp;amp;m_nMSGLVL, &amp;amp;m_dDDUM, &amp;amp;m_dDDUM, &amp;amp;m_nERROR);&lt;BR /&gt;&lt;/B&gt;</description>
      <pubDate>Wed, 12 Oct 2011 14:14:46 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/performance-of-MKL-pardiso/m-p/815335#M4254</guid>
      <dc:creator>negi__ashish</dc:creator>
      <dc:date>2011-10-12T14:14:46Z</dc:date>
    </item>
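    <!--
    A minimal sketch, not taken from the thread, of the remark in the posted code comments that "void *l_pMemPt[64] should be OK on both architectures". On 64-bit Windows a long int is still 32 bits wide, so an array of pointer-sized entries is the safer way to hold the 64-entry PARDISO handle. The names kHandleCount and zero_handles are hypothetical, and only the standard library is assumed.

```cpp
#include <cstddef>

// Hypothetical constant: the handle array in the posted code has 64 entries.
constexpr std::size_t kHandleCount = 64;

// Hypothetical helper: clears a pointer-sized handle array before the first
// solver call, as the posted code does with its own loop. Taking the array
// by reference keeps its length in the type, so a wrongly sized array will
// not compile.
inline void zero_handles(void* (&pt)[kHandleCount]) {
    for (void*& entry : pt) {
        entry = nullptr;
    }
}
```

    A caller would declare void* pt[64], call zero_handles(pt), and then pass pt as the first PARDISO argument in place of the int or long int arrays shown above.
    -->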
    <item>
      <title>performance of MKL pardiso</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/performance-of-MKL-pardiso/m-p/815336#M4255</link>
      <description>Hi,&lt;DIV&gt;Sorry for the delay. It is strange that PARDISO reports 6 threads on your 4-core Xeon. Did you change the default number of threads used by MKL?&lt;/DIV&gt;&lt;DIV&gt;With best regards,&lt;/DIV&gt;&lt;DIV&gt;Alexander Kalinkin&lt;/DIV&gt;</description>
      <pubDate>Fri, 14 Oct 2011 05:39:46 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/performance-of-MKL-pardiso/m-p/815336#M4255</guid>
      <dc:creator>Alexander_K_Intel2</dc:creator>
      <dc:date>2011-10-14T05:39:46Z</dc:date>
    </item>
    <item>
      <title>performance of MKL pardiso</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/performance-of-MKL-pardiso/m-p/815337#M4256</link>
      <description>Hi Alexander,&lt;BR /&gt;&lt;BR /&gt;Thanks for the reply.&lt;BR /&gt;&lt;BR /&gt;Yes. When I first reported this in the forum I had set MKL_NUM_THREADS=4 on this 6-core machine. Later I removed that setting, which is why you see 6 processors printed by PARDISO.&lt;BR /&gt;&lt;BR /&gt;Thanks,&lt;BR /&gt;Ashish&lt;BR /&gt;</description>
      <pubDate>Sat, 15 Oct 2011 11:44:43 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/performance-of-MKL-pardiso/m-p/815337#M4256</guid>
      <dc:creator>negi__ashish</dc:creator>
      <dc:date>2011-10-15T11:44:43Z</dc:date>
    </item>
    <item>
      <title>performance of MKL pardiso</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/performance-of-MKL-pardiso/m-p/815338#M4257</link>
      <description>I see a significant reason for the slowness of the solver. You have about 3 million nonzeros in A, but roughly 90 times that number of nonzeros in the L and U factors of A. Solving N equations with Z nonzero entries may be estimated to take on the order of N*Z operations, but the fill-in is causing a drastic slowdown.&lt;BR /&gt;&lt;BR /&gt;This suggests it is worth examining whether the equations have a band structure and whether the bandwidth can be reduced by reordering.&lt;BR /&gt;&lt;BR /&gt;Some background on how the equations originated may help.</description>
      <pubDate>Sat, 15 Oct 2011 14:33:45 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/performance-of-MKL-pardiso/m-p/815338#M4257</guid>
      <dc:creator>mecej4</dc:creator>
      <dc:date>2011-10-15T14:33:45Z</dc:date>
    </item>
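    <!--
    The fill-in estimate in the post above can be checked against the numbers in the PARDISO log posted earlier in the thread. This is a minimal sketch of that arithmetic; the helper names fill_ratio and gflop_rate are hypothetical and the inputs are copied from the log, not measured here.

```cpp
#include <cstdint>

// Hypothetical helper: ratio of nonzeros in the factors to nonzeros in A.
inline double fill_ratio(std::int64_t nnz_factors, std::int64_t nnz_a) {
    return static_cast<double>(nnz_factors) / static_cast<double>(nnz_a);
}

// Hypothetical helper: sustained rate implied by a flop count and a time.
inline double gflop_rate(double gflop, double seconds) {
    return gflop / seconds;
}
```

    With the logged figures, fill_ratio(269145350, 2934254) is about 91.7 for the symmetric matrix and fill_ratio(528522479, 5495763) is about 96.2 for the Jacobian. gflop_rate(1708.338114, 51.146111) gives about 33.4, which matches the gflop/s line that PARDISO printed for the Jacobian factorization.
    -->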
    <item>
      <title>performance of MKL pardiso</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/performance-of-MKL-pardiso/m-p/815339#M4258</link>
      <description>Hi Ashish,&lt;DIV&gt;One additional question: which Ansys solver do you use?&lt;/DIV&gt;&lt;DIV&gt;With best regards,&lt;/DIV&gt;&lt;DIV&gt;Alexander Kalinkin&lt;/DIV&gt;</description>
      <pubDate>Sun, 16 Oct 2011 06:08:10 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/performance-of-MKL-pardiso/m-p/815339#M4258</guid>
      <dc:creator>Alexander_K_Intel2</dc:creator>
      <dc:date>2011-10-16T06:08:10Z</dc:date>
    </item>
    <item>
      <title>performance of MKL pardiso</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/performance-of-MKL-pardiso/m-p/815340#M4259</link>
      <description>Hi mecej4,&lt;BR /&gt;&lt;BR /&gt;Thanks for the reply.&lt;BR /&gt;&lt;BR /&gt;These equations are derived from a finite element formulation of the heat conduction equation using 4-noded TET elements. The heat conduction equations have a convection boundary condition defined on all the boundaries.&lt;BR /&gt;&lt;BR /&gt;Actually, I do use the reverse Cuthill-McKee method to reduce the bandwidth of the matrix, and all the data I shared earlier was obtained after bandwidth minimization. I have seen a reduction in computational time after minimizing the bandwidth. I am not sure it will always be beneficial to reduce the bandwidth, because it may not be useful for a sparse solver.&lt;BR /&gt;&lt;BR /&gt;Thanks,&lt;BR /&gt;Ashish</description>
      <pubDate>Mon, 17 Oct 2011 07:09:47 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/performance-of-MKL-pardiso/m-p/815340#M4259</guid>
      <dc:creator>negi__ashish</dc:creator>
      <dc:date>2011-10-17T07:09:47Z</dc:date>
    </item>
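    <!--
    For reference, the bandwidth discussed in the two posts above can be measured directly from the CSR arrays handed to PARDISO. This is a minimal sketch under stated assumptions: the function name csr_half_bandwidth is hypothetical, the arrays are taken to be 1-based (as the posted log reports), and the result is only the quantity that a reverse Cuthill-McKee ordering tries to reduce. It says nothing about the fill produced by the METIS reordering that PARDISO applies afterwards.

```cpp
#include <cstddef>
#include <cstdlib>
#include <vector>

// Hypothetical helper: half-bandwidth max|i - j| over the stored entries of
// a CSR matrix. row_ptr has n + 1 entries; row_ptr and col_idx both hold
// 1-based indices.
inline long csr_half_bandwidth(const std::vector<int>& row_ptr,
                               const std::vector<int>& col_idx) {
    long bandwidth = 0;
    const std::size_t n = row_ptr.empty() ? 0 : row_ptr.size() - 1;
    for (std::size_t row = 0; row < n; ++row) {
        // Entries of this row sit at 1-based positions
        // row_ptr[row] .. row_ptr[row + 1] - 1 of col_idx.
        for (int k = row_ptr[row]; k < row_ptr[row + 1]; ++k) {
            const long i = static_cast<long>(row) + 1;  // 1-based row index
            const long j = col_idx[k - 1];              // 1-based column index
            const long dist = std::labs(i - j);
            if (dist > bandwidth) {
                bandwidth = dist;
            }
        }
    }
    return bandwidth;
}
```

    Comparing this value before and after the bandwidth reduction step, next to the "number of nonzeros in L+U" line from msglvl=1, would show whether the smaller bandwidth actually changes the fill that the solver reports.
    -->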
    <item>
      <title>performance of MKL pardiso</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/performance-of-MKL-pardiso/m-p/815341#M4260</link>
      <description>Hi Alexander,&lt;BR /&gt;&lt;BR /&gt;I use Ansys Mechanical to solve the nonlinear equations. I chose Sparse Solver and Newton-Raphson from the Ansys option list.&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;Thanks,&lt;BR /&gt;Ashish</description>
      <pubDate>Mon, 17 Oct 2011 07:19:39 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/performance-of-MKL-pardiso/m-p/815341#M4260</guid>
      <dc:creator>negi__ashish</dc:creator>
      <dc:date>2011-10-17T07:19:39Z</dc:date>
    </item>
  </channel>
</rss>

