<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Gennady, in Intel® oneAPI Math Kernel Library</title>
    <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Pardiso-Threadripper-2990wx-versus-Ryzen-1700/m-p/1133390#M25816</link>
    <description>&lt;P&gt;Gennady,&lt;/P&gt;

&lt;P&gt;I've tried a handful of different parameters, including setting iparm[10] and iparm[12] to 0 together with iparm[23]=10. This should enable the two-level factorization algorithm, trading some reliability (scaling and weighted matching are turned off) for scalability. The factorization is a bit faster, but it is still far from scaling with the core count.&lt;/P&gt;
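
&lt;P&gt;For reference, the parameter change described above amounts to three assignments on the iparm array from the original post. A minimal sketch, assuming the same 64-entry int array and zero-based C indexing; the helper name use_improved_two_level is hypothetical and not part of MKL:&lt;/P&gt;

<![CDATA[
```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: applies the settings described in the post to an
// existing PARDISO iparm array (64 entries, C-style zero-based indices).
inline void use_improved_two_level(int *iparm, std::size_t length)
{
    assert(iparm != nullptr && length >= 64);
    iparm[10] = 0;   // scaling off
    iparm[12] = 0;   // weighted matching off
    iparm[23] = 10;  // two-level factorization variant tried in the post
}
```
]]>

&lt;P&gt;All other entries are left untouched, so the helper could be called on the array right after the constructor from the original post has filled in its defaults.&lt;/P&gt;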

&lt;P&gt;I think I will leave this alone at this point. I have to imagine scalability was considered for the cluster version of Pardiso. Is there anything from it that could be ported to the shared-memory version?&lt;/P&gt;

&lt;P&gt;Thanks for your help.&lt;/P&gt;

&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
    <pubDate>Thu, 18 Oct 2018 06:04:45 GMT</pubDate>
    <dc:creator>Makhija__David</dc:creator>
    <dc:date>2018-10-18T06:04:45Z</dc:date>
    <item>
      <title>Pardiso Threadripper 2990wx versus Ryzen 1700</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Pardiso-Threadripper-2990wx-versus-Ryzen-1700/m-p/1133387#M25813</link>
      <description>&lt;P&gt;I have the same multi-physics finite element code generating a matrix on two machines, and the old machine with a Ryzen 1700 (8 cores) is faster than the one with a Threadripper 2990wx (32 cores). Both run Windows 10, intel64, and mkl_rt.lib; the MKL versions are 2018.1.156 on the Ryzen 1700 and 2019.0.117 on the Threadripper. I can provide an example matrix if it helps. Here are the options, which are the same on both builds:&lt;/P&gt;

&lt;P&gt;&amp;nbsp;&lt;/P&gt;

&lt;DIV&gt;struct pardiso_struct&lt;/DIV&gt;

&lt;DIV&gt;{&lt;/DIV&gt;

&lt;DIV&gt;void *pt[64];&lt;/DIV&gt;

&lt;DIV&gt;int maxfct{ 1 };&lt;/DIV&gt;

&lt;DIV&gt;int mnum{ 1 };&lt;/DIV&gt;

&lt;DIV&gt;int mtype{ 11 };&lt;/DIV&gt;

&lt;DIV&gt;int n{ 0 };&lt;/DIV&gt;

&lt;DIV&gt;int idum{ 0 }; //dummy not used by PARDISO when iparm(5-1) != 1&lt;/DIV&gt;

&lt;DIV&gt;int nrhs{ 1 };&lt;/DIV&gt;

&lt;DIV&gt;int iparm[64];&lt;/DIV&gt;

&lt;DIV&gt;int msglvl{ 1 };&lt;/DIV&gt;

&lt;DIV&gt;double ddum{ 0. };&lt;/DIV&gt;

&lt;DIV&gt;int error{ 0 };&lt;/DIV&gt;

&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;

&lt;DIV&gt;pardiso_struct()&lt;/DIV&gt;

&lt;DIV&gt;{&lt;/DIV&gt;

&lt;DIV&gt;// fill(pt, pt + 64, void(0)); does not work&lt;/DIV&gt;

&lt;DIV&gt;for (int i = 0; i &amp;lt; 64; ++i)&lt;/DIV&gt;

&lt;DIV&gt;pt[i] = 0;&lt;/DIV&gt;

&lt;DIV&gt;std::fill(iparm, iparm + 64, 0);&lt;/DIV&gt;

&lt;DIV&gt;iparm[0] = 1; // 0 for all default, !=0 for any custom&lt;/DIV&gt;

&lt;DIV&gt;iparm[1] = 3; // 0 minimum degree alg, 2 metis, 3 openMP metis&lt;/DIV&gt;

&lt;DIV&gt;&amp;nbsp; //iparm[2] // reserved&lt;/DIV&gt;

&lt;DIV&gt;iparm[3] = 0; // For iterative methods&lt;/DIV&gt;

&lt;DIV&gt;iparm[4] = 0; // user fill-in reducing permutation&lt;/DIV&gt;

&lt;DIV&gt;iparm[5] = 0; // 0 - solution written on x, 1 - solution on b&lt;/DIV&gt;

&lt;DIV&gt;&amp;nbsp; //iparm[6] output of number of iterative refinement steps&lt;/DIV&gt;

&lt;DIV&gt;iparm[7] = 0; // iterative refinement steps&lt;/DIV&gt;

&lt;DIV&gt;&amp;nbsp; //iparm[8] reserved&lt;/DIV&gt;

&lt;DIV&gt;iparm[9] = 13; // pivoting, 13 for nonsymmetric, 8 for sym&lt;/DIV&gt;

&lt;DIV&gt;iparm[10] = 1; // 0 no scaling, 1 scaling (1 Default for nonsym)&lt;/DIV&gt;

&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;

&lt;DIV&gt;iparm[12] = 1; // 0 to disable weighted matching? 1 default for non-sym&lt;/DIV&gt;

&lt;DIV&gt;&amp;nbsp; &amp;nbsp;//iparm[13]-iparm[19] outputs&lt;/DIV&gt;

&lt;DIV&gt;&amp;nbsp; &amp;nbsp;//iparm[20] = special pivoting for symmetric but indefinite&lt;/DIV&gt;

&lt;DIV&gt;&amp;nbsp; &amp;nbsp;//iparm[21] output for number of pos eigs&lt;/DIV&gt;

&lt;DIV&gt;&amp;nbsp; &amp;nbsp;//iparm[22] output for number of neg eigs&lt;/DIV&gt;

&lt;DIV&gt;iparm[23] = 1; // 0 for classic alg, 1 for openMP scalable &amp;gt; 8 procs&lt;/DIV&gt;

&lt;DIV&gt;iparm[24] = 0; // 0 for parallel solve, 1 for sequential solve&lt;/DIV&gt;

&lt;DIV&gt;&amp;nbsp; &amp;nbsp;//iparm[25] // reserved&lt;/DIV&gt;

&lt;DIV&gt;iparm[26] = 0; // 0 Do not check sparse mat, 1 check sparse mat&lt;/DIV&gt;

&lt;DIV&gt;iparm[27] = 0; // 0 double precision, 1 single precision&lt;/DIV&gt;

&lt;DIV&gt;&amp;nbsp; &amp;nbsp;//iparm[28]&amp;nbsp; reserved;&lt;/DIV&gt;

&lt;DIV&gt;&amp;nbsp; &amp;nbsp;//iparm[29] output zero or neg pivots in sym&lt;/DIV&gt;

&lt;DIV&gt;&amp;nbsp; &amp;nbsp;//iparm[30] only solve for certain components...?&lt;/DIV&gt;

&lt;DIV&gt;&amp;nbsp; &amp;nbsp;//iparm[31][32] reserved&lt;/DIV&gt;

&lt;DIV&gt;&amp;nbsp; &amp;nbsp;//iparm[33] some reproducibility stuff&lt;/DIV&gt;

&lt;DIV&gt;iparm[34] = 1; //0 one based indexing, 1 zero based indexing&lt;/DIV&gt;

&lt;DIV&gt;&amp;nbsp; &amp;nbsp;//iparm[35] something with schur complements&lt;/DIV&gt;

&lt;DIV&gt;iparm[36] = 0; //0 CSR, &amp;gt;0 BSR, &amp;lt;0 convert to BSR&lt;/DIV&gt;

&lt;DIV&gt;&amp;nbsp; &amp;nbsp;//iparm[59] ooc options&lt;/DIV&gt;

&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;

&lt;DIV&gt;}&lt;/DIV&gt;

&lt;DIV&gt;};&lt;/DIV&gt;
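
&lt;P&gt;On the commented-out fill call in the constructor: void(0) is an expression of type void, not a null pointer value, which is presumably why it did not compile. Value-initializing the array, or filling it with a null void pointer, gives the same result as the loop. A minimal sketch, assuming C++11 or later; pardiso_handle and all_null are hypothetical names used only for illustration:&lt;/P&gt;

<![CDATA[
```cpp
#include <algorithm>

// Hypothetical stand-in for the pt member of pardiso_struct.
struct pardiso_handle
{
    void *pt[64] = {};  // value-initialization sets every pointer to null

    // Re-zero the handle, e.g. before reusing the struct for a new matrix.
    void reset()
    {
        std::fill(pt, pt + 64, static_cast<void *>(nullptr));
    }
};

// Returns true when every entry of the handle is null.
inline bool all_null(const pardiso_handle &h)
{
    return std::all_of(h.pt, h.pt + 64,
                       [](const void *p) { return p == nullptr; });
}
```
]]>

&lt;P&gt;Either form could replace the explicit loop in the constructor above.&lt;/P&gt;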

&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;

&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;

&lt;P&gt;&lt;SPAN style="font-size: 13.008px;"&gt;The results of reorder and factorization are here. Solve (omitted here) is slower on 2990wx but the main concern is numerical factorization time.&lt;/SPAN&gt;&amp;nbsp;&lt;/P&gt;

&lt;P&gt;*************** Ryzen 7 1700 **********************&lt;/P&gt;

&lt;P&gt;=== PARDISO: solving a real nonsymmetric system ===&lt;BR /&gt;
	0-based array is turned ON&lt;BR /&gt;
	PARDISO double precision computation is turned ON&lt;BR /&gt;
	Parallel METIS algorithm at reorder step is turned ON&lt;BR /&gt;
	Scaling is turned ON&lt;BR /&gt;
	Matching is turned ON&lt;/P&gt;

&lt;P&gt;&lt;BR /&gt;
	Summary: ( reordering phase )&lt;BR /&gt;
	================&lt;/P&gt;

&lt;P&gt;Times:&lt;BR /&gt;
	======&lt;BR /&gt;
	Time spent in calculations of symmetric matrix portrait (fulladj): 0.847928 s&lt;BR /&gt;
	Time spent in reordering of the initial matrix (reorder)&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;: 7.678907 s&lt;BR /&gt;
	Time spent in symbolic factorization (symbfct)&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;: 2.075314 s&lt;BR /&gt;
	Time spent in data preparations for factorization (parlist)&amp;nbsp; &amp;nbsp; &amp;nbsp; : 0.098494 s&lt;BR /&gt;
	Time spent in allocation of internal data structures (malloc)&amp;nbsp; &amp;nbsp; : 4.281882 s&lt;BR /&gt;
	Time spent in additional calculations&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; : 3.785140 s&lt;BR /&gt;
	Total time spent&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;: 18.767665 s&lt;/P&gt;

&lt;P&gt;Statistics:&lt;BR /&gt;
	===========&lt;BR /&gt;
	Parallel Direct Factorization is running on 8 OpenMP&lt;/P&gt;

&lt;P&gt;&amp;lt; Linear system Ax = b &amp;gt;&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;number of equations:&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;1928754&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;number of non-zeros in A:&amp;nbsp; &amp;nbsp; &amp;nbsp; 46843184&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;number of non-zeros in A (%): 0.001259&lt;/P&gt;

&lt;P&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;number of right-hand sides:&amp;nbsp; &amp;nbsp; 1&lt;/P&gt;

&lt;P&gt;&amp;lt; Factors L and U &amp;gt;&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;number of columns for each panel: 72&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;number of independent subgraphs:&amp;nbsp; 0&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;number of supernodes:&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 795666&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;size of largest supernode:&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;9159&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;number of non-zeros in L:&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 673935341&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;number of non-zeros in U:&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 631031607&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;number of non-zeros in L+U:&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 1304966948&lt;/P&gt;

&lt;P&gt;=== PARDISO: solving a real nonsymmetric system ===&lt;BR /&gt;
	Two-level factorization algorithm is turned ON&lt;/P&gt;

&lt;P&gt;&lt;BR /&gt;
	Summary: ( factorization phase )&lt;BR /&gt;
	================&lt;/P&gt;

&lt;P&gt;Times:&lt;BR /&gt;
	======&lt;BR /&gt;
	Time spent in copying matrix to internal data structure (A to LU): 0.000000 s&lt;BR /&gt;
	Time spent in factorization step (numfct)&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; : 53.846398 s&lt;BR /&gt;
	Time spent in allocation of internal data structures (malloc)&amp;nbsp; &amp;nbsp; : 0.000878 s&lt;BR /&gt;
	Time spent in additional calculations&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; : 0.000001 s&lt;BR /&gt;
	Total time spent&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;: 53.847277 s&lt;/P&gt;

&lt;P&gt;Statistics:&lt;BR /&gt;
	===========&lt;BR /&gt;
	Parallel Direct Factorization is running on 8 OpenMP&lt;/P&gt;

&lt;P&gt;&amp;lt; Linear system Ax = b &amp;gt;&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;number of equations:&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;1928754&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;number of non-zeros in A:&amp;nbsp; &amp;nbsp; &amp;nbsp; 46843184&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;number of non-zeros in A (%): 0.001259&lt;/P&gt;

&lt;P&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;number of right-hand sides:&amp;nbsp; &amp;nbsp; 1&lt;/P&gt;

&lt;P&gt;&amp;lt; Factors L and U &amp;gt;&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;number of columns for each panel: 72&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;number of independent subgraphs:&amp;nbsp; 0&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;number of supernodes:&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 795666&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;size of largest supernode:&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;9159&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;number of non-zeros in L:&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 673935341&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;number of non-zeros in U:&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 631031607&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;number of non-zeros in L+U:&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 1304966948&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;gflop&amp;nbsp; &amp;nbsp;for the numerical factorization: 2903.934836&lt;/P&gt;

&lt;P&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;gflop/s for the numerical factorization: 53.929973&lt;/P&gt;

&lt;P&gt;&amp;nbsp;&lt;/P&gt;

&lt;P&gt;****************** Threadripper 2990wx *********************************&lt;/P&gt;

&lt;P&gt;&lt;BR /&gt;
	=== PARDISO: solving a real nonsymmetric system ===&lt;BR /&gt;
	0-based array is turned ON&lt;BR /&gt;
	PARDISO double precision computation is turned ON&lt;BR /&gt;
	Parallel METIS algorithm at reorder step is turned ON&lt;BR /&gt;
	Scaling is turned ON&lt;BR /&gt;
	Matching is turned ON&lt;/P&gt;

&lt;P&gt;&lt;BR /&gt;
	Summary: ( reordering phase )&lt;BR /&gt;
	================&lt;/P&gt;

&lt;P&gt;Times:&lt;BR /&gt;
	======&lt;BR /&gt;
	Time spent in calculations of symmetric matrix portrait (fulladj): 0.919861 s&lt;BR /&gt;
	Time spent in reordering of the initial matrix (reorder)&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;: 10.085178 s&lt;BR /&gt;
	Time spent in symbolic factorization (symbfct)&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;: 2.207123 s&lt;BR /&gt;
	Time spent in data preparations for factorization (parlist)&amp;nbsp; &amp;nbsp; &amp;nbsp; : 0.101967 s&lt;BR /&gt;
	Time spent in allocation of internal data structures (malloc)&amp;nbsp; &amp;nbsp; : 3.143640 s&lt;BR /&gt;
	Time spent in additional calculations&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; : 3.677500 s&lt;BR /&gt;
	Total time spent&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;: 20.135269 s&lt;/P&gt;

&lt;P&gt;Statistics:&lt;BR /&gt;
	===========&lt;BR /&gt;
	Parallel Direct Factorization is running on 32 OpenMP&lt;/P&gt;

&lt;P&gt;&amp;lt; Linear system Ax = b &amp;gt;&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;number of equations:&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;1928754&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;number of non-zeros in A:&amp;nbsp; &amp;nbsp; &amp;nbsp; 46843184&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;number of non-zeros in A (%): 0.001259&lt;/P&gt;

&lt;P&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;number of right-hand sides:&amp;nbsp; &amp;nbsp; 1&lt;/P&gt;

&lt;P&gt;&amp;lt; Factors L and U &amp;gt;&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;number of columns for each panel: 72&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;number of independent subgraphs:&amp;nbsp; 0&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;number of supernodes:&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 794723&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;size of largest supernode:&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;7005&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;number of non-zeros in L:&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 683894639&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;number of non-zeros in U:&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 640539323&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;number of non-zeros in L+U:&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 1324433962&lt;/P&gt;

&lt;P&gt;&lt;BR /&gt;
	=== PARDISO: solving a real nonsymmetric system ===&lt;BR /&gt;
	Two-level factorization algorithm is turned ON&lt;/P&gt;

&lt;P&gt;&lt;BR /&gt;
	Summary: ( factorization phase )&lt;BR /&gt;
	================&lt;/P&gt;

&lt;P&gt;Times:&lt;BR /&gt;
	======&lt;BR /&gt;
	Time spent in copying matrix to internal data structure (A to LU): 0.000000 s&lt;BR /&gt;
	Time spent in factorization step (numfct)&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; : 61.520888 s&lt;BR /&gt;
	Time spent in allocation of internal data structures (malloc)&amp;nbsp; &amp;nbsp; : 0.001112 s&lt;BR /&gt;
	Time spent in additional calculations&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; : 0.000002 s&lt;BR /&gt;
	Total time spent&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;: 61.522003 s&lt;/P&gt;

&lt;P&gt;Statistics:&lt;BR /&gt;
	===========&lt;BR /&gt;
	Parallel Direct Factorization is running on 32 OpenMP&lt;/P&gt;

&lt;P&gt;&amp;lt; Linear system Ax = b &amp;gt;&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;number of equations:&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;1928754&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;number of non-zeros in A:&amp;nbsp; &amp;nbsp; &amp;nbsp; 46843184&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;number of non-zeros in A (%): 0.001259&lt;/P&gt;

&lt;P&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;number of right-hand sides:&amp;nbsp; &amp;nbsp; 1&lt;/P&gt;

&lt;P&gt;&amp;lt; Factors L and U &amp;gt;&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;number of columns for each panel: 72&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;number of independent subgraphs:&amp;nbsp; 0&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;number of supernodes:&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 794723&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;size of largest supernode:&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;7005&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;number of non-zeros in L:&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 683894639&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;number of non-zeros in U:&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 640539323&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;number of non-zeros in L+U:&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 1324433962&lt;BR /&gt;
	&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;gflop&amp;nbsp; &amp;nbsp;for the numerical factorization: 2879.931235&lt;/P&gt;

&lt;P&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;gflop/s for the numerical factorization: 46.812250&lt;/P&gt;

&lt;P&gt;&amp;nbsp;&lt;/P&gt;

&lt;P&gt;Nearly 2 million unknowns should provide enough work for each core. Manually capping the run at 16 threads gives a modest speedup (53 seconds for numerical factorization), which suggests to me that this is a Pardiso scaling issue rather than a hardware issue, although it may also be due to the memory architecture of the 2990wx.&lt;/P&gt;
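
&lt;P&gt;The gflop/s figures in the two logs are the reported gflop count divided by the numfct time; dividing once more by the thread count makes the per-thread gap explicit. A small sketch of that arithmetic, with hypothetical function names and the inputs taken from the logs above:&lt;/P&gt;

<![CDATA[
```cpp
#include <cassert>

// Sustained rate in gflop/s: total work divided by wall-clock seconds.
inline double gflops_per_second(double gflop, double seconds)
{
    assert(seconds > 0.0);
    return gflop / seconds;
}

// Average rate per OpenMP thread, in gflop/s.
inline double gflops_per_thread(double gflop, double seconds, int threads)
{
    assert(threads > 0);
    return gflops_per_second(gflop, seconds) / threads;
}
```
]]>

&lt;P&gt;With the Ryzen 1700 numbers (2903.934836 gflop, 53.846398 s, 8 threads) this gives roughly 53.9 gflop/s, or about 6.7 gflop/s per thread; with the 2990wx numbers (2879.931235 gflop, 61.520888 s, 32 threads) it gives roughly 46.8 gflop/s, or about 1.5 gflop/s per thread.&lt;/P&gt;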

&lt;P&gt;Any suggestions?&lt;/P&gt;

&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Sat, 22 Sep 2018 08:11:03 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Pardiso-Threadripper-2990wx-versus-Ryzen-1700/m-p/1133387#M25813</guid>
      <dc:creator>Makhija__David</dc:creator>
      <dc:date>2018-09-22T08:11:03Z</dc:date>
    </item>
    <item>
      <title>This could be some problem</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Pardiso-Threadripper-2990wx-versus-Ryzen-1700/m-p/1133388#M25814</link>
      <description>&lt;DIV&gt;summarizing:&amp;nbsp;&lt;/DIV&gt;

&lt;DIV&gt;&lt;SPAN style="font-size: 13.008px;"&gt;number of equations:&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;1928754&lt;/SPAN&gt;&lt;/DIV&gt;

&lt;DIV&gt;&lt;SPAN style="font-size: 13.008px;"&gt;number of non-zeros in A:&amp;nbsp; &amp;nbsp; &amp;nbsp; 46843184&lt;/SPAN&gt;&lt;/DIV&gt;

&lt;DIV&gt;Ryzen: 8 threads,&amp;nbsp; &amp;nbsp;Total time : 53.9 sec&lt;/DIV&gt;

&lt;DIV&gt;&lt;SPAN style="font-size: 13.008px;"&gt;Threadripper 2990wx:&amp;nbsp; &amp;nbsp; 32 threads, Total time : 61.5 sec&lt;/SPAN&gt;&lt;/DIV&gt;

&lt;P&gt;&amp;nbsp;&lt;/P&gt;

&lt;P&gt;Is that correct?&lt;/P&gt;

&lt;P&gt;This could be a problem within MKL Pardiso.&lt;/P&gt;

&lt;P&gt;Could you take a BLAS function (dgemm, for example), run it on both of these systems with MKL_VERBOSE=1 set, and share the output?&lt;/P&gt;

&lt;P&gt;The output will show which MKL code branch has been called.&lt;/P&gt;

&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Fri, 28 Sep 2018 09:45:00 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Pardiso-Threadripper-2990wx-versus-Ryzen-1700/m-p/1133388#M25814</guid>
      <dc:creator>Gennady_F_Intel</dc:creator>
      <dc:date>2018-09-28T09:45:00Z</dc:date>
    </item>
    <item>
      <title>Gennady,</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Pardiso-Threadripper-2990wx-versus-Ryzen-1700/m-p/1133389#M25815</link>
      <description>&lt;P&gt;Gennady,&lt;/P&gt;

&lt;P&gt;&lt;SPAN style="font-size: 13.008px;"&gt;Is this what you were looking for? I added some timing/scaling tests as well.&amp;nbsp;&lt;/SPAN&gt;The matrix might be a bit different than the one in the original post.&amp;nbsp;&lt;/P&gt;

&lt;P&gt;&amp;nbsp;&lt;/P&gt;

&lt;P&gt;********* Ryzen 7 (8 core) ******************&lt;/P&gt;

&lt;DIV style="color: rgb(34, 34, 34); font-family: Arial, Helvetica, sans-serif; font-size: small;"&gt;MKL_VERBOSE Intel(R) MKL 2019.0 Product build 20180829 for Intel(R) 64 architecture Intel(R) Architecture processors, Win 3.28GHz cdecl intel_thread&lt;/DIV&gt;

&lt;DIV style="color: rgb(34, 34, 34); font-family: Arial, Helvetica, sans-serif; font-size: small;"&gt;MKL_VERBOSE DGEMM(N,N,1000,2000,200,0000001B6551F8C0,00000149531FA080,1000,0000014952ED6080,200,0000001B6551F8E8,000001495339D080,1000) 105.55ms CNR:OFF Dyn:1 FastMM:1 TID:0&amp;nbsp; NThr:8&lt;/DIV&gt;

&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;

&lt;DIV&gt;&lt;SPAN class="im" style="color: rgb(80, 0, 80); font-family: Arial, Helvetica, sans-serif; font-size: small;"&gt;Number of equations is 1930194&lt;/SPAN&gt;&lt;/DIV&gt;

&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;

&lt;DIV&gt;&lt;SPAN class="im" style="color: rgb(80, 0, 80); font-family: Arial, Helvetica, sans-serif; font-size: small;"&gt;Scaling for factorization&lt;/SPAN&gt;&lt;/DIV&gt;

&lt;DIV&gt;&lt;SPAN class="im" style="color: rgb(80, 0, 80); font-family: Arial, Helvetica, sans-serif; font-size: small;"&gt;Core Count&amp;nbsp; &amp;nbsp; Time&amp;nbsp; &amp;nbsp; Theoretical&amp;nbsp; &amp;nbsp;Observed&lt;/SPAN&gt;&lt;/DIV&gt;

&lt;DIV style="color: rgb(34, 34, 34); font-family: Arial, Helvetica, sans-serif; font-size: small;"&gt;&amp;nbsp; &amp;nbsp; 2&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 179.462&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;2&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;2.16092&lt;/DIV&gt;

&lt;DIV style="color: rgb(34, 34, 34); font-family: Arial, Helvetica, sans-serif; font-size: small;"&gt;&amp;nbsp; &amp;nbsp; 4&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 101.963&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;4&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;3.80336&lt;/DIV&gt;

&lt;DIV style="color: rgb(34, 34, 34); font-family: Arial, Helvetica, sans-serif; font-size: small;"&gt;&amp;nbsp; &amp;nbsp; 6&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 82.1742&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;6&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;4.71928&lt;/DIV&gt;

&lt;DIV style="color: rgb(34, 34, 34); font-family: Arial, Helvetica, sans-serif; font-size: small;"&gt;&amp;nbsp; &amp;nbsp; 8&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 73.6192&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;8&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;5.26769&lt;/DIV&gt;
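
&lt;P&gt;One way to read the table above is as parallel efficiency, i.e. the Observed column divided by the Theoretical (thread count) column. The sketch below also back-computes the baseline time a row implies, assuming Observed was obtained as baseline time divided by the row's time; that assumption and the names scaling_row, efficiency and implied_baseline_seconds are illustrative, not taken from the test code:&lt;/P&gt;

<![CDATA[
```cpp
#include <cassert>

// One row of the scaling table: thread count, factorization time in
// seconds, and the observed speedup listed next to it.
struct scaling_row
{
    int threads;
    double seconds;
    double observed_speedup;
};

// Parallel efficiency: observed speedup over the ideal (linear) speedup.
inline double efficiency(const scaling_row &row)
{
    assert(row.threads > 0);
    return row.observed_speedup / row.threads;
}

// Baseline time implied by a row, assuming speedup = baseline / seconds.
inline double implied_baseline_seconds(const scaling_row &row)
{
    return row.seconds * row.observed_speedup;
}
```
]]>

&lt;P&gt;For the 8-thread Ryzen row (73.6192 s, observed 5.26769) this gives an efficiency of about 0.66 and an implied baseline of roughly 388 s.&lt;/P&gt;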

&lt;DIV style="color: rgb(34, 34, 34); font-family: Arial, Helvetica, sans-serif; font-size: small;"&gt;&amp;nbsp;&lt;/DIV&gt;

&lt;DIV style="color: rgb(34, 34, 34); font-family: Arial, Helvetica, sans-serif; font-size: small;"&gt;&amp;nbsp;&lt;/DIV&gt;

&lt;DIV style="color: rgb(34, 34, 34); font-family: Arial, Helvetica, sans-serif; font-size: small;"&gt;&amp;nbsp;&lt;/DIV&gt;

&lt;P&gt;************ 2990wx (32 core) ****************&lt;/P&gt;

&lt;DIV style="color: rgb(34, 34, 34); font-family: Arial, Helvetica, sans-serif; font-size: small;"&gt;MKL_VERBOSE Intel(R) MKL 2019.0 Product build 20180829 for Intel(R) 64 architecture Intel(R) Architecture processors, Win 3.68GHz cdecl intel_thread&lt;/DIV&gt;

&lt;DIV style="color: rgb(34, 34, 34); font-family: Arial, Helvetica, sans-serif; font-size: small;"&gt;MKL_VERBOSE DGEMM(N,N,1000,2000,200,0000008F20D7F6B0,00000270E5FD2080,1000,00000270E5CBC080,200,0000008F20D7F6D8,00000270E616E080,1000) 116.24ms CNR:OFF Dyn:1 FastMM:1 TID:0&amp;nbsp; NThr:32&lt;/DIV&gt;

&lt;DIV style="color: rgb(34, 34, 34); font-family: Arial, Helvetica, sans-serif; font-size: small;"&gt;&amp;nbsp;&lt;/DIV&gt;

&lt;DIV style="color: rgb(34, 34, 34); font-family: Arial, Helvetica, sans-serif; font-size: small;"&gt;Number of equations is 1930194&lt;/DIV&gt;

&lt;DIV style="color: rgb(34, 34, 34); font-family: Arial, Helvetica, sans-serif; font-size: small;"&gt;&amp;nbsp;&lt;/DIV&gt;

&lt;DIV style="color: rgb(34, 34, 34); font-family: Arial, Helvetica, sans-serif; font-size: small;"&gt;Scaling for factorization&lt;/DIV&gt;

&lt;DIV style="color: rgb(34, 34, 34); font-family: Arial, Helvetica, sans-serif; font-size: small;"&gt;Core Count&amp;nbsp; &amp;nbsp; Time&amp;nbsp; &amp;nbsp; Theoretical&amp;nbsp; &amp;nbsp;Observed&lt;/DIV&gt;

&lt;DIV style="color: rgb(34, 34, 34); font-family: Arial, Helvetica, sans-serif; font-size: small;"&gt;&amp;nbsp; &amp;nbsp; 2&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 179.644&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;2&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;2.1443&lt;/DIV&gt;

&lt;DIV style="color: rgb(34, 34, 34); font-family: Arial, Helvetica, sans-serif; font-size: small;"&gt;&amp;nbsp; &amp;nbsp; 4&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 103.245&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;4&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;3.73103&lt;/DIV&gt;

&lt;DIV style="color: rgb(34, 34, 34); font-family: Arial, Helvetica, sans-serif; font-size: small;"&gt;&amp;nbsp; &amp;nbsp; 6&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 76.1586&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;6&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;5.05802&lt;/DIV&gt;

&lt;DIV style="color: rgb(34, 34, 34); font-family: Arial, Helvetica, sans-serif; font-size: small;"&gt;&amp;nbsp; &amp;nbsp; 8&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 63.5531&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;8&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;6.06125&lt;/DIV&gt;

&lt;DIV style="color: rgb(34, 34, 34); font-family: Arial, Helvetica, sans-serif; font-size: small;"&gt;&amp;nbsp; &amp;nbsp; 10&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 57.3066&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;10&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;6.72193&lt;/DIV&gt;

&lt;DIV style="color: rgb(34, 34, 34); font-family: Arial, Helvetica, sans-serif; font-size: small;"&gt;&amp;nbsp; &amp;nbsp; 12&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 57.6335&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;12&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;6.68381&lt;/DIV&gt;

&lt;DIV style="color: rgb(34, 34, 34); font-family: Arial, Helvetica, sans-serif; font-size: small;"&gt;&amp;nbsp; &amp;nbsp; 14&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 57.4599&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;14&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;6.704&lt;/DIV&gt;

&lt;DIV style="color: rgb(34, 34, 34); font-family: Arial, Helvetica, sans-serif; font-size: small;"&gt;&amp;nbsp; &amp;nbsp; 16&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 58.8203&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;16&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;6.54895&lt;/DIV&gt;

&lt;DIV style="color: rgb(34, 34, 34); font-family: Arial, Helvetica, sans-serif; font-size: small;"&gt;&amp;nbsp; &amp;nbsp; 18&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 62.5855&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;18&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;6.15496&lt;/DIV&gt;

&lt;DIV style="color: rgb(34, 34, 34); font-family: Arial, Helvetica, sans-serif; font-size: small;"&gt;&amp;nbsp; &amp;nbsp; 20&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 63.6602&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;20&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;6.05106&lt;/DIV&gt;

&lt;DIV style="color: rgb(34, 34, 34); font-family: Arial, Helvetica, sans-serif; font-size: small;"&gt;&amp;nbsp; &amp;nbsp; 22&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 64.9800&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;22&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;5.92815&lt;/DIV&gt;

&lt;DIV style="color: rgb(34, 34, 34); font-family: Arial, Helvetica, sans-serif; font-size: small;"&gt;&amp;nbsp; &amp;nbsp; 24&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 67.3924&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;24&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;5.71594&lt;/DIV&gt;

&lt;DIV style="color: rgb(34, 34, 34); font-family: Arial, Helvetica, sans-serif; font-size: small;"&gt;&amp;nbsp; &amp;nbsp; 26&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 68.8016&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;26&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;5.59887&lt;/DIV&gt;

&lt;DIV style="color: rgb(34, 34, 34); font-family: Arial, Helvetica, sans-serif; font-size: small;"&gt;&amp;nbsp; &amp;nbsp; 28&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 68.8782&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;28&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;5.59264&lt;/DIV&gt;

&lt;DIV style="color: rgb(34, 34, 34); font-family: Arial, Helvetica, sans-serif; font-size: small;"&gt;&amp;nbsp; &amp;nbsp; 30&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 65.4657&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;30&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;5.88417&lt;/DIV&gt;

&lt;DIV style="color: rgb(34, 34, 34); font-family: Arial, Helvetica, sans-serif; font-size: small;"&gt;&amp;nbsp; &amp;nbsp; 32&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; 70.5859&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;32&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp;5.45734&lt;/DIV&gt;

</description>
      <pubDate>Sun, 30 Sep 2018 00:04:18 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Pardiso-Threadripper-2990wx-versus-Ryzen-1700/m-p/1133389#M25815</guid>
      <dc:creator>Makhija__David</dc:creator>
      <dc:date>2018-09-30T00:04:18Z</dc:date>
    </item>
    <item>
      <title>Gennady,</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Pardiso-Threadripper-2990wx-versus-Ryzen-1700/m-p/1133390#M25816</link>
      <description>&lt;P&gt;Gennady,&lt;/P&gt;

&lt;P&gt;I've tried a handful of different parameters, including setting iparm[10] and iparm[12] to 0 together with iparm[23]=10. This should enable the two-level factorization algorithm, which is meant to scale better, at the cost of some reliability because scaling and weighted matching are turned off. The factorization is a bit faster, but it is still far from scalable.&lt;/P&gt;
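
&lt;P&gt;For reference, a minimal C sketch of the three settings mentioned above. The helper name set_two_level_iparm is hypothetical and not part of MKL, and plain int is an assumption that matches MKL_INT only with the 32-bit integer (LP64) interface. It only overrides the three entries and leaves the rest of an already initialized iparm array alone.&lt;/P&gt;

```c
/* Hypothetical helper, not an MKL function: applies the three settings
 * discussed in this post to an iparm array that the caller has already
 * filled with its other options. Indices are zero-based, as in C.
 * Assumption: MKL_INT is int (LP64 interface). */
void set_two_level_iparm(int *iparm)
{
    iparm[10] = 0;   /* scaling disabled */
    iparm[12] = 0;   /* weighted matching disabled */
    iparm[23] = 10;  /* request the two-level factorization variant tried here */
}
```

&lt;P&gt;The array would then be passed to pardiso() exactly as before. Only the factorization phase is expected to change, and accuracy may need rechecking on matrices that relied on scaling and matching.&lt;/P&gt;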

&lt;P&gt;I think I will leave this alone for now. I imagine scalability was a design consideration for the cluster version of PARDISO. Is there anything from it that could be ported to the shared-memory version?&lt;/P&gt;

&lt;P&gt;Thanks for your help.&lt;/P&gt;

</description>
      <pubDate>Thu, 18 Oct 2018 06:04:45 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Pardiso-Threadripper-2990wx-versus-Ryzen-1700/m-p/1133390#M25816</guid>
      <dc:creator>Makhija__David</dc:creator>
      <dc:date>2018-10-18T06:04:45Z</dc:date>
    </item>
  </channel>
</rss>

