<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Hi Karel,  in Intel® oneAPI Math Kernel Library</title>
    <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Pardiso-24-cores-poor-scaling/m-p/1073535#M22409</link>
    <description>&lt;P&gt;Hi Karel,&amp;nbsp;&lt;/P&gt;

&lt;P&gt;Thanks for the reply. As I understand it, "24 threads is only 10% faster than with 12 threads on one processor" is not the expected behavior by default, but scaling depends on the problem size, sparsity, CPU memory size, etc., so we need a test case to verify.&lt;/P&gt;

&lt;P&gt;If you set msglvl=1, you will see the solver's messages, like the output attached in &lt;A href="https://software.intel.com/en-us/forums/intel-math-kernel-library/topic/601183" target="_blank"&gt;https://software.intel.com/en-us/forums/intel-math-kernel-library/topic/601183&lt;/A&gt;. Or is there any way to provide us with one input (written to a text file), so we can run a standalone test on our side?&lt;/P&gt;

&lt;P&gt;We released MKL 11.3.2 this week. There was a performance issue in MKL 11.3 and 11.3.1; please see &lt;A href="https://software.intel.com/en-us/articles/intel-mkl-113-bug-fixes-list" target="_blank"&gt;https://software.intel.com/en-us/articles/intel-mkl-113-bug-fixes-list&lt;/A&gt;. Would you please try the new version and show the performance comparison?&lt;/P&gt;

&lt;P&gt;You can get MKL 11.3.2 from the Intel Registration Center, &lt;A href="https://registrationcenter.intel.com/en/" target="_blank"&gt;https://registrationcenter.intel.com/en/&lt;/A&gt;&lt;/P&gt;

&lt;P&gt;Best Regards,&lt;/P&gt;

&lt;P&gt;Ying&lt;/P&gt;</description>
    <pubDate>Mon, 22 Feb 2016 02:05:54 GMT</pubDate>
    <dc:creator>Ying_H_Intel</dc:creator>
    <dc:date>2016-02-22T02:05:54Z</dc:date>
    <item>
      <title>Pardiso 24 cores poor scaling</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Pardiso-24-cores-poor-scaling/m-p/1073532#M22406</link>
      <description>&lt;P&gt;I am using Pardiso on a machine with two twelve-core processors. HT is disabled in the BIOS. Up to 12 threads the CPU time scales reasonably, but 24 threads is only 10% faster than with 12 threads on one processor. Is that expected? I am using iparm(24)=1 and the environment variables MKL_DYNAMIC=FALSE and MKL_NUM_THREADS=24 (nothing changes even when these variables are not defined). Pardiso is called from AceFEM. For now I can only change the parameters, but I can contact the author if needed. Where could the mistake be? It seems as if only one processor is used, although Task Manager shows that all cores are fully loaded.&lt;/P&gt;

&lt;P&gt;Thanks a lot! Karel&lt;/P&gt;</description>
      <pubDate>Mon, 15 Feb 2016 20:56:06 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Pardiso-24-cores-poor-scaling/m-p/1073532#M22406</guid>
      <dc:creator>Karel_T_</dc:creator>
      <dc:date>2016-02-15T20:56:06Z</dc:date>
    </item>
    <item>
      <title>Hi Karel,</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Pardiso-24-cores-poor-scaling/m-p/1073533#M22407</link>
      <description>&lt;P&gt;Hi Karel,&lt;/P&gt;

&lt;P style="font-size: 13.008px; line-height: 19.512px;"&gt;1) Could you please give us some information: which MKL version are you using? Windows, Linux, or another OS? Intel 64 or 32-bit? C or Fortran?&lt;/P&gt;

&lt;P style="font-size: 13.008px; line-height: 19.512px;"&gt;According to the MKL manual, iparm[23] (iparm(24) with one-based indexing) is described as follows:&lt;/P&gt;

&lt;P style="font-size: 13.008px; line-height: 19.512px;"&gt;input&lt;BR /&gt;
	Parallel factorization control.&lt;BR /&gt;
	NOTE&lt;BR /&gt;
	&lt;SPAN style="font-weight: 700;"&gt;The two-level factorization algorithm does not improve performance in OOC mode&lt;/SPAN&gt;.&lt;BR /&gt;
	0* Intel MKL PARDISO uses the classic algorithm for factorization.&lt;BR /&gt;
	1 Intel MKL PARDISO uses a two-level factorization algorithm. This algorithm&lt;BR /&gt;
	generally improves scalability in case of parallel factorization on many OpenMP threads.&lt;/P&gt;

&lt;P style="font-size: 13.008px; line-height: 19.512px;"&gt;2) How large is the input sparse matrix? Could you set msglvl=1 and show the output?&lt;/P&gt;

&lt;P style="font-size: 13.008px; line-height: 19.512px;"&gt;MKL provides some PARDISO benchmarks at &lt;A href="https://software.intel.com/en-us/intel-mkl/benchmarks#Parallell" target="_blank"&gt;https://software.intel.com/en-us/intel-mkl/benchmarks#Parallell&lt;/A&gt;.&lt;/P&gt;

&lt;P&gt;3) Additionally, do you have the sparse matrix stored in a file? If so, you can try the pardiso.f example directly. It is under the MKL install directory, e.g.&lt;/P&gt;

&lt;P&gt;C:\Program Files (x86)\IntelSWTools\compilers_and_libraries_2016.1.146\windows\mkl\examples&lt;/P&gt;

&lt;P&gt;Best Regards,&lt;/P&gt;

&lt;P&gt;Ying&lt;/P&gt;</description>
      <pubDate>Tue, 16 Feb 2016 02:11:25 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Pardiso-24-cores-poor-scaling/m-p/1073533#M22407</guid>
      <dc:creator>Ying_H_Intel</dc:creator>
      <dc:date>2016-02-16T02:11:25Z</dc:date>
    </item>
    <item>
      <title>Hi, thanks for reply.</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Pardiso-24-cores-poor-scaling/m-p/1073534#M22408</link>
      <description>&lt;P&gt;Hi, thanks for the reply.&lt;/P&gt;

&lt;P&gt;1a) MKL 11.3, Win 8.1, 64bit, C.&lt;/P&gt;

&lt;P&gt;1b) I found this parameter yesterday; it improves the CPU time by 10%. Everything is stored in RAM.&lt;/P&gt;

&lt;P&gt;2) The matrix comes from FEM; it is nonsymmetric and not positive definite (matrix type 11). For 2D problems the connectivity is not very large and we solve problem sizes between 1 million and 8 million; in 3D (higher connectivity, less sparse than in 2D) we solve problem sizes between 200k and 1 million. In all cases the solution on both CPUs is less than 10% faster than on one CPU. As an example: the matrix size is 4 327 200 and the number of non-zero entries is 116 704 670. At the moment I cannot set msglvl=1; what information would it give me?&lt;/P&gt;

&lt;P&gt;3) The matrix is built in RAM. I will try to run the benchmark.&lt;/P&gt;

&lt;P&gt;The main reason I am looking into this is that the computer with 2x Xeon E5-2680 v3 was much more expensive than the computer with an overclocked 5960X, and the difference in CPU time between them is only around 20%. So I would like to know whether there is any reason to buy such a computer next time. Btw, is ECC RAM a big advantage?&lt;/P&gt;

&lt;P&gt;Thanks, Karel.&lt;/P&gt;</description>
      <pubDate>Tue, 16 Feb 2016 09:16:04 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Pardiso-24-cores-poor-scaling/m-p/1073534#M22408</guid>
      <dc:creator>Karel_T_</dc:creator>
      <dc:date>2016-02-16T09:16:04Z</dc:date>
    </item>
    <item>
      <title>Hi Karel, </title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Pardiso-24-cores-poor-scaling/m-p/1073535#M22409</link>
      <description>&lt;P&gt;Hi Karel,&amp;nbsp;&lt;/P&gt;

&lt;P&gt;Thanks for the reply. As I understand it, "24 threads is only 10% faster than with 12 threads on one processor" is not the expected behavior by default, but scaling depends on the problem size, sparsity, CPU memory size, etc., so we need a test case to verify.&lt;/P&gt;

&lt;P&gt;If you set msglvl=1, you will see the solver's messages, like the output attached in &lt;A href="https://software.intel.com/en-us/forums/intel-math-kernel-library/topic/601183" target="_blank"&gt;https://software.intel.com/en-us/forums/intel-math-kernel-library/topic/601183&lt;/A&gt;. Or is there any way to provide us with one input (written to a text file), so we can run a standalone test on our side?&lt;/P&gt;

&lt;P&gt;We released MKL 11.3.2 this week. There was a performance issue in MKL 11.3 and 11.3.1; please see &lt;A href="https://software.intel.com/en-us/articles/intel-mkl-113-bug-fixes-list" target="_blank"&gt;https://software.intel.com/en-us/articles/intel-mkl-113-bug-fixes-list&lt;/A&gt;. Would you please try the new version and show the performance comparison?&lt;/P&gt;

&lt;P&gt;You can get MKL 11.3.2 from the Intel Registration Center, &lt;A href="https://registrationcenter.intel.com/en/" target="_blank"&gt;https://registrationcenter.intel.com/en/&lt;/A&gt;&lt;/P&gt;

&lt;P&gt;Best Regards,&lt;/P&gt;

&lt;P&gt;Ying&lt;/P&gt;</description>
      <pubDate>Mon, 22 Feb 2016 02:05:54 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Pardiso-24-cores-poor-scaling/m-p/1073535#M22409</guid>
      <dc:creator>Ying_H_Intel</dc:creator>
      <dc:date>2016-02-22T02:05:54Z</dc:date>
    </item>
    <item>
      <title>Did you investigate whether</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Pardiso-24-cores-poor-scaling/m-p/1073536#M22410</link>
      <description>&lt;P&gt;Did you investigate whether setting affinity, e.g. via OMP_PROC_BIND or OMP_PLACES, improves dual-CPU performance?&lt;/P&gt;</description>
      <pubDate>Mon, 22 Feb 2016 03:38:38 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Pardiso-24-cores-poor-scaling/m-p/1073536#M22410</guid>
      <dc:creator>TimP</dc:creator>
      <dc:date>2016-02-22T03:38:38Z</dc:date>
    </item>
  </channel>
</rss>

