<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic How to set up the environment to let PARDISO use all CPU resources? in Intel® oneAPI Math Kernel Library</title>
    <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/How-to-setup-environment-to-let-PARDISO-use-all-CPU-resource/m-p/961083#M15910</link>
    <description>&lt;P&gt;Dear all,&lt;/P&gt;
&lt;P&gt;I am trying to use PARDISO on machines with two X5675 CPUs. I have enabled Hyper-Threading in the BIOS, so Windows Task Manager shows 24 logical cores. I run PARDISO with the following parallel parameter settings:&lt;/P&gt;
&lt;P&gt;&amp;nbsp; iparm(1) = 1 ! no solver default&lt;BR /&gt;&amp;nbsp; !iparm(2) = 2 ! fill-in reordering from METIS&lt;BR /&gt;&amp;nbsp; iparm(2) = 3 ! parallel (OpenMP) version of the nested dissection algorithm&lt;BR /&gt;&amp;nbsp; iparm(4) = 0 ! no iterative-direct algorithm&lt;BR /&gt;&amp;nbsp; iparm(5) = 0 ! no user fill-in reducing permutation&lt;BR /&gt;&amp;nbsp; iparm(6) = 0 ! =0 solution on the first n components of x&lt;BR /&gt;&amp;nbsp; iparm(8) = 9 ! number of iterative refinement steps&lt;BR /&gt;&amp;nbsp; iparm(10) = 13 ! perturb the pivot elements with 1E-13&lt;BR /&gt;&amp;nbsp; iparm(11) = 1 ! use nonsymmetric permutation and scaling MPS&lt;BR /&gt;&amp;nbsp; iparm(13) = 0 ! maximum weighted matching algorithm is switched off (default for symmetric). Try iparm(13) = 1 in case of inappropriate accuracy&lt;BR /&gt;&amp;nbsp; iparm(14) = 0 ! Output: number of perturbed pivots&lt;BR /&gt;&amp;nbsp; iparm(18) = -1 ! Output: number of nonzeros in the factor LU&lt;BR /&gt;&amp;nbsp; iparm(19) = -1 ! Output: Mflops for LU factorization&lt;BR /&gt;&amp;nbsp; iparm(20) = 0 ! Output: number of CG iterations&lt;BR /&gt;&amp;nbsp; iparm(21) = 1 ! apply 1x1 and 2x2 Bunch-Kaufman pivoting during the factorization process&lt;BR /&gt;&amp;nbsp; iparm(24) = 1 ! PARDISO uses the new two-level factorization algorithm&lt;/P&gt;
&lt;P&gt;In addition, I have set the following environment variables:&lt;/P&gt;
&lt;P&gt;MKL_NUM_THREADS=24&lt;BR /&gt;MKL_DOMAIN_ALL=24&lt;BR /&gt;OMP_NUM_THREADS=24&lt;/P&gt;
&lt;P&gt;However, when I run the program as a single process, its CPU usage is 50% during the solve. When I run two processes, each process uses 25% during the solve, and so on. In every case the total CPU usage is only 50%.&lt;/P&gt;
&lt;P&gt;What have I missed?&lt;/P&gt;
&lt;P&gt;Thanks,&lt;/P&gt;
&lt;P&gt;Zhanghong Tang&lt;/P&gt;</description>
    <pubDate>Mon, 04 Mar 2013 23:54:47 GMT</pubDate>
    <dc:creator>Zhanghong_T_</dc:creator>
    <dc:date>2013-03-04T23:54:47Z</dc:date>
    <item>
      <title>How to set up the environment to let PARDISO use all CPU resources?</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/How-to-setup-environment-to-let-PARDISO-use-all-CPU-resource/m-p/961083#M15910</link>
      <description>&lt;P&gt;Dear all,&lt;/P&gt;
&lt;P&gt;I am trying to use PARDISO on machines with two X5675 CPUs. I have enabled Hyper-Threading in the BIOS, so Windows Task Manager shows 24 logical cores. I run PARDISO with the following parallel parameter settings:&lt;/P&gt;
&lt;P&gt;&amp;nbsp; iparm(1) = 1 ! no solver default&lt;BR /&gt;&amp;nbsp; !iparm(2) = 2 ! fill-in reordering from METIS&lt;BR /&gt;&amp;nbsp; iparm(2) = 3 ! parallel (OpenMP) version of the nested dissection algorithm&lt;BR /&gt;&amp;nbsp; iparm(4) = 0 ! no iterative-direct algorithm&lt;BR /&gt;&amp;nbsp; iparm(5) = 0 ! no user fill-in reducing permutation&lt;BR /&gt;&amp;nbsp; iparm(6) = 0 ! =0 solution on the first n components of x&lt;BR /&gt;&amp;nbsp; iparm(8) = 9 ! number of iterative refinement steps&lt;BR /&gt;&amp;nbsp; iparm(10) = 13 ! perturb the pivot elements with 1E-13&lt;BR /&gt;&amp;nbsp; iparm(11) = 1 ! use nonsymmetric permutation and scaling MPS&lt;BR /&gt;&amp;nbsp; iparm(13) = 0 ! maximum weighted matching algorithm is switched off (default for symmetric). Try iparm(13) = 1 in case of inappropriate accuracy&lt;BR /&gt;&amp;nbsp; iparm(14) = 0 ! Output: number of perturbed pivots&lt;BR /&gt;&amp;nbsp; iparm(18) = -1 ! Output: number of nonzeros in the factor LU&lt;BR /&gt;&amp;nbsp; iparm(19) = -1 ! Output: Mflops for LU factorization&lt;BR /&gt;&amp;nbsp; iparm(20) = 0 ! Output: number of CG iterations&lt;BR /&gt;&amp;nbsp; iparm(21) = 1 ! apply 1x1 and 2x2 Bunch-Kaufman pivoting during the factorization process&lt;BR /&gt;&amp;nbsp; iparm(24) = 1 ! PARDISO uses the new two-level factorization algorithm&lt;/P&gt;
&lt;P&gt;In addition, I have set the following environment variables:&lt;/P&gt;
&lt;P&gt;MKL_NUM_THREADS=24&lt;BR /&gt;MKL_DOMAIN_ALL=24&lt;BR /&gt;OMP_NUM_THREADS=24&lt;/P&gt;
&lt;P&gt;However, when I run the program as a single process, its CPU usage is 50% during the solve. When I run two processes, each process uses 25% during the solve, and so on. In every case the total CPU usage is only 50%.&lt;/P&gt;
&lt;P&gt;What have I missed?&lt;/P&gt;
&lt;P&gt;Thanks,&lt;/P&gt;
&lt;P&gt;Zhanghong Tang&lt;/P&gt;</description>
      <pubDate>Mon, 04 Mar 2013 23:54:47 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/How-to-setup-environment-to-let-PARDISO-use-all-CPU-resource/m-p/961083#M15910</guid>
      <dc:creator>Zhanghong_T_</dc:creator>
      <dc:date>2013-03-04T23:54:47Z</dc:date>
    </item>
    <item>
      <title>Zhanghong,</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/How-to-setup-environment-to-let-PARDISO-use-all-CPU-resource/m-p/961084#M15911</link>
      <description>&lt;P&gt;Zhanghong,&lt;/P&gt;
&lt;P&gt;On a system with Hyper-Threading enabled, it is recommended to set the number of application threads to the number of physical cores, that is, half the number of logical processors.&lt;BR /&gt;You can find more details in this post:&lt;BR /&gt;&lt;A href="http://software.intel.com/en-us/forums/topic/294954"&gt;http://software.intel.com/en-us/forums/topic/294954&lt;/A&gt;&lt;/P&gt;
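&lt;P&gt;With the default MKL_DYNAMIC=TRUE, Intel MKL may reduce the number of threads to the number of physical cores, which would explain the 50% CPU usage. You can use the following settings to force more threads:&lt;BR /&gt;MKL_DYNAMIC=FALSE&lt;BR /&gt;MKL_NUM_THREADS=number of threads&lt;/P&gt;
&lt;P&gt;However, this may not improve the application's performance.&lt;/P&gt;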
&lt;P&gt;Regards,&lt;BR /&gt;Chao&lt;/P&gt;</description>
      <pubDate>Tue, 05 Mar 2013 02:04:49 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/How-to-setup-environment-to-let-PARDISO-use-all-CPU-resource/m-p/961084#M15911</guid>
      <dc:creator>Chao_Y_Intel</dc:creator>
      <dc:date>2013-03-05T02:04:49Z</dc:date>
    </item>
    <item>
      <title>Hi Chao,</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/How-to-setup-environment-to-let-PARDISO-use-all-CPU-resource/m-p/961085#M15912</link>
      <description>&lt;P&gt;Hi Chao,&lt;/P&gt;
&lt;P&gt;Thank you very much for your kind reply. So you recommend the following settings?&lt;/P&gt;
&lt;P&gt;MKL_NUM_THREADS=12&lt;BR /&gt;MKL_DOMAIN_ALL=12&lt;BR /&gt;OMP_NUM_THREADS=12&lt;/P&gt;
&lt;P&gt;Thanks,&lt;/P&gt;
&lt;P&gt;Zhanghong Tang&lt;/P&gt;</description>
      <pubDate>Tue, 05 Mar 2013 03:20:26 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/How-to-setup-environment-to-let-PARDISO-use-all-CPU-resource/m-p/961085#M15912</guid>
      <dc:creator>Zhanghong_T_</dc:creator>
      <dc:date>2013-03-05T03:20:26Z</dc:date>
    </item>
    <item>
      <title>Zhanghong,</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/How-to-setup-environment-to-let-PARDISO-use-all-CPU-resource/m-p/961086#M15913</link>
      <description>&lt;P&gt;Zhanghong,&lt;/P&gt;
&lt;P&gt;Yes, it is recommended to use 12 threads. You only need to set one of MKL_NUM_THREADS, OMP_NUM_THREADS, or MKL_DOMAIN_NUM_THREADS.&lt;/P&gt;
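&lt;P&gt;As a minimal sketch, assuming a Windows command prompt (cmd.exe) and the 12 physical cores of the two 6-core X5675 CPUs discussed above, the variable could be set in the same session before launching the program. Choosing MKL_NUM_THREADS here is only an illustration; either of the other two variables could be used instead.&lt;/P&gt;
&lt;P&gt;rem Assumption: 12 physical cores (24 logical cores with Hyper-Threading)&lt;BR /&gt;set MKL_NUM_THREADS=12&lt;/P&gt;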
&lt;P&gt;Thanks,&lt;BR /&gt;Chao&lt;/P&gt;</description>
      <pubDate>Tue, 12 Mar 2013 07:33:11 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/How-to-setup-environment-to-let-PARDISO-use-all-CPU-resource/m-p/961086#M15913</guid>
      <dc:creator>Chao_Y_Intel</dc:creator>
      <dc:date>2013-03-12T07:33:11Z</dc:date>
    </item>
    <item>
      <title>Hi Chao,</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/How-to-setup-environment-to-let-PARDISO-use-all-CPU-resource/m-p/961087#M15914</link>
      <description>&lt;P&gt;Hi Chao,&lt;/P&gt;
&lt;P&gt;Thank you very much for your kind suggestion and explanation.&lt;/P&gt;
&lt;P&gt;Thanks,&lt;/P&gt;
&lt;P&gt;Zhanghong Tang&lt;/P&gt;</description>
      <pubDate>Wed, 13 Mar 2013 05:10:15 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/How-to-setup-environment-to-let-PARDISO-use-all-CPU-resource/m-p/961087#M15914</guid>
      <dc:creator>Zhanghong_T_</dc:creator>
      <dc:date>2013-03-13T05:10:15Z</dc:date>
    </item>
  </channel>
</rss>

