<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic pardiso PROCESSORS in Intel® oneAPI Math Kernel Library</title>
    <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/pardiso-PROCESSORS/m-p/878056#M9219</link>
    <description>&lt;P&gt;Hello sir,&lt;/P&gt;
&lt;P&gt;I have used&lt;/P&gt;
&lt;P&gt;call &lt;B&gt;mkl_set_num_threads(16)&lt;/B&gt;&lt;/P&gt;
&lt;P&gt;but I still get this statement in the output:&lt;/P&gt;
&lt;P&gt;parallel direct factorization with processors: &amp;gt; 8&lt;/P&gt;
&lt;P&gt;The arrays a, b, ia, ja, x are already dynamically allocated at the beginning of the code.&lt;/P&gt;</description>
    <pubDate>Fri, 05 Mar 2010 10:08:40 GMT</pubDate>
    <dc:creator>ahmediiit</dc:creator>
    <dc:date>2010-03-05T10:08:40Z</dc:date>
    <item>
      <title>pardiso PROCESSORS</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/pardiso-PROCESSORS/m-p/878050#M9213</link>
      <description>&lt;P&gt;Hello sir,&lt;/P&gt;
&lt;P&gt;I am getting the following results.&lt;/P&gt;
&lt;P&gt;When compiling in Debug Win32 mode:&lt;/P&gt;
&lt;P&gt;number1=omp_get_max_threads()&lt;/P&gt;
&lt;P&gt;call mkl_set_num_threads( number2 )&lt;/P&gt;
&lt;P&gt;Result:&lt;/P&gt;
&lt;P&gt;number1=2400 (or some other number)&lt;/P&gt;
&lt;P&gt;number2=16&lt;/P&gt;
&lt;P&gt;When compiling in Release x64 mode:&lt;/P&gt;
&lt;P&gt;number1=omp_get_max_threads()&lt;/P&gt;
&lt;P&gt;call mkl_set_num_threads( number2 )&lt;/P&gt;
&lt;P&gt;Result:&lt;/P&gt;
&lt;P&gt;number1=0&lt;/P&gt;
&lt;P&gt;number2=16&lt;/P&gt;
&lt;P&gt;I need to set the stack reserve size to 21285000;&lt;/P&gt;
&lt;P&gt;otherwise I get a stack overflow error.&lt;/P&gt;
&lt;P&gt;In debug mode I get an error for large problems.&lt;/P&gt;
&lt;P&gt;Please help me make the code run faster so that it uses all 16 processors.&lt;/P&gt;
&lt;P&gt;When I start the code I see 100% CPU usage (due to parallelization), but when it enters the pardiso subroutine it shows 50% CPU usage.&lt;/P&gt;
&lt;P&gt;Please help so that all 16 processors are working in the pardiso subroutine.&lt;/P&gt;</description>
      <pubDate>Thu, 04 Mar 2010 10:06:52 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/pardiso-PROCESSORS/m-p/878050#M9213</guid>
      <dc:creator>ahmediiit</dc:creator>
      <dc:date>2010-03-04T10:06:52Z</dc:date>
    </item>
    <item>
      <title>pardiso PROCESSORS</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/pardiso-PROCESSORS/m-p/878051#M9214</link>
      <description>&lt;P&gt;Hello Ahmed,&lt;/P&gt;
&lt;P&gt;Actually, we recommend using mkl_get_max_threads() instead of the omp_get_max_threads() you used, because&lt;/P&gt;
&lt;P&gt;1) Intel MKL threading controls take precedence over the OpenMP techniques, and&lt;/P&gt;
&lt;P&gt;2) you don't need to include the omp header file in your application.&lt;/P&gt;
&lt;P&gt;In any case, if everything is done properly, the results should be the same.&lt;/P&gt;
&lt;P&gt;For example, please try something like the code below and see what you get:&lt;/P&gt;
&lt;P&gt;
&lt;/P&gt;&lt;DIV id="_mcePaste"&gt;
&lt;DIV id="_mcePaste"&gt;#include "mkl.h"&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;#include &amp;lt;stdio.h&amp;gt;&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;#include &amp;lt;omp.h&amp;gt;&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;int main( void ){&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;int number_omp =omp_get_max_threads();&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;printf("\n\t number_omp == %d \n", number_omp);&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;int number_mkl = mkl_get_max_threads();&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;printf("\n\t number_mkl == %d \n", number_mkl);&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;return 0;&lt;/DIV&gt;
&lt;DIV id="_mcePaste"&gt;}&lt;/DIV&gt;
&lt;/DIV&gt;
&lt;DIV&gt;On my side (a 2-core system) I get:&lt;/DIV&gt;
&lt;DIV&gt;&lt;/DIV&gt;
&lt;DIV&gt;number_omp == 2&lt;/DIV&gt;
&lt;DIV&gt;number_mkl == 2&lt;/DIV&gt;
&lt;DIV&gt;Press any key to continue . . .&lt;/DIV&gt;
&lt;P&gt;&lt;/P&gt;
&lt;P&gt;--Gennady&lt;/P&gt;</description>
      <pubDate>Thu, 04 Mar 2010 14:20:32 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/pardiso-PROCESSORS/m-p/878051#M9214</guid>
      <dc:creator>Gennady_F_Intel</dc:creator>
      <dc:date>2010-03-04T14:20:32Z</dc:date>
    </item>
    <item>
      <title>pardiso PROCESSORS</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/pardiso-PROCESSORS/m-p/878052#M9215</link>
      <description>&lt;P&gt;Ahmed,&lt;/P&gt;
&lt;P&gt;You don't need to set the stack size explicitly as you did in this case.&lt;/P&gt;
&lt;P&gt;What is your task size? I mean the number of equations and the number of nonzeros (nnz). How did you allocate the working arrays (a, ja, ia)?&lt;/P&gt;
&lt;P&gt;--Gennady&lt;/P&gt;
&lt;P&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 04 Mar 2010 14:29:39 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/pardiso-PROCESSORS/m-p/878052#M9215</guid>
      <dc:creator>Gennady_F_Intel</dc:creator>
      <dc:date>2010-03-04T14:29:39Z</dc:date>
    </item>
    <item>
      <title>pardiso PROCESSORS</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/pardiso-PROCESSORS/m-p/878053#M9216</link>
      <description>&lt;P&gt;My system is a Xeon E5520 at 2.27 GHz with eight processors and 24 GB RAM,&lt;/P&gt;
&lt;P&gt;running a 64-bit OS.&lt;/P&gt;
&lt;P&gt;When I check the processors in Device Manager, I find 16 processors.&lt;/P&gt;
&lt;P&gt;The arrays a, ja, ia are allocatable arrays initialized at the beginning of the code.&lt;/P&gt;
&lt;P&gt;Presently I am solving about 600000 equations with 150,00000 nonzeros.&lt;/P&gt;
&lt;P&gt;I still need to increase the number of equations.&lt;/P&gt;
&lt;P&gt;Each iteration takes 5 minutes, and I have to run many iterations.&lt;/P&gt;
&lt;P&gt;While the program is running I see only 50% CPU usage,&lt;/P&gt;
&lt;P&gt;with only 8 slots running in Task Manager.&lt;/P&gt;
&lt;P&gt;How can I make it 100%?&lt;/P&gt;
&lt;P&gt;If I don't set the stack size, I get the stack overflow error.&lt;/P&gt;
&lt;P&gt;&lt;BR /&gt;
***********************subroutine used *******************&lt;BR /&gt;
******************************pardiso subroutine*******************************&lt;BR /&gt;
&lt;BR /&gt;
subroutine mklpardiso(a,ja,ia,b,nc,n)&lt;BR /&gt;
IMPLICIT NONE&lt;BR /&gt;
include 'mkl_pardiso.f77'&lt;BR /&gt;
INTEGER*8 pt(64)&lt;BR /&gt;
C.. All other variables&lt;BR /&gt;
INTEGER maxfct, mnum,nc,mtype, phase, n, nrhs, error, msglvl&lt;BR /&gt;
INTEGER iparm(64)&lt;BR /&gt;
INTEGER ia(n+1)&lt;BR /&gt;
INTEGER ja(nc)&lt;BR /&gt;
REAL*8 a(nc)&lt;BR /&gt;
REAL*8 b(n)&lt;BR /&gt;
REAL*8 x(n)&lt;BR /&gt;
INTEGER i, idum&lt;BR /&gt;
REAL*8 waltime1, waltime2, ddum&lt;BR /&gt;
C.. Fill all arrays containing matrix data.&lt;BR /&gt;
DATA nrhs /1/, maxfct /1/, mnum /1/&lt;BR /&gt;
&lt;BR /&gt;
do i = 1, 64&lt;BR /&gt;
iparm(i) = 0&lt;BR /&gt;
end do&lt;BR /&gt;
iparm(1) = 1 ! no solver default&lt;BR /&gt;
iparm(2) = 3 ! fill-in reordering from METIS (3 = OpenMP version)&lt;BR /&gt;
iparm(3) = 16 ! number of processors&lt;BR /&gt;
iparm(4) = 0 ! no iterative-direct algorithm&lt;BR /&gt;
iparm(5) = 0 ! no user fill-in reducing permutation&lt;BR /&gt;
iparm(6) = 0 ! =0 solution on the first n components of x&lt;BR /&gt;
iparm(7) = 0 ! not in use&lt;BR /&gt;
iparm(8) = 9 ! number of iterative refinement steps&lt;BR /&gt;
iparm(9) = 0 ! not in use&lt;BR /&gt;
iparm(10) = 13 ! perturb the pivot elements with 1E-13&lt;BR /&gt;
iparm(11) = 1 ! use nonsymmetric permutation and scaling MPS&lt;BR /&gt;
iparm(12) = 0 ! not in use&lt;BR /&gt;
iparm(13) = 0 ! maximum weighted matching algorithm is switched off&lt;BR /&gt;
iparm(14) = 0 ! Output: number of perturbed pivots&lt;BR /&gt;
iparm(15) = 0 ! not in use&lt;BR /&gt;
iparm(16) = 0 ! not in use&lt;BR /&gt;
iparm(17) = 0 ! not in use&lt;BR /&gt;
iparm(18) = -1 ! Output: number of nonzeros in the factor LU&lt;BR /&gt;
iparm(19) = -1 ! Output: Mflops for LU factorization&lt;BR /&gt;
iparm(20) = 0 ! Output: number of CG iterations&lt;BR /&gt;
iparm(60) = 0&lt;BR /&gt;
error = 0 ! initialize error flag&lt;BR /&gt;
msglvl = 1 ! print statistical information&lt;BR /&gt;
mtype = 2 ! real symmetric positive definite&lt;BR /&gt;
&lt;BR /&gt;
phase = 11 ! only reordering and symbolic factorization&lt;BR /&gt;
CALL pardiso (pt, maxfct, mnum, mtype, phase, n, a, ia, ja,&lt;BR /&gt;
1 idum, nrhs, iparm, msglvl, ddum, ddum, error)&lt;BR /&gt;
.&lt;BR /&gt;
phase = 22 ! only factorization&lt;BR /&gt;
CALL pardiso (pt, maxfct, mnum, mtype, phase, n, a, ia, ja,&lt;BR /&gt;
1 idum, nrhs, iparm, msglvl, ddum, ddum, error)&lt;BR /&gt;
&lt;BR /&gt;
iparm(8) = 2 ! max number of iterative refinement steps&lt;BR /&gt;
phase = 33 ! solve with iterative refinement&lt;BR /&gt;
&lt;BR /&gt;
CALL pardiso (pt, maxfct, mnum, mtype, phase, n, a, ia, ja,&lt;BR /&gt;
1 idum, nrhs, iparm, msglvl, b, x, error)&lt;BR /&gt;
&lt;BR /&gt;
b=x&lt;BR /&gt;
&lt;BR /&gt;
phase = -1 ! release internal memory&lt;BR /&gt;
&lt;BR /&gt;
CALL pardiso (pt, maxfct, mnum, mtype, phase, n, ddum, idum, idum,&lt;BR /&gt;
1 idum, nrhs, iparm, msglvl, ddum, ddum, error)&lt;BR /&gt;
&lt;BR /&gt;
return&lt;BR /&gt;
END&lt;/P&gt;</description>
      <pubDate>Fri, 05 Mar 2010 03:11:05 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/pardiso-PROCESSORS/m-p/878053#M9216</guid>
      <dc:creator>ahmediiit</dc:creator>
      <dc:date>2010-03-05T03:11:05Z</dc:date>
    </item>
    <item>
      <title>pardiso PROCESSORS</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/pardiso-PROCESSORS/m-p/878054#M9217</link>
      <description>&lt;P&gt;Ahmed,&lt;/P&gt;
&lt;P&gt;Your system has 16 logical processors but only 8 physical cores, because of Hyper-Threading. MKL therefore decides that it is better to run the code with 8 threads rather than 16. I think your program already runs under optimal conditions.&lt;/P&gt;
&lt;P&gt;However, if you would like to set exactly 16 threads to compare performance, please set the environment variable MKL_NUM_THREADS=16, or call mkl_set_num_threads(16) in your program.&lt;/P&gt;
&lt;P&gt;Please note: iparm(3) is not used in the current version of MKL PARDISO for setting the number of threads.&lt;/P&gt;
&lt;P&gt;Best regards,&lt;/P&gt;
&lt;P&gt;Konstantin&lt;/P&gt;</description>
      <pubDate>Fri, 05 Mar 2010 07:06:14 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/pardiso-PROCESSORS/m-p/878054#M9217</guid>
      <dc:creator>Konstantin_A_Intel</dc:creator>
      <dc:date>2010-03-05T07:06:14Z</dc:date>
    </item>
    <item>
      <title>pardiso PROCESSORS</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/pardiso-PROCESSORS/m-p/878055#M9218</link>
      <description>&lt;P&gt;Ahmed,&lt;/P&gt;
&lt;P&gt;The task that you are solving is pretty big (about 600000 equations with 150,00000 nonzeros, as you wrote), so static allocation is not a good idea.&lt;/P&gt;
&lt;P&gt;Could you please try allocating all working arrays dynamically instead of statically:&lt;/P&gt;
&lt;P&gt;ALLOCATE( ja( nnonzeros ), a( nnonzeros ), b( n ), ia( n + 1 ), x( n ), r( n ))&lt;/P&gt;
&lt;P&gt;--Gennady&lt;/P&gt;
&lt;P&gt;&lt;/P&gt;</description>
      <pubDate>Fri, 05 Mar 2010 08:11:44 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/pardiso-PROCESSORS/m-p/878055#M9218</guid>
      <dc:creator>Gennady_F_Intel</dc:creator>
      <dc:date>2010-03-05T08:11:44Z</dc:date>
    </item>
    <item>
      <title>pardiso PROCESSORS</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/pardiso-PROCESSORS/m-p/878056#M9219</link>
      <description>&lt;P&gt;Hello sir,&lt;/P&gt;
&lt;P&gt;I have used&lt;/P&gt;
&lt;P&gt;call &lt;B&gt;mkl_set_num_threads(16)&lt;/B&gt;&lt;/P&gt;
&lt;P&gt;but I still get this statement in the output:&lt;/P&gt;
&lt;P&gt;parallel direct factorization with processors: &amp;gt; 8&lt;/P&gt;
&lt;P&gt;The arrays a, b, ia, ja, x are already dynamically allocated at the beginning of the code.&lt;/P&gt;</description>
      <pubDate>Fri, 05 Mar 2010 10:08:40 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/pardiso-PROCESSORS/m-p/878056#M9219</guid>
      <dc:creator>ahmediiit</dc:creator>
      <dc:date>2010-03-05T10:08:40Z</dc:date>
    </item>
  </channel>
</rss>

