<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Question about multiple RHS in Intel® oneAPI Math Kernel Library</title>
    <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Question-about-multiple-RHS/m-p/837313#M6176</link>
    <description>&amp;gt;&amp;gt; That is, because we have to allocate and free memory every solve when 1-RHS?&lt;DIV&gt;&lt;/DIV&gt;&lt;DIV&gt;Not exactly. When we solve with 1 RHS, we treat the RHS as a vector and use matrix-vector (MV) operations to compute the forward and backward substitutions (the solve phase). With N RHS, we treat the RHS as a matrix and thus use matrix-matrix (MM) operations.&lt;/DIV&gt;&lt;DIV&gt;&lt;/DIV&gt;&lt;DIV&gt;MM operations are more efficient than MV operations, both theoretically and practically, on modern computers because memory operations are usually more expensive (time-consuming) than floating-point operations: each core has its own set of FP units, but memory bandwidth is often limited and shared between cores, which also share caches and so on. An MM product consists of ~N^2 memory ops and ~N^3 FP ops, while an MV product has ~N^2 of each. That is ~N FP ops per memory op for MM, but only ~1 for MV. So we can conclude that an MM product does not depend so much on memory operations (if the computations are implemented optimally, as is done in MKL), whereas an MV product is limited much more by memory bandwidth.&lt;/DIV&gt;&lt;DIV&gt;&lt;/DIV&gt;&lt;DIV&gt;Regards,&lt;/DIV&gt;&lt;DIV&gt;Konstantin&lt;/DIV&gt;</description>
    <pubDate>Wed, 25 Aug 2010 06:41:50 GMT</pubDate>
    <dc:creator>Konstantin_A_Intel</dc:creator>
    <dc:date>2010-08-25T06:41:50Z</dc:date>
    <item>
      <title>Question about multiple RHS</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Question-about-multiple-RHS/m-p/837288#M6151</link>
      <description>Dear All.&lt;BR /&gt;&lt;BR /&gt;I integrated the PARDISO multiple-RHS (mrhs) version &lt;BR /&gt;into our application.&lt;BR /&gt;I followed Kalinkin's instructions. &lt;BR /&gt;The solver ran normally and the results were correct.&lt;BR /&gt;But the problem is the solve time.&lt;BR /&gt;In phase=33, in the single-RHS case,&lt;BR /&gt;the solve time was 0.03450 s.&lt;BR /&gt;But in the 4-RHS case,&lt;BR /&gt;the solve time was 0.532629 s.&lt;BR /&gt;I expected almost the same run-time,&lt;BR /&gt;but the 4-RHS case was 15 times slower than&lt;BR /&gt;the single-RHS case.&lt;BR /&gt;I had heard that we can reduce run-time by using multiple RHS.&lt;BR /&gt;I tried some other test cases, but the results were similar.&lt;BR /&gt;&lt;BR /&gt;What overhead factors could cause such results?&lt;BR /&gt;Is it correct that PARDISO provides multithreading in the phase=33 solve step?&lt;BR /&gt;</description>
      <pubDate>Wed, 18 Aug 2010 06:20:18 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Question-about-multiple-RHS/m-p/837288#M6151</guid>
      <dc:creator>Bosun_Hwang</dc:creator>
      <dc:date>2010-08-18T06:20:18Z</dc:date>
    </item>
    <item>
      <title>Question about multiple RHS</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Question-about-multiple-RHS/m-p/837289#M6152</link>
      <description>Hi Bosun,&lt;DIV&gt;&lt;/DIV&gt;&lt;DIV&gt;Could you provide a few more details about your configuration?&lt;/DIV&gt;&lt;DIV&gt;Namely, what are your MKL version, OS and processor? How many cores are in your system? It would also be good to know the number of equations in your task.&lt;/DIV&gt;&lt;DIV&gt;&lt;/DIV&gt;&lt;DIV&gt;With this information we will be able to reproduce the situation and give you appropriate advice.&lt;/DIV&gt;&lt;DIV&gt;&lt;/DIV&gt;&lt;DIV&gt;Thanks,&lt;/DIV&gt;&lt;DIV&gt;Konstantin&lt;/DIV&gt;</description>
      <pubDate>Wed, 18 Aug 2010 06:50:09 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Question-about-multiple-RHS/m-p/837289#M6152</guid>
      <dc:creator>Konstantin_A_Intel</dc:creator>
      <dc:date>2010-08-18T06:50:09Z</dc:date>
    </item>
    <item>
      <title>Question about multiple RHS</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Question-about-multiple-RHS/m-p/837290#M6153</link>
      <description>&lt;P&gt;Dear Konstantin.&lt;BR /&gt;&lt;BR /&gt;MKL version is the latest version and&lt;BR /&gt;O/S is Redhat_AS4_U7, &lt;BR /&gt;CPU is Intel Xeon 3GHz 4-core&lt;BR /&gt;Memory is 64 GB&lt;BR /&gt;&lt;BR /&gt;Thank you.&lt;BR /&gt;&lt;BR /&gt;Regards.&lt;BR /&gt;B. Hwang&lt;/P&gt;</description>
      <pubDate>Wed, 18 Aug 2010 07:08:09 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Question-about-multiple-RHS/m-p/837290#M6153</guid>
      <dc:creator>Bosun_Hwang</dc:creator>
      <dc:date>2010-08-18T07:08:09Z</dc:date>
    </item>
    <item>
      <title>Question about multiple RHS</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Question-about-multiple-RHS/m-p/837291#M6154</link>
      <description>Ok, thank you! &lt;SPAN style="font-family: verdana, sans-serif;"&gt;And what is the number of equations? (Roughly: 100, 1000, 10000?)&lt;/SPAN&gt;&lt;DIV&gt;&lt;SPAN style="font-family: verdana, sans-serif;"&gt;&lt;BR /&gt;&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN style="font-family: verdana, sans-serif;"&gt;I'll make a couple of runs of a similar task and update you.&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN style="font-family: verdana, sans-serif;"&gt;&lt;BR /&gt;&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN style="font-family: verdana, sans-serif;"&gt;Regards,&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN style="font-family: verdana, sans-serif;"&gt;Konstantin&lt;/SPAN&gt;&lt;/DIV&gt;</description>
      <pubDate>Wed, 18 Aug 2010 07:20:39 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Question-about-multiple-RHS/m-p/837291#M6154</guid>
      <dc:creator>Konstantin_A_Intel</dc:creator>
      <dc:date>2010-08-18T07:20:39Z</dc:date>
    </item>
    <item>
      <title>Question about multiple RHS</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Question-about-multiple-RHS/m-p/837292#M6155</link>
      <description>Dear Konstantin&lt;BR /&gt;&lt;BR /&gt;We have many cases, from 10000 to millions of equations. &lt;BR /&gt;Here, I tested two cases, with 10000 and 150000 equations.&lt;BR /&gt;&lt;BR /&gt;Our application uses an iterative solve method.&lt;BR /&gt;I mean we perform reordering and factorization only once,&lt;BR /&gt;and then solve the equations repeatedly with changing right-hand sides.&lt;BR /&gt;Therefore, solve time is very critical in our application.&lt;BR /&gt;So we want to reduce the solve-phase run-time.&lt;BR /&gt;&lt;BR /&gt;As I mentioned above,&lt;BR /&gt;I used 4 RHS per iteration because of memory capacity.&lt;BR /&gt;So I set a new group of 4 RHS for every solve phase.&lt;BR /&gt;Apart from this, everything is the same as with a single RHS.&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;Regards.&lt;BR /&gt;B. Hwang&lt;BR /&gt;</description>
      <pubDate>Wed, 18 Aug 2010 07:43:54 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Question-about-multiple-RHS/m-p/837292#M6155</guid>
      <dc:creator>Bosun_Hwang</dc:creator>
      <dc:date>2010-08-18T07:43:54Z</dc:date>
    </item>
    <item>
      <title>Question about multiple RHS</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Question-about-multiple-RHS/m-p/837293#M6156</link>
      <description>Hi,&lt;DIV&gt;&lt;/DIV&gt;&lt;DIV&gt;I have a few more questions to get our input conditions 'aligned':&lt;/DIV&gt;&lt;DIV&gt;&lt;/DIV&gt;&lt;DIV&gt;1) Did you link against the libmkl_intel_thread library, not libmkl_sequential?&lt;/DIV&gt;&lt;DIV&gt;2) Did you set OMP_NUM_THREADS to any value?&lt;/DIV&gt;&lt;DIV&gt;3) Which type of matrix did you use in your test?&lt;/DIV&gt;&lt;DIV&gt;&lt;/DIV&gt;&lt;DIV&gt;Thanks,&lt;/DIV&gt;&lt;DIV&gt;Konstantin&lt;/DIV&gt;&lt;DIV&gt;&lt;/DIV&gt;</description>
      <pubDate>Wed, 18 Aug 2010 09:51:40 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Question-about-multiple-RHS/m-p/837293#M6156</guid>
      <dc:creator>Konstantin_A_Intel</dc:creator>
      <dc:date>2010-08-18T09:51:40Z</dc:date>
    </item>
    <item>
      <title>Question about multiple RHS</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Question-about-multiple-RHS/m-p/837294#M6157</link>
      <description>Dear Konstantin&lt;BR /&gt;&lt;BR /&gt;Below are my answers.&lt;BR /&gt;1) We linked the libmkl_intel_thread library.&lt;BR /&gt;So reordering &amp;amp; factorization ran faster than with a single thread.&lt;BR /&gt;2) I set OMP_NUM_THREADS = 4, because our system has 4 cores.&lt;BR /&gt;3) Positive definite symmetric matrix&lt;BR /&gt;&lt;BR /&gt;I also checked by switching the threading library to libmkl_sequential.&lt;BR /&gt;The result was that single-threaded (libmkl_sequential) is much faster than &lt;BR /&gt;multi-threaded (libmkl_intel_thread) in phase=33.&lt;BR /&gt;I don't understand this situation.&lt;BR /&gt;&lt;BR /&gt;Thanks.&lt;BR /&gt;&lt;BR /&gt;Regards.&lt;BR /&gt;B. Hwang</description>
      <pubDate>Thu, 19 Aug 2010 01:14:09 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Question-about-multiple-RHS/m-p/837294#M6157</guid>
      <dc:creator>Bosun_Hwang</dc:creator>
      <dc:date>2010-08-19T01:14:09Z</dc:date>
    </item>
    <item>
      <title>Question about multiple RHS</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Question-about-multiple-RHS/m-p/837295#M6158</link>
      <description>Bosun,&lt;DIV&gt;&lt;SPAN style="font-size: 10.8333px;"&gt;I think the fastest way to reproduce and resolve this issue is to provide us with the test case and the input data with which you encounter this problem.&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN style="font-size: 10.8333px;"&gt;--Gennady&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;BR /&gt;&lt;/DIV&gt;</description>
      <pubDate>Thu, 19 Aug 2010 04:42:00 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Question-about-multiple-RHS/m-p/837295#M6158</guid>
      <dc:creator>Gennady_F_Intel</dc:creator>
      <dc:date>2010-08-19T04:42:00Z</dc:date>
    </item>
    <item>
      <title>Question about multiple RHS</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Question-about-multiple-RHS/m-p/837296#M6159</link>
      <description>&lt;P&gt;Dear Fedorov.&lt;BR /&gt;&lt;BR /&gt;It is impossible for us to provide the test set.&lt;BR /&gt;But I'm sure this is not a test-case problem.&lt;BR /&gt;The test cases are just simple symmetric positive definite&lt;BR /&gt;matrices that were verified as valid&lt;BR /&gt;with another solver and with PARDISO in single-RHS mode.&lt;BR /&gt;As you have read in the posts above,&lt;BR /&gt;all operations were normal. &lt;BR /&gt;The problem is the speed of the solve phase (phase=33)&lt;BR /&gt;in multiple-RHS mode.&lt;BR /&gt;&lt;BR /&gt;Have you tried an application like this?&lt;BR /&gt;How did it go? Was multiple RHS faster than single RHS?&lt;BR /&gt;&lt;BR /&gt;Thank you.&lt;BR /&gt;&lt;BR /&gt;Regards.&lt;BR /&gt;B. Hwang&lt;/P&gt;</description>
      <pubDate>Thu, 19 Aug 2010 04:51:53 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Question-about-multiple-RHS/m-p/837296#M6159</guid>
      <dc:creator>Bosun_Hwang</dc:creator>
      <dc:date>2010-08-19T04:51:53Z</dc:date>
    </item>
    <item>
      <title>Question about multiple RHS</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Question-about-multiple-RHS/m-p/837297#M6160</link>
      <description>Hi Bosun,&lt;DIV&gt;&lt;/DIV&gt;&lt;DIV&gt;Indeed, we reproduced the problem you described: the performance of the parallel solve phase with a few RHS is low. We will investigate the problem and try to fix it ASAP.&lt;/DIV&gt;&lt;DIV&gt;&lt;/DIV&gt;&lt;DIV&gt;You may try to improve the performance of the parallel solve phase by increasing the number of RHS (if possible) to 16-32. I hope this will let you reduce the computation time per RHS.&lt;/DIV&gt;&lt;DIV&gt;&lt;/DIV&gt;&lt;DIV&gt;Thank you,&lt;/DIV&gt;&lt;DIV&gt;Konstantin&lt;/DIV&gt;</description>
      <pubDate>Fri, 20 Aug 2010 04:52:51 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Question-about-multiple-RHS/m-p/837297#M6160</guid>
      <dc:creator>Konstantin_A_Intel</dc:creator>
      <dc:date>2010-08-20T04:52:51Z</dc:date>
    </item>
    <item>
      <title>Question about multiple RHS</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Question-about-multiple-RHS/m-p/837298#M6161</link>
      <description>&lt;P&gt;Hi Bosun,&lt;BR /&gt;
Thanks for raising this problem. The issue has been submitted to our
internal development tracking database for further investigation; we will
inform you once an update becomes available.&lt;BR /&gt;
Here is a bug tracking number for your reference: DPD200190971&lt;BR /&gt;
Regards, Gennady&lt;/P&gt;</description>
      <pubDate>Fri, 20 Aug 2010 05:13:59 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Question-about-multiple-RHS/m-p/837298#M6161</guid>
      <dc:creator>Gennady_F_Intel</dc:creator>
      <dc:date>2010-08-20T05:13:59Z</dc:date>
    </item>
    <item>
      <title>Question about multiple RHS</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Question-about-multiple-RHS/m-p/837299#M6162</link>
      <description>&lt;P&gt;Hi Konstantin.&lt;BR /&gt;&lt;BR /&gt;Thank you for your reply.&lt;BR /&gt;I'm afraid there is indeed a problem.&lt;BR /&gt;I hope you can resolve it ASAP.&lt;BR /&gt;&lt;BR /&gt;And I have a question about increasing the number of RHS.&lt;BR /&gt;Our system has only 4 cores.&lt;BR /&gt;On this system, is it meaningful to use 16-32 RHS?&lt;BR /&gt;&lt;BR /&gt;Thank you.&lt;BR /&gt;&lt;BR /&gt;Regards.&lt;BR /&gt;B. Hwang&lt;/P&gt;</description>
      <pubDate>Fri, 20 Aug 2010 05:19:25 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Question-about-multiple-RHS/m-p/837299#M6162</guid>
      <dc:creator>Bosun_Hwang</dc:creator>
      <dc:date>2010-08-20T05:19:25Z</dc:date>
    </item>
    <item>
      <title>Question about multiple RHS</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Question-about-multiple-RHS/m-p/837300#M6163</link>
      <description>&lt;DIV id="Normalcontent" CONVITEM="http://schemas.microsoft.com/2008/10/sip/convItems" xmlns="http://schemas.microsoft.com/2008/10/sip/convItems" RTC="urn:microsoft-rtc-xslt-functions" MSXSL="urn:schemas-microsoft-com:xslt" XS="http://www.w3.org/2001/XMLSchema"&gt;&lt;DIV id="imcontent"&gt;
&lt;DIV&gt;
&lt;DIV&gt;
&lt;DIV&gt;Hi Bosun,&lt;/DIV&gt;&lt;DIV&gt;&lt;/DIV&gt;&lt;DIV&gt;Of course, solving 
more RHS is meaningful, because the parallel solve algorithm (where the N RHS are simply 
split across K threads) works better when N &amp;gt; K ("better" means that the 
computational time per RHS decreases). I would say that in this case threading 
overhead is not significant, since each thread has more work to do.&lt;/DIV&gt;&lt;DIV&gt;&lt;/DIV&gt;
&lt;DIV&gt;&lt;/DIV&gt;
&lt;DIV&gt;So it is only a 
question of your algorithm: how many independent RHS can it solve at once? If 
that can be 16-32 or even more, please try 
it.&lt;/DIV&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;DIV&gt;Regards,&lt;/DIV&gt;&lt;DIV&gt;Konstantin&lt;/DIV&gt;</description>
      <pubDate>Fri, 20 Aug 2010 09:57:14 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Question-about-multiple-RHS/m-p/837300#M6163</guid>
      <dc:creator>Konstantin_A_Intel</dc:creator>
      <dc:date>2010-08-20T09:57:14Z</dc:date>
    </item>
    <item>
      <title>Question about multiple RHS</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Question-about-multiple-RHS/m-p/837301#M6164</link>
      <description>Hi Konstantin&lt;BR /&gt;&lt;BR /&gt;I tried your advice.&lt;BR /&gt;But the N &amp;gt; K case is also slower than the 1 RHS case.&lt;BR /&gt;I hope the problem will be resolved ASAP.&lt;BR /&gt;&lt;BR /&gt;Thank you.&lt;BR /&gt;&lt;BR /&gt;Regards.&lt;BR /&gt;B. Hwang</description>
      <pubDate>Sat, 21 Aug 2010 05:37:01 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Question-about-multiple-RHS/m-p/837301#M6164</guid>
      <dc:creator>Bosun_Hwang</dc:creator>
      <dc:date>2010-08-21T05:37:01Z</dc:date>
    </item>
    <item>
      <title>Question about multiple RHS</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Question-about-multiple-RHS/m-p/837302#M6165</link>
      <description>Hi Bosun,&lt;DIV&gt;&lt;/DIV&gt;&lt;DIV&gt;What do you mean by "But, N &amp;gt; K case is also slower than 1 RHS case."? Do you mean that the time per RHS is larger with many RHS, i.e. time_1RHS &amp;lt; time_NRHS/N?&lt;/DIV&gt;&lt;DIV&gt;&lt;/DIV&gt;&lt;DIV&gt;Of course, solving N RHS cannot be faster than solving 1 RHS, but the time per RHS (right-hand side) is better.&lt;/DIV&gt;&lt;DIV&gt;&lt;/DIV&gt;&lt;DIV&gt;Here's an example:&lt;/DIV&gt;&lt;DIV&gt;I used a Linux server similar to yours and set OMP_NUM_THREADS=4:&lt;/DIV&gt;&lt;DIV&gt;&lt;/DIV&gt;&lt;DIV&gt;1  RHS - 0.033 sec&lt;/DIV&gt;&lt;DIV&gt;32 RHS - 0.175 sec&lt;/DIV&gt;&lt;DIV&gt;&lt;/DIV&gt;&lt;DIV&gt;In other words, in the second case I got 0.0055 sec per RHS, or ~6x scalability. Note that such good scalability was achieved because the more efficient level-3 BLAS is used in the implementation of the N-RHS solve phase (the N RHS vectors are treated as a matrix).&lt;/DIV&gt;&lt;DIV&gt;&lt;/DIV&gt;&lt;DIV&gt;Regards,&lt;/DIV&gt;&lt;DIV&gt;Konstantin&lt;/DIV&gt;</description>
      <pubDate>Sun, 22 Aug 2010 08:03:30 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Question-about-multiple-RHS/m-p/837302#M6165</guid>
      <dc:creator>Konstantin_A_Intel</dc:creator>
      <dc:date>2010-08-22T08:03:30Z</dc:date>
    </item>
    <item>
      <title>Question about multiple RHS</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Question-about-multiple-RHS/m-p/837303#M6166</link>
      <description>Hi Konstantin.&lt;BR /&gt;&lt;BR /&gt;Of course, I understood your explanation.&lt;BR /&gt;I mean that the total run-time with N RHS was&lt;BR /&gt;longer than N times the run-time with 1 RHS.&lt;BR /&gt;That is, the run-time per RHS wasn't improved by &lt;BR /&gt;increasing the number of RHS.&lt;BR /&gt;In our example, a 16-RHS test case, the solve times were&lt;BR /&gt;1 RHS - 0.023120s&lt;BR /&gt;16 RHS - 0.479048s&lt;BR /&gt;where 0.023120 * 16 = 0.370s&lt;BR /&gt;This means 16 RHS is slower per RHS than 1 RHS.&lt;BR /&gt;&lt;BR /&gt;Regards.&lt;BR /&gt;B. Hwang</description>
      <pubDate>Mon, 23 Aug 2010 06:23:27 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Question-about-multiple-RHS/m-p/837303#M6166</guid>
      <dc:creator>Bosun_Hwang</dc:creator>
      <dc:date>2010-08-23T06:23:27Z</dc:date>
    </item>
    <item>
      <title>Question about multiple RHS</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Question-about-multiple-RHS/m-p/837304#M6167</link>
      <description>&lt;DIV&gt;Hi Bosun,&lt;DIV&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;/DIV&gt;&lt;DIV&gt;Ok, I'm glad that we are both using the same terminology: all is clear now, thanks.&lt;/DIV&gt;&lt;DIV&gt;&lt;/DIV&gt;&lt;DIV&gt;As I mentioned, my times are a bit better (with 16 and 32 RHS, at least, I observe 2x to 6x scalability compared with 1 RHS).&lt;/DIV&gt;&lt;DIV&gt;&lt;/DIV&gt;&lt;DIV&gt;Would you please specify the exact name of your processor (like Intel Xeon CPU 5160) so that I can reproduce the situation where NRHS=16 is slower than 1 RHS?&lt;/DIV&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;/DIV&gt;&lt;DIV&gt;P.S. As Gennady mentioned, we are working on tracker DPD200190971 regarding the slow solve phase with a small number of RHS.&lt;/DIV&gt;&lt;DIV&gt;&lt;/DIV&gt;&lt;DIV&gt;Regards,&lt;/DIV&gt;&lt;DIV&gt;Konstantin&lt;/DIV&gt;</description>
      <pubDate>Mon, 23 Aug 2010 06:57:54 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Question-about-multiple-RHS/m-p/837304#M6167</guid>
      <dc:creator>Konstantin_A_Intel</dc:creator>
      <dc:date>2010-08-23T06:57:54Z</dc:date>
    </item>
    <item>
      <title>Question about multiple RHS</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Question-about-multiple-RHS/m-p/837305#M6168</link>
      <description>Hi Konstantin.&lt;BR /&gt;&lt;BR /&gt;Our system is Intel Xeon 3GHz x2 c2 x86_64.&lt;BR /&gt;Is that what you wanted to know?&lt;BR /&gt;&lt;BR /&gt;And I have a question!&lt;BR /&gt;We previously used the TAUCS solver as our calculation engine.&lt;BR /&gt;Although PARDISO was not running multithreaded in phase=33,&lt;BR /&gt;single-process PARDISO is more than twice as fast as the TAUCS solver.&lt;BR /&gt;&lt;BR /&gt;Could you tell me the reason? Is it because of an optimized compiler?&lt;BR /&gt;Or some special algorithms?&lt;BR /&gt;Please give a simple explanation.&lt;BR /&gt;&lt;BR /&gt;Best regards.&lt;BR /&gt;B. Hwang&lt;BR /&gt;</description>
      <pubDate>Mon, 23 Aug 2010 08:58:11 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Question-about-multiple-RHS/m-p/837305#M6168</guid>
      <dc:creator>Bosun_Hwang</dc:creator>
      <dc:date>2010-08-23T08:58:11Z</dc:date>
    </item>
    <item>
      <title>Question about multiple RHS</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Question-about-multiple-RHS/m-p/837306#M6169</link>
      <description>Hi Bosun,&lt;DIV&gt;&lt;/DIV&gt;&lt;DIV&gt;Could you send me (or attach) the output of this command?&lt;/DIV&gt;&lt;DIV&gt;dmesg | grep CPU | sort -u&lt;/DIV&gt;&lt;DIV&gt;&lt;/DIV&gt;&lt;DIV&gt;Regarding the performance difference with TAUCS: I do not know the exact reason. It probably comes from the more efficient matrix-vector operations that are implemented in MKL BLAS and used in phase=33. But I'm not 100% sure.&lt;/DIV&gt;&lt;DIV&gt;&lt;/DIV&gt;&lt;DIV&gt;Regards,&lt;/DIV&gt;&lt;DIV&gt;Konstantin&lt;/DIV&gt;&lt;DIV&gt;&lt;DIV&gt;&lt;/DIV&gt;&lt;/DIV&gt;</description>
      <pubDate>Mon, 23 Aug 2010 09:53:52 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Question-about-multiple-RHS/m-p/837306#M6169</guid>
      <dc:creator>Konstantin_A_Intel</dc:creator>
      <dc:date>2010-08-23T09:53:52Z</dc:date>
    </item>
    <item>
      <title>Question about multiple RHS</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Question-about-multiple-RHS/m-p/837307#M6170</link>
      <description>Hi Konstantin&lt;BR /&gt;&lt;BR /&gt;The CPU is&lt;BR /&gt;&lt;BR /&gt;Intel Xeon CPU 5160 @ 3.00GHz&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;Thank you.</description>
      <pubDate>Tue, 24 Aug 2010 01:03:19 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Question-about-multiple-RHS/m-p/837307#M6170</guid>
      <dc:creator>Bosun_Hwang</dc:creator>
      <dc:date>2010-08-24T01:03:19Z</dc:date>
    </item>
  </channel>
</rss>

