<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Performance of the test cases in Intel® oneAPI Math Kernel Library</title>
    <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/What-performance-I-should-expect-from-following-code/m-p/918902#M12835</link>
    <description>Performance of the test cases depends on an Intel instruction set selected during CPU dispatching ( mkl_core.dll -&amp;gt; mkl_rt.dll -&amp;gt; some MKL CPU dispatching DLL ). You have not provided any details about OS and hardware.</description>
    <pubDate>Fri, 06 Sep 2013 13:14:39 GMT</pubDate>
    <dc:creator>SergeyKostrov</dc:creator>
    <dc:date>2013-09-06T13:14:39Z</dc:date>
    <item>
      <title>What performance I should expect from following code</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/What-performance-I-should-expect-from-following-code/m-p/918901#M12834</link>
      <description>&lt;P&gt;Consider the following two pieces of code:&lt;/P&gt;
&lt;P&gt;/* Perform LU factorization and store in DSS_handle */&lt;BR /&gt;for(k = 0; k &amp;lt; N; k++){&lt;BR /&gt;gettimeofday(&amp;amp;stTime, NULL);&lt;BR /&gt;//DSS solver options&lt;BR /&gt;MKL_INT solOpt = (MKL_DSS_DEFAULTS | MKL_DSS_REFINEMENT_OFF) | MKL_DSS_TRANSPOSE_SOLVE;&lt;BR /&gt;MKL_INT nRhs = 3;&lt;BR /&gt;dss_solve_real(DSS_handle, solOpt, bufferRHS, nRhs, bufferX3);&lt;BR /&gt;dssSolCnt++;&lt;BR /&gt;gettimeofday(&amp;amp;endTime, NULL);&lt;BR /&gt;/* Elapsed microseconds; seconds are subtracted first to avoid integer overflow */&lt;BR /&gt;dssSolTime += (double)(endTime.tv_sec - stTime.tv_sec)*1000000.0 + (double)(endTime.tv_usec - stTime.tv_usec);&lt;BR /&gt;/* Do some other things */&lt;BR /&gt;}&lt;/P&gt;
&lt;P&gt;For this code, dssSolTime, which represents the time required to perform the forward and backward solves, is 19.87 sec for a 3408 x 3408 matrix.&lt;/P&gt;
&lt;P&gt;Now, if I do the same calculations sequentially using the following code,&lt;/P&gt;
&lt;P&gt;/* Perform LU factorization and store in DSS_handle */&lt;BR /&gt;for(k = 0; k &amp;lt; N; k++){&lt;BR /&gt;gettimeofday(&amp;amp;stTime, NULL);&lt;BR /&gt;//DSS solver options&lt;BR /&gt;MKL_INT solOpt = (MKL_DSS_DEFAULTS | MKL_DSS_REFINEMENT_OFF) | MKL_DSS_TRANSPOSE_SOLVE;&lt;BR /&gt;MKL_INT nRhs = 1;&lt;BR /&gt;dss_solve_real(DSS_handle, solOpt, bufferRHS, nRhs, bufferX3);&lt;BR /&gt;dss_solve_real(DSS_handle, solOpt, bufferRHS+numOfEqs, nRhs, bufferX3+numOfEqs);&lt;BR /&gt;dss_solve_real(DSS_handle, solOpt, bufferRHS+2*numOfEqs, nRhs, bufferX3+2*numOfEqs);&lt;BR /&gt;dssSolCnt++;&lt;BR /&gt;gettimeofday(&amp;amp;endTime, NULL);&lt;BR /&gt;/* Elapsed microseconds; seconds are subtracted first to avoid integer overflow */&lt;BR /&gt;dssSolTime += (double)(endTime.tv_sec - stTime.tv_sec)*1000000.0 + (double)(endTime.tv_usec - stTime.tv_usec);&lt;BR /&gt;/* Do some other things */&lt;BR /&gt;}&lt;/P&gt;
&lt;P&gt;it completes the computations much faster, and dssSolTime is 2.04 sec for the same matrix (almost 10 times faster than when I ask dss_solve_real to solve for all right-hand-side vectors at once).&lt;/P&gt;
&lt;P&gt;I assumed that dss_solve_real is smart enough to create three threads and solve for all right-hand-side vectors simultaneously. Therefore, I expected the first code to be three times faster than the second. However, the huge performance degradation suggests that I may be missing something. I would appreciate it if you could let me know whether dss_solve_real can solve for three right-hand-side vectors in parallel. Also, please let me know what I should expect from these two versions and which one should be faster.&lt;/P&gt;
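<![CDATA[<P>For reference, the elapsed-time arithmetic used in both loops can be written as a small standalone helper. This is only a minimal sketch: elapsed_usec is a hypothetical name (it is not part of MKL), and it assumes both samples come from gettimeofday. It subtracts the seconds before scaling, so the result does not depend on how wide the integer types in struct timeval are.</P>

```c
#include <assert.h>
#include <stddef.h>
#include <sys/time.h>

/* Hypothetical helper (not part of MKL): microseconds elapsed between two
 * gettimeofday() samples. The seconds are subtracted before scaling to
 * microseconds, so the intermediate value stays small. */
double elapsed_usec(const struct timeval *start, const struct timeval *end)
{
    assert(start != NULL && end != NULL);
    double sec  = (double)(end->tv_sec - start->tv_sec);
    double usec = (double)(end->tv_usec - start->tv_usec);
    return sec * 1e6 + usec;
}
```

<P>Inside either loop it would be used as dssSolTime += elapsed_usec(&stTime, &endTime); right after the second gettimeofday call.</P>]]>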
&lt;P&gt;Thanks&lt;/P&gt;</description>
      <pubDate>Wed, 04 Sep 2013 22:35:35 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/What-performance-I-should-expect-from-following-code/m-p/918901#M12834</guid>
      <dc:creator>Pouya_Z_</dc:creator>
      <dc:date>2013-09-04T22:35:35Z</dc:date>
    </item>
    <item>
      <title>Performance of the test cases</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/What-performance-I-should-expect-from-following-code/m-p/918902#M12835</link>
      <description>Performance of the test cases depends on an Intel instruction set selected during CPU dispatching ( mkl_core.dll -&amp;gt; mkl_rt.dll -&amp;gt; some MKL CPU dispatching DLL ). You have not provided any details about OS and hardware.</description>
      <pubDate>Fri, 06 Sep 2013 13:14:39 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/What-performance-I-should-expect-from-following-code/m-p/918902#M12835</guid>
      <dc:creator>SergeyKostrov</dc:creator>
      <dc:date>2013-09-06T13:14:39Z</dc:date>
    </item>
  </channel>
</rss>

