<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Thank you very much! in Intel® oneAPI Math Kernel Library</title>
    <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Complexity-of-functions-potrs-potrf-and-cblas-dgemm/m-p/1110850#M24339</link>
    <description>&lt;P&gt;Thank you very much!&lt;/P&gt;</description>
    <pubDate>Tue, 12 Jul 2016 08:57:26 GMT</pubDate>
    <dc:creator>Fabian_K_1</dc:creator>
    <dc:date>2016-07-12T08:57:26Z</dc:date>
    <item>
      <title>Complexity of functions ?potrs, ?potrf and cblas_dgemm</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Complexity-of-functions-potrs-potrf-and-cblas-dgemm/m-p/1110848#M24337</link>
      <description>&lt;P&gt;Dear MKL forum,&lt;/P&gt;

&lt;P&gt;I'm using the "?potrf" functions for the Cholesky factorization of a matrix and "?potrs" for solving a system of linear equations. In addition, I need "cblas_dgemm" (matrix multiplication) for further calculations. These functions are used in a distributed system with multiple servers, and for optimal load balancing I need the exact complexity of each of these algorithms (in big O notation). I would rather not rely on the complexities given in the standard literature, because the MKL functions are optimized and I assume they do not follow the usual complexities.&lt;/P&gt;

&lt;P&gt;Can you help me out?&lt;/P&gt;

&lt;P&gt;Best regards&lt;/P&gt;</description>
      <pubDate>Tue, 28 Jun 2016 11:08:42 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Complexity-of-functions-potrs-potrf-and-cblas-dgemm/m-p/1110848#M24337</guid>
      <dc:creator>Fabian_K_1</dc:creator>
      <dc:date>2016-06-28T11:08:42Z</dc:date>
    </item>
    <item>
      <title>Hi Fabian, </title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Complexity-of-functions-potrs-potrf-and-cblas-dgemm/m-p/1110849#M24338</link>
      <description>&lt;P&gt;Hi Fabian,&lt;/P&gt;

&lt;P&gt;That is a good question. I think I understand what you are asking, but I'm afraid it is likely to lead to some confusion between algorithmic complexity and MKL optimization.&lt;/P&gt;

&lt;P&gt;Actually, for MKL functions such as cblas_dgemm, we do not change the algorithmic complexity. The principle behind optimizing an MKL function is to make maximum use of the hardware resources: the functions are vectorized (making full use of SIMD instructions) and threaded (using all cores).&lt;/P&gt;
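
&lt;P&gt;Since the complexity is unchanged, the usual leading-order operation counts from the literature can serve as a first estimate of the work per call. The Python sketch below assumes the standard textbook counts (roughly n^3/3 for a Cholesky factorization, 2*n^2 per right-hand side for the two triangular solves, and 2*m*n*k for a general matrix product). The helper names are hypothetical, and the numbers are operation counts, not measured MKL run times.&lt;/P&gt;

```python
# Textbook leading-order flop counts. These are assumptions for estimation,
# not figures published for MKL, and lower-order terms are ignored.


def potrf_flops(n):
    """Cholesky factorization of an n x n matrix: about n^3 / 3 flops."""
    return n ** 3 / 3


def potrs_flops(n, nrhs):
    """Two triangular solves per right-hand side: about 2 * n^2 * nrhs flops."""
    return 2 * n ** 2 * nrhs


def dgemm_flops(m, n, k):
    """C = alpha*A*B + beta*C with A (m x k), B (k x n): about 2*m*n*k flops."""
    return 2 * m * n * k


if __name__ == "__main__":
    n = 1000
    print("potrf:", potrf_flops(n))
    print("potrs (1 rhs):", potrs_flops(n, 1))
    print("dgemm:", dgemm_flops(n, n, n))
```

&lt;P&gt;These counts only describe how the work grows with the matrix size. The time per operation still depends on the hardware, the number of threads, and the problem size.&lt;/P&gt;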

&lt;P&gt;?potrf and cblas_dgemm are vectorized and threaded, so they are multi-core ready. If you use these functions in a distributed system with multiple servers, then unless you have written your own higher-level parallel layer (such as MPI processes) to distribute the tasks, in most cases you can call them directly and they will use the available cores with good performance.&lt;/P&gt;

&lt;P&gt;If you do want to distribute the tasks yourself, for example five 1000x1000 cblas_dgemm calls on one server and two 12500x12500 cblas_dgemm calls on another, and you are worried about load imbalance, you can consider the relationship between algorithmic complexity and hardware resources. But as you can imagine, that relationship is not linear, and there is no exact formula to describe it. So I would recommend using system profiling tools. For example, for an Intel MPI program you can use ITAC to profile the run and adjust the workload.&lt;/P&gt;
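
&lt;P&gt;As a rough starting point before profiling, the example split above can be compared by estimated operation count. The Python sketch below assumes the textbook 2*n^3 count for a square dgemm and uses a simple largest-first greedy assignment. Both the helper names and the greedy strategy are illustrative assumptions, not something MKL provides, and real run times will deviate from these estimates.&lt;/P&gt;

```python
def dgemm_flops(n):
    """Leading-order flop estimate for a square n x n dgemm: 2 * n^3."""
    return 2 * n ** 3


def assign_greedy(task_sizes, num_servers):
    """Give each task, largest first, to the server with the least estimated work."""
    loads = [0] * num_servers
    plan = [[] for _ in range(num_servers)]
    for n in sorted(task_sizes, reverse=True):
        # Index of the currently least-loaded server (first one on ties).
        target = min(range(num_servers), key=loads.__getitem__)
        plan[target].append(n)
        loads[target] += dgemm_flops(n)
    return plan, loads


if __name__ == "__main__":
    small = 5 * dgemm_flops(1000)    # five 1000x1000 products
    large = 2 * dgemm_flops(12500)   # two 12500x12500 products
    print("estimated work ratio (large / small):", large / small)

    plan, loads = assign_greedy([1000] * 5 + [12500] * 2, 2)
    print("plan:", plan)
    print("estimated flops per server:", loads)
```

&lt;P&gt;By this estimate the two large products carry several hundred times the work of the five small ones, so a balanced plan puts one large product on each server. Profiling can then be used to correct the estimate for the actual hardware.&lt;/P&gt;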

&lt;P&gt;Best Regards,&lt;/P&gt;

&lt;P&gt;Ying&lt;/P&gt;</description>
      <pubDate>Fri, 01 Jul 2016 07:04:25 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Complexity-of-functions-potrs-potrf-and-cblas-dgemm/m-p/1110849#M24338</guid>
      <dc:creator>Ying_H_Intel</dc:creator>
      <dc:date>2016-07-01T07:04:25Z</dc:date>
    </item>
    <item>
      <title>Thank you very much!</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Complexity-of-functions-potrs-potrf-and-cblas-dgemm/m-p/1110850#M24339</link>
      <description>&lt;P&gt;Thank you very much!&lt;/P&gt;</description>
      <pubDate>Tue, 12 Jul 2016 08:57:26 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Complexity-of-functions-potrs-potrf-and-cblas-dgemm/m-p/1110850#M24339</guid>
      <dc:creator>Fabian_K_1</dc:creator>
      <dc:date>2016-07-12T08:57:26Z</dc:date>
    </item>
  </channel>
</rss>

