<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Intel MKL (CBLAS) doesn't support more than 8 processors. Is it true ? in Intel® oneAPI Math Kernel Library</title>
    <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Intel-MKL-CBLAS-doesn-t-support-more-than-8-processors-Is-it/m-p/860336#M7432</link>
    <description>Hi!&lt;BR /&gt;&lt;BR /&gt;I have a machine with 2 Intel Xeon CPU X5570 processors, so the number of logical cores is 16.&lt;BR /&gt;Now I am trying to run&lt;BR /&gt;&lt;BR /&gt;
&lt;PRE&gt;[cpp]mkl_set_num_threads(P);

cblas_sgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans, N, N, N, 1.0, A, N, B, N, 0.0, C, N);[/cpp]&lt;/PRE&gt;
&lt;BR /&gt;Then for P &amp;gt; 1 and P &amp;lt;= 8 with P odd, the program runs on P - 1 processors.&lt;BR /&gt;For P &amp;gt; 8, the program always runs on 8 processors.&lt;BR /&gt;&lt;BR /&gt;How can I force the program to use more than 8 processors?&lt;BR /&gt;&lt;BR /&gt;The MKL version used is 10.2.4.032.&lt;BR /&gt;&lt;BR /&gt;Thanks.</description>
    <pubDate>Sun, 28 Mar 2010 07:33:27 GMT</pubDate>
    <dc:creator>yuryserdyuk</dc:creator>
    <dc:date>2010-03-28T07:33:27Z</dc:date>
    <item>
      <title>Intel MKL (CBLAS) doesn't support more than 8 processors. Is it true ?</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Intel-MKL-CBLAS-doesn-t-support-more-than-8-processors-Is-it/m-p/860336#M7432</link>
      <description>Hi!&lt;BR /&gt;&lt;BR /&gt;I have a machine with 2 Intel Xeon CPU X5570 processors, so the number of logical cores is 16.&lt;BR /&gt;Now I am trying to run&lt;BR /&gt;&lt;BR /&gt;
&lt;PRE&gt;[cpp]mkl_set_num_threads(P);

cblas_sgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans, N, N, N, 1.0, A, N, B, N, 0.0, C, N);[/cpp]&lt;/PRE&gt;
&lt;BR /&gt;Then for P &amp;gt; 1 and P &amp;lt;= 8 with P odd, the program runs on P - 1 processors.&lt;BR /&gt;For P &amp;gt; 8, the program always runs on 8 processors.&lt;BR /&gt;&lt;BR /&gt;How can I force the program to use more than 8 processors?&lt;BR /&gt;&lt;BR /&gt;The MKL version used is 10.2.4.032.&lt;BR /&gt;&lt;BR /&gt;Thanks.</description>
      <pubDate>Sun, 28 Mar 2010 07:33:27 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Intel-MKL-CBLAS-doesn-t-support-more-than-8-processors-Is-it/m-p/860336#M7432</guid>
      <dc:creator>yuryserdyuk</dc:creator>
      <dc:date>2010-03-28T07:33:27Z</dc:date>
    </item>
    <item>
      <title>Intel MKL (CBLAS) doesn't support more than 8 processors. Is it</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Intel-MKL-CBLAS-doesn-t-support-more-than-8-processors-Is-it/m-p/860337#M7433</link>
      <description>Have you seen the previous discussions about how MKL uses 1 thread per core by default, unless you override it, in order to avoid an accidental performance reduction?</description>
      <pubDate>Sun, 28 Mar 2010 14:33:54 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Intel-MKL-CBLAS-doesn-t-support-more-than-8-processors-Is-it/m-p/860337#M7433</guid>
      <dc:creator>TimP</dc:creator>
      <dc:date>2010-03-28T14:33:54Z</dc:date>
    </item>
    <item>
      <title>Intel MKL (CBLAS) doesn't support more than 8 processors. Is it</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Intel-MKL-CBLAS-doesn-t-support-more-than-8-processors-Is-it/m-p/860338#M7434</link>
      <description>Yury,
&lt;DIV&gt;please try changing the MKL_DYNAMIC setting: mkl_set_dynamic( FALSE ). See the User's Guide for more details. Please note that in this case you may see performance degradation.
&lt;DIV&gt;--Gennady&lt;/DIV&gt;
&lt;/DIV&gt;</description>
      <pubDate>Mon, 29 Mar 2010 07:12:36 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Intel-MKL-CBLAS-doesn-t-support-more-than-8-processors-Is-it/m-p/860338#M7434</guid>
      <dc:creator>Gennady_F_Intel</dc:creator>
      <dc:date>2010-03-29T07:12:36Z</dc:date>
    </item>
    <item>
      <title>Intel MKL (CBLAS) doesn't support more than 8 processors. Is it</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Intel-MKL-CBLAS-doesn-t-support-more-than-8-processors-Is-it/m-p/860339#M7435</link>
      <description>Yes, you are right - mkl_set_dynamic helps, but performance degrades considerably:&lt;BR /&gt;
&lt;P&gt;&lt;/P&gt;
&lt;TABLE border="1" width="607" cellpadding="0" cellspacing="0"&gt;
&lt;TBODY&gt;
&lt;TR&gt;
&lt;TD width="79" valign="top"&gt;
&lt;P align="center"&gt;N&lt;/P&gt;
&lt;/TD&gt;
&lt;TD width="168" valign="top"&gt;
&lt;P align="center"&gt;cblas_sgemm (8 proc)&lt;/P&gt;
&lt;/TD&gt;
&lt;TD width="168" valign="top"&gt;
&lt;P align="center"&gt;cblas_sgemm(16 proc)&lt;/P&gt;
&lt;/TD&gt;
&lt;TD width="192" valign="top"&gt;
&lt;P align="center"&gt;cuBLAS(Tesla 1060 GPU)&lt;/P&gt;
&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD width="79" valign="top"&gt;
&lt;P align="center"&gt;8192&lt;/P&gt;
&lt;/TD&gt;
&lt;TD width="168" valign="top"&gt;
&lt;P align="center"&gt;6,06&lt;/P&gt;
&lt;/TD&gt;
&lt;TD width="168" valign="top"&gt;
&lt;P align="center"&gt;7,26&lt;/P&gt;
&lt;/TD&gt;
&lt;TD width="192" valign="top"&gt;
&lt;P align="center"&gt;2,71&lt;/P&gt;
&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD width="79" valign="top"&gt;
&lt;P align="center"&gt;10240&lt;/P&gt;
&lt;/TD&gt;
&lt;TD width="168" valign="top"&gt;
&lt;P align="center"&gt;11,72&lt;/P&gt;
&lt;/TD&gt;
&lt;TD width="168" valign="top"&gt;
&lt;P align="center"&gt;13,90&lt;/P&gt;
&lt;/TD&gt;
&lt;TD width="192" valign="top"&gt;
&lt;P align="center"&gt;5,26&lt;/P&gt;
&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD width="79" valign="top"&gt;
&lt;P align="center"&gt;12288&lt;/P&gt;
&lt;/TD&gt;
&lt;TD width="168" valign="top"&gt;
&lt;P align="center"&gt;20,23&lt;/P&gt;
&lt;/TD&gt;
&lt;TD width="168" valign="top"&gt;
&lt;P align="center"&gt;24,32&lt;/P&gt;
&lt;/TD&gt;
&lt;TD width="192" valign="top"&gt;
&lt;P align="center"&gt;9,07&lt;/P&gt;
&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD width="79" valign="top"&gt;
&lt;P align="center"&gt;14336&lt;/P&gt;
&lt;/TD&gt;
&lt;TD width="168" valign="top"&gt;
&lt;P align="center"&gt;32,16&lt;/P&gt;
&lt;/TD&gt;
&lt;TD width="168" valign="top"&gt;
&lt;P align="center"&gt;38,06&lt;/P&gt;
&lt;/TD&gt;
&lt;TD width="192" valign="top"&gt;
&lt;P align="center"&gt;14,37&lt;/P&gt;
&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD width="79" valign="top"&gt;
&lt;P align="center"&gt;16384&lt;/P&gt;
&lt;/TD&gt;
&lt;TD width="168" valign="top"&gt;
&lt;P align="center"&gt;48,46&lt;/P&gt;
&lt;/TD&gt;
&lt;TD width="168" valign="top"&gt;
&lt;P align="center"&gt;58,80&lt;/P&gt;
&lt;/TD&gt;
&lt;TD width="192" valign="top"&gt;
&lt;P align="center"&gt;21,42&lt;/P&gt;
&lt;/TD&gt;
&lt;/TR&gt;
&lt;TR&gt;
&lt;TD width="79" valign="top"&gt;
&lt;P align="center"&gt;18432&lt;/P&gt;
&lt;/TD&gt;
&lt;TD width="168" valign="top"&gt;
&lt;P align="center"&gt;68,59&lt;/P&gt;
&lt;/TD&gt;
&lt;TD width="168" valign="top"&gt;
&lt;P align="center"&gt;82,60&lt;/P&gt;
&lt;/TD&gt;
&lt;TD width="192" valign="top"&gt;
&lt;P align="center"&gt;30,46&lt;/P&gt;
&lt;/TD&gt;
&lt;/TR&gt;
&lt;/TBODY&gt;
&lt;/TABLE&gt;
&lt;P&gt;&lt;/P&gt;
&lt;P&gt;N is the matrix size, and times are given in seconds.&lt;BR /&gt;&lt;BR /&gt;So, obviously, Intel MKL doesn't scale beyond 8 processors on CPUs with Hyper-Threading...&lt;BR /&gt;&lt;BR /&gt;The same picture is observed for the cblas_dgemm function.&lt;/P&gt;</description>
      <pubDate>Mon, 29 Mar 2010 11:06:19 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Intel-MKL-CBLAS-doesn-t-support-more-than-8-processors-Is-it/m-p/860339#M7435</guid>
      <dc:creator>yuryserdyuk</dc:creator>
      <dc:date>2010-03-29T11:06:19Z</dc:date>
    </item>
    <item>
      <title>Intel MKL (CBLAS) doesn't support more than 8 processors. Is it</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Intel-MKL-CBLAS-doesn-t-support-more-than-8-processors-Is-it/m-p/860340#M7436</link>
      <description>This is the expected behavior of Intel MKL. We don't recommend enabling HT in this case.
&lt;DIV&gt;Please read more in the User's Guide section "The use of Hyper-Threading Technology".
&lt;DIV&gt;--Gennady&lt;/DIV&gt;
&lt;/DIV&gt;</description>
      <pubDate>Mon, 29 Mar 2010 15:25:30 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Intel-MKL-CBLAS-doesn-t-support-more-than-8-processors-Is-it/m-p/860340#M7436</guid>
      <dc:creator>Gennady_F_Intel</dc:creator>
      <dc:date>2010-03-29T15:25:30Z</dc:date>
    </item>
    <item>
      <title>Intel MKL (CBLAS) doesn't support more than 8 processors. Is it</title>
      <link>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Intel-MKL-CBLAS-doesn-t-support-more-than-8-processors-Is-it/m-p/860341#M7437</link>
      <description>That section is in the user guide, found in the Documentation/en_us/mkl/ directory of the compiler installation, page 6-16. It can't be found by the search function in Adobe.&lt;BR /&gt;In short, MKL already schedules the floating-point adder and multiplier to full effectiveness when running 1 thread per core, and the hyper-threads share the paths to higher-level cache and memory, so the interference effect of additional threads should not be a surprise.</description>
      <pubDate>Mon, 29 Mar 2010 15:54:53 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-oneAPI-Math-Kernel-Library/Intel-MKL-CBLAS-doesn-t-support-more-than-8-processors-Is-it/m-p/860341#M7437</guid>
      <dc:creator>TimP</dc:creator>
      <dc:date>2010-03-29T15:54:53Z</dc:date>
    </item>
  </channel>
</rss>

