<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic How can I make Intel-MKL numpy *really* use all my CPU threads in Intel® Distribution for Python*</title>
    <link>https://community.intel.com/t5/Intel-Distribution-for-Python/How-can-I-make-Intel-MKL-numpy-really-use-all-my-CPU-threads/m-p/1146207#M1033</link>
    <description>&lt;P&gt;I just installed Intel-MKL numpy on a mostly new machine. Everything works well, with one exception: it keeps using only 4 CPU threads when my computer has 8.&lt;/P&gt;&lt;P&gt;Before anyone asks, I use Ubuntu 18.04, and when I run the following in the terminal:&lt;/P&gt;
&lt;PRE class="brush:bash; class-name:dark;"&gt;cat /proc/cpuinfo | grep processor | wc -l&lt;/PRE&gt;

&lt;P&gt;The output is "8", so I do have 8 hardware threads. Yet only 4 of them are being used, whether I check System Monitor, run "top" in the terminal, or run something like:&lt;/P&gt;

&lt;PRE class="brush:python; class-name:dark;"&gt;In [1]: import os

In [2]: os.environ['MKL_VERBOSE']="1"

In [3]: import numpy as np
Numpy + Intel(R) MKL: THREADING LAYER: (null)
Numpy + Intel(R) MKL: setting Intel(R) MKL to use INTEL OpenMP runtime
Numpy + Intel(R) MKL: preloading libiomp5.so runtime
MKL_VERBOSE Intel(R) MKL 2019.0 Product build 20180829 for Intel(R) 64 architecture Intel(R) Advanced Vector Extensions 2 (Intel(R) AVX2) enabled processors, Lnx 2.60GHz lp64 intel_thread
MKL_VERBOSE SDOT(2,0x5622fcf3f9c0,1,0x5622fcf3f9c0,1) 2.06ms CNR:OFF Dyn:1 FastMM:1 TID:0  NThr:4

In [4]: x = np.random.randn(1000)

In [5]: np.dot(x,x)
MKL_VERBOSE DDOT(1000,0x5622fd5b8280,1,0x5622fd5b8280,1) 31.63us CNR:OFF Dyn:1 FastMM:1 TID:0  NThr:4
Out[5]: 941.6605280631609&lt;/PRE&gt;

&lt;P&gt;To be very clear: this is not a case of Intel-MKL counting cores while I am counting threads. The example above (and any other) shows that only 4 of my threads are used during numpy matrix operations, with the other 4 sitting close to idle.&lt;/P&gt;
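&lt;P&gt;For reference, one way to see how logical processors and physical cores relate on a Linux machine is to parse /proc/cpuinfo. This is only an illustrative sketch and was not part of the session above: the helper name count_cpus is hypothetical, and it assumes the file exposes "physical id" and "core id" fields, which is typical on x86 but not guaranteed on every platform.&lt;/P&gt;

&lt;PRE class="brush:python; class-name:dark;"&gt;import os


def count_cpus(cpuinfo_text):
    """Return (logical_processors, physical_cores) parsed from /proc/cpuinfo text."""
    logical = 0
    cores = set()
    physical_id = None
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "processor":
            # Each "processor" entry is one logical CPU (hardware thread).
            logical += 1
        elif key == "physical id":
            physical_id = value
        elif key == "core id":
            # A physical core is identified by its (socket, core) pair.
            cores.add((physical_id, value))
    return logical, len(cores)


if __name__ == "__main__":
    with open("/proc/cpuinfo") as f:
        logical, physical = count_cpus(f.read())
    print("logical processors:", logical)
    print("physical cores:", physical)
    print("os.cpu_count():", os.cpu_count())&lt;/PRE&gt;

&lt;P&gt;If the two counts differ (for example, 8 logical processors on 4 physical cores), the CPU exposes more than one hardware thread per core, and a thread count of 4 would correspond to the number of physical cores rather than the number of hardware threads.&lt;/P&gt;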
&lt;P&gt;I have tried what is suggested in &lt;A href="https://software.intel.com/en-us/forums/intel-distribution-for-python/topic/567669"&gt;this forum post&lt;/A&gt;, i.e. going to Ubuntu's terminal and running:&lt;/P&gt;

&lt;PRE class="brush:bash; class-name:dark;"&gt;set MKL_NUM_THREADS=7
&lt;/PRE&gt;
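
&lt;P&gt;A side note on the command above: "set NAME=value" is the Windows cmd form. In bash, "set" with a plain argument replaces the shell's positional parameters and does not create an environment variable. A minimal sketch of the usual bash form follows, assuming Python is launched from the same shell afterwards. Whether this changes the thread count that MKL reports is not established here.&lt;/P&gt;

&lt;PRE class="brush:bash; class-name:dark;"&gt;# Export the variable so child processes (such as python) inherit it.
export MKL_NUM_THREADS=7
# Confirm it is present in the environment.
printenv MKL_NUM_THREADS&lt;/PRE&gt;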

&lt;P&gt;I also tried:&lt;/P&gt;

&lt;PRE class="brush:python; class-name:dark;"&gt;import mkl
mkl.set_num_threads(7)&lt;/PRE&gt;

&lt;P&gt;But it seems that Intel-MKL is wrongly detecting my max number of threads to be 4:&lt;/P&gt;

&lt;PRE class="brush:python; class-name:dark;"&gt;In [36]: mkl.get_max_threads()
Out[36]: 4
&lt;/PRE&gt;

&lt;P&gt;It should not be this hard to make a multithreaded library use all the threads I have available. Anyhow, your time helping me fix this is much appreciated.&lt;/P&gt;</description>
    <pubDate>Wed, 15 May 2019 03:51:32 GMT</pubDate>
    <dc:creator>noturno</dc:creator>
    <dc:date>2019-05-15T03:51:32Z</dc:date>
    <item>
      <title>How can I make Intel-MKL numpy *really* use all my CPU threads</title>
      <link>https://community.intel.com/t5/Intel-Distribution-for-Python/How-can-I-make-Intel-MKL-numpy-really-use-all-my-CPU-threads/m-p/1146207#M1033</link>
      <pubDate>Wed, 15 May 2019 03:51:32 GMT</pubDate>
      <guid>https://community.intel.com/t5/Intel-Distribution-for-Python/How-can-I-make-Intel-MKL-numpy-really-use-all-my-CPU-threads/m-p/1146207#M1033</guid>
      <dc:creator>noturno</dc:creator>
      <dc:date>2019-05-15T03:51:32Z</dc:date>
    </item>
  </channel>
</rss>

