How can I make Intel-MKL numpy *really* use all my CPU threads?
I just installed Intel-MKL numpy on a mostly new machine. Everything works well, with one exception: it keeps using only 4 CPU threads when my computer has 8.
Before anyone asks, I use Ubuntu 18.04, and when I run the following in the terminal:
cat /proc/cpuinfo | grep processor | wc -l
I do get "8" as the output, so yes, I do have 8 threads. Yet I can see that only 4 of them are being used, whether I check System Monitor, use "top" in the terminal, or run something like:
In : import os
In : os.environ['MKL_VERBOSE']="1"
In : import numpy as np
Numpy + Intel(R) MKL: THREADING LAYER: (null)
Numpy + Intel(R) MKL: setting Intel(R) MKL to use INTEL OpenMP runtime
Numpy + Intel(R) MKL: preloading libiomp5.so runtime
MKL_VERBOSE Intel(R) MKL 2019.0 Product build 20180829 for Intel(R) 64 architecture Intel(R) Advanced Vector Extensions 2 (Intel(R) AVX2) enabled processors, Lnx 2.60GHz lp64 intel_thread
MKL_VERBOSE SDOT(2,0x5622fcf3f9c0,1,0x5622fcf3f9c0,1) 2.06ms CNR:OFF Dyn:1 FastMM:1 TID:0 NThr:4
In : x = np.random.randn(1000)
In : np.dot(x,x)
MKL_VERBOSE DDOT(1000,0x5622fd5b8280,1,0x5622fd5b8280,1) 31.63us CNR:OFF Dyn:1 FastMM:1 TID:0 NThr:4
Let me be very clear: this is not a case of Intel-MKL detecting cores while I am talking about threads. The example above (and any other) shows that only 4 of my threads are used during numpy matrix operations, with the other 4 sitting close to idle.
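To rule out a cores-versus-hardware-threads mix-up, here is a minimal sketch that reports both counts from /proc/cpuinfo. The helper name count_cpus is made up for illustration, and it assumes the x86 layout in which each "processor" entry also lists "physical id" and "core id" (some VMs and non-x86 systems omit these, in which case the physical count comes out as 0).

```python
import os


def count_cpus(cpuinfo_text):
    """Return (logical, physical) CPU counts parsed from /proc/cpuinfo text.

    Hypothetical helper: assumes each processor entry lists "physical id"
    before "core id", as on typical x86 Linux systems.
    """
    logical = 0
    cores = set()          # unique (socket, core) pairs
    physical_id = None
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "processor":
            logical += 1                  # one entry per hardware thread
        elif key == "physical id":
            physical_id = value           # socket the thread belongs to
        elif key == "core id":
            cores.add((physical_id, value))
    return logical, len(cores)


if os.path.exists("/proc/cpuinfo"):
    with open("/proc/cpuinfo") as f:
        logical, physical = count_cpus(f.read())
    print(f"logical CPUs: {logical}, physical cores: {physical}")
```

If the second number matches the NThr value in the MKL_VERBOSE lines, that would at least show where the 4 comes from.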
I have tried what is suggested in this forum post, i.e. going to Ubuntu's terminal and running:
I also tried:
But it seems that Intel-MKL is wrongly detecting my max number of threads to be 4:
In : mkl.get_max_threads()
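For reference, a minimal sketch of requesting a thread count from inside Python rather than from the shell. MKL_NUM_THREADS, OMP_NUM_THREADS and MKL_DYNAMIC are documented MKL/OpenMP environment variables; the helper name mkl_thread_env is made up for illustration, and the sketch assumes the variables are set before numpy is imported for the first time in the process. Whether this actually changes the NThr value reported by MKL_VERBOSE is something to verify, not something this snippet guarantees.

```python
import os


def mkl_thread_env(num_threads):
    """Build environment settings that request a fixed MKL thread count.

    Hypothetical helper; it only prepares the variables and does not
    check that MKL honours them.
    """
    n = str(int(num_threads))
    return {
        "MKL_NUM_THREADS": n,    # thread count requested from MKL
        "OMP_NUM_THREADS": n,    # same request for the OpenMP runtime
        "MKL_DYNAMIC": "FALSE",  # ask MKL not to lower the count on its own
    }


# Assumption: this runs before numpy (and therefore MKL) is first imported.
os.environ.update(mkl_thread_env(os.cpu_count() or 1))
# import numpy as np  # import only after the variables above are set
```

The "Dyn:1" field in the verbose output above suggests dynamic adjustment is enabled, which is why MKL_DYNAMIC is included here.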
It should not be this hard to get multithreaded libraries to use all the threads I have available. Anyhow, your time helping me fix this is much appreciated.