Intel® Distribution for Python*

Facing slower performance with intel-numpy


I wanted to test how much faster an exponential moving average (EMA) calculation would be with intel-numpy on the Intel Distribution for Python. However, my code runs slower on the Intel distribution than on stock Python.

Setup:

  • Intel Distribution for Python 3.6.3
  • Intel(R) Xeon(R) CPU E5-2676 v3 @ 2.40GHz
  • GCC 4.8.2 20140120 - Red Hat 4.8.2-15

My code: 

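import time
import numpy as np

def ema():
    values = np.random.randint(2, 9, 5000)
    window = 20
    start = time.time()
    # Exponentially increasing weights, normalized to sum to 1
    weights = np.exp(np.linspace(-1., 0., window))
    weights /= weights.sum()
    # Keep only the first len(values) points of the full convolution
    a = np.convolve(values, weights, mode='full')[:len(values)]
    # Overwrite the warm-up region, where the window is not yet full
    a[:window] = a[window]
    end = time.time()
    print(end - start)  # elapsed time in seconds
    return a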

The numbers I got (in seconds):

  • Python - 0.00036263465881347656
  • Intel Python Distribution - 0.005644321441650391

I don't think I made a mistake in the installation, as I get the expected speedup with the sample code in the link below:

What could be the reason? What is the recommended way to use the Intel distribution to optimize EMA calculations?



It is hard to draw conclusions when execution times are on the order of milliseconds or less. At that scale, the difference is typically just measurement noise. I advise increasing the problem size or the number of repetitions to get conclusive results.
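As a rough illustration of that advice, the sketch below times the same EMA computation with `timeit`, repeating it many times and reporting the best per-call time. The helper name `best_time`, the fixed seed, and the chosen sizes and repetition counts are assumptions made for this example; they are not part of the original post.

```python
import timeit
import numpy as np

def ema(values, window=20):
    # Same computation as in the question, without the inline timing
    weights = np.exp(np.linspace(-1.0, 0.0, window))
    weights /= weights.sum()
    a = np.convolve(values, weights, mode="full")[:len(values)]
    a[:window] = a[window]
    return a

def best_time(n, repeat=5, number=20):
    # Hypothetical helper: best per-call time in seconds for n input points
    values = np.random.RandomState(0).randint(2, 9, n)
    ema(values)  # warm-up call, excluded from the measurement
    timings = timeit.repeat(lambda: ema(values), repeat=repeat, number=number)
    return min(timings) / number

if __name__ == "__main__":
    for n in (5000, 500000):
        print(n, best_time(n))
```

Taking the minimum over several repeats reduces the influence of one-off costs such as library initialization, which can dominate a single sub-millisecond run.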

Thank you,

Sergey Maidanov


Along the same theme, I just installed Intel's Python Distribution on my i9 7980XE system running Windows 10.  For my first performance test, I ran

import numpy as np
A = np.random.rand(30000,30000)
B = np.dot(A, A)

and was disappointed to see CPU utilization nowhere near 100%, which is what I see when I run the same code with Python 3.7 and pip-installed numpy. Under full CPU load, the above code takes just over one minute with the latter setup, but more than five minutes with Intel's Python (I lost patience and killed it).
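When comparing two installations like this, it can help to record the threading environment alongside the timing. The following is a minimal sketch, assuming the product is computed with `np.dot`; the function names `blas_environment` and `time_matmul` are hypothetical, and the `mkl` module is only available if the optional mkl-service package is installed.

```python
import os
import time
import numpy as np

def blas_environment():
    # Collect settings that commonly affect how many threads BLAS uses
    info = {
        "numpy_version": np.__version__,
        "logical_cpus": os.cpu_count(),
        "MKL_NUM_THREADS": os.environ.get("MKL_NUM_THREADS"),
        "OMP_NUM_THREADS": os.environ.get("OMP_NUM_THREADS"),
    }
    try:
        import mkl  # optional mkl-service package; may be absent
        info["mkl_max_threads"] = mkl.get_max_threads()
    except (ImportError, AttributeError):
        info["mkl_max_threads"] = None
    return info

def time_matmul(n=2000):
    # Time one n x n matrix product; n is kept small so the run is short
    a = np.random.rand(n, n)
    start = time.perf_counter()
    np.dot(a, a)
    return time.perf_counter() - start

if __name__ == "__main__":
    np.show_config()  # prints the BLAS/LAPACK libraries numpy was built with
    print(blas_environment())
    print(time_matmul())
```

Running the same script under both interpreters gives a like-for-like record of the linked libraries, thread settings, and timing.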



I hope Intel notices this issue. I am also using a 7980XE, and fft2 with Intel Python runs at about 50% of the speed of Anaconda's stock numpy.

Stock numpy takes 560 ms to finish the test:

import numpy as np
a= np.random.random([8000,8000])
%timeit b=np.fft.fft2(a)

Intel Python 2019 takes 980 ms:

import numpy as np
import mkl_fft as intel
%timeit b=intel.fft2(a)

CPU usage on the 7980XE is only around 30% when using MKL, whereas stock numpy can push it to 50%. Please fix this issue. I am giving up on Intel Python for now.
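For readers without IPython's `%timeit`, a comparison along these lines can be sketched as a plain script that times both FFT entry points on the same array. The helper names `time_fft2` and `compare_fft2`, the default array size, and the repetition counts are assumptions for this example; `mkl_fft` is imported only if it is installed.

```python
import timeit
import numpy as np

def time_fft2(fft2, a, repeat=3, number=3):
    # Best per-call time in seconds for a 2-D FFT function on array a
    fft2(a)  # warm-up call, excluded from the measurement
    return min(timeit.repeat(lambda: fft2(a), repeat=repeat, number=number)) / number

def compare_fft2(n=1024):
    # Time each available implementation on the same n x n input
    a = np.random.random([n, n])
    results = {"np.fft.fft2": time_fft2(np.fft.fft2, a)}
    try:
        import mkl_fft  # only present in environments that ship it
        results["mkl_fft.fft2"] = time_fft2(mkl_fft.fft2, a)
    except ImportError:
        pass
    return results

if __name__ == "__main__":
    for name, seconds in compare_fft2().items():
        print(name, seconds)
```

Timing both functions in one process on identical input removes differences in data and interpreter startup from the comparison.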

