Facing slower performance with intel-numpy

Nagarajan__Sowmiya · 08-31-2018

I wanted to test how much faster an exponential moving average (EMA) calculation would be with intel-numpy on the Intel Distribution for Python (IDP). But my code runs slower on IDP than on stock Python.

Setup:

Intel Distribution for Python 3.6.3
Intel(R) Xeon(R) CPU E5-2676 v3 @ 2.40GHz
GCC 4.8.2 20140120 - Red Hat 4.8.2-15

My code:

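import time
import numpy as np

def ema():
    values = np.random.randint(2, 9, 5000)  # 5000 random integers in [2, 9)
    window = 20
    start = time.time()
    # Exponentially increasing weights, normalized to sum to 1
    weights = np.exp(np.linspace(-1., 0., window))
    weights /= weights.sum()
    # Weighted moving average via convolution, trimmed to the input length
    a = np.convolve(values, weights, mode='full')[:len(values)]
    a[:window] = a[window]  # overwrite the partially filled leading entries
    end = time.time()
    print(end - start)
    return a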

The numbers I got:

Python - 0.00036263465881347656 s
Intel Python Distribution - 0.005644321441650391 s

I don't think I've made a mistake in the installation, as I get the expected speedup with the sample code at the link below:

https://software.intel.com/en-us/articles/numpyscipy-with-intel-mkl

What could be the reason? What's the best way to use the Intel distribution to optimize EMA calculations?

Sergey_M_Intel2 · 09-04-2018

Hello,

It is hard to draw conclusions when execution times are on the order of milliseconds. At that scale the difference is typically just measurement noise. I advise increasing the problem size or the number of repetitions to get conclusive results.
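A minimal sketch of such a measurement, assuming the convolution-based calculation from the original post. The names `ema` and `benchmark`, the problem size, and the repetition counts are illustrative choices rather than anything prescribed in this thread. The untimed warm-up call is there because a first call may include one-time start-up costs that would otherwise end up in the measurement.

```python
import timeit

import numpy as np


def ema(values, window=20):
    """Weighted moving average with exponential weights (as in the original post)."""
    weights = np.exp(np.linspace(-1.0, 0.0, window))
    weights /= weights.sum()
    a = np.convolve(values, weights, mode="full")[:len(values)]
    a[:window] = a[window]
    return a


def benchmark(n=1_000_000, repeats=5, number=10):
    """Return the best observed time per call, in seconds (sizes are assumptions)."""
    np.random.seed(0)  # same input in every environment being compared
    values = np.random.randint(2, 9, n)
    ema(values)  # untimed warm-up call
    timings = timeit.repeat(lambda: ema(values), repeat=repeats, number=number)
    # The minimum is the run least disturbed by other activity on the machine
    return min(timings) / number


if __name__ == "__main__":
    print(f"best time per call: {benchmark():.6f} s")
```

Running the same script under both interpreters gives numbers that are easier to compare than a single `time.time()` difference around one call.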

Thank you,

Sergey Maidanov

Geary__Robert · 09-06-2018

Along the same theme, I just installed Intel's Python Distribution on my i9 7980XE system running Windows 10. For my first performance test, I ran

import numpy as np
A = np.random.rand(30000,30000)
B = np.dot(A, A)

and was disappointed to see my CPU utilization nowhere near 100%, which is what I get when I run the same code with Python 3.7 and pip-installed numpy. Under full CPU load, the above code takes just over one minute with the pip-installed setup, but it ran for more than five minutes with Intel's Python before I lost patience and killed it.
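For anyone who wants to reproduce this comparison without waiting minutes per run, here is a rough sketch that times `np.dot` on a smaller matrix and reports the build configuration. The helper name `time_matmul` and the default size of 2000 are assumptions made for illustration. Note that a 30000 x 30000 float64 array is about 7.2 GB, so the original test also needs enough RAM for both the input and the result.

```python
import time

import numpy as np


def time_matmul(n=2000, repeats=3):
    """Time an n x n matrix product; return (best seconds, approximate GFLOP/s)."""
    a = np.random.rand(n, n)
    np.dot(a, a)  # untimed warm-up call
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        np.dot(a, a)
        best = min(best, time.perf_counter() - start)
    # A dense n x n product costs roughly 2 * n**3 floating-point operations
    gflops = 2.0 * n**3 / best / 1e9
    return best, gflops


if __name__ == "__main__":
    np.show_config()  # prints the BLAS/LAPACK libraries this numpy build uses
    best, gflops = time_matmul()
    print(f"best: {best:.3f} s, about {gflops:.1f} GFLOP/s")
```

Comparing the GFLOP/s figure and the `np.show_config()` output between the two installations should show whether they are using different BLAS backends.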

Wang__Haining · 01-23-2019

I hope Intel notices this issue. I am also using a 7980XE, and fft2 performance is only about half that of Anaconda's stock numpy.

Stock numpy takes 560 ms to finish the test:

import numpy as np
a= np.random.random([8000,8000])
%timeit b=np.fft.fft2(a)

Intel Python 2019 takes 980 ms:

import numpy as np
import mkl_fft as intel
a=np.random.random([8000,8000])
%timeit b=intel.fft2(a)

CPU usage on the 7980XE is only around 30% when using MKL, whereas stock numpy can push it to 50%. Please fix this issue. I am giving up on Intel Python for now.
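The two tests above can be combined into one script that runs outside IPython, sketched below. The helper names `best_time` and `compare_fft2` and the default size of 1024 are illustrative assumptions. `mkl_fft` is imported only if it is available, so the same file runs in both environments.

```python
import timeit

import numpy as np

try:
    import mkl_fft  # only present in environments that ship this package
except ImportError:
    mkl_fft = None


def best_time(func, data, repeats=3, number=1):
    """Best observed time per call of func(data), in seconds."""
    func(data)  # untimed warm-up call
    timings = timeit.repeat(lambda: func(data), repeat=repeats, number=number)
    return min(timings) / number


def compare_fft2(n=1024):
    """Time 2-D FFTs of an n x n random array with each available backend."""
    a = np.random.random((n, n))
    results = {"numpy.fft.fft2": best_time(np.fft.fft2, a)}
    if mkl_fft is not None:
        results["mkl_fft.fft2"] = best_time(mkl_fft.fft2, a)
    return results


if __name__ == "__main__":
    for name, seconds in compare_fft2(n=8000).items():
        print(f"{name}: {seconds * 1e3:.1f} ms")
```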