No Acceleration with IDP

kouatchou__jules · ‎04-30-2018

Hi,

I recently received the Intel Distribution for Python (IDP) from Intel and installed it on an Intel-based cluster. I used it to see how much it accelerates the test cases (which I wrote) presented in:

https://modelingguru.nasa.gov/docs/DOC-2676

and did not see any gain over the Python shipped with Anaconda or over IDP installed from Anaconda. I am really disappointed, since I was expecting accelerations with IDP.

Could you let me know if I need to do something (for instance, follow specific installation procedures) in order to obtain better results with IDP?

Thank you in advance for your assistance.

Regards,

Jules Kouatchou

Preethi_V_Intel · ‎04-30-2018

Hi Jules,

IDP uses Intel MKL optimizations to accelerate numpy and scipy libraries. Do the test cases utilize either of these libraries? If yes, can you attach a sample code?

Thanks

Preethi

kouatchou__jules · ‎04-30-2018

Preethi,

Thank you for responding to my request.

To simplify the process, I am interested in the two test cases presented in:

https://software.intel.com/en-us/forums/intel-distribution-for-python/topic/777328

When I run them, IDP (that I obtained from Intel), Anaconda and IDP from Anaconda give the same elapsed times.

I also used the Gauss Legendre quadrature (see code below) and did not see any difference.

Thank you for your assistance.

Regards,

Jules

#-----------------------------------------------------------------

import sys

import numpy as np

# Integrand
f = lambda x: np.exp(x)

# Number of quadrature points, taken from the command line
order = int(sys.argv[1])
a = -3.0
b = 3.0

# Gauss-Legendre (default interval is [-1, 1])
x, w = np.polynomial.legendre.leggauss(order)
# Translate x values from the interval [-1, 1] to [a, b]
t = 0.5*(x + 1)*(b - a) + a
# Weighted sum of the integrand, rescaled to the interval [a, b]
gauss = np.sum(w * f(t)) * 0.5*(b - a)
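
As a sanity check on the quadrature itself (separate from any timing question), the result can be compared with the closed form of the integral of exp(x) over [-3, 3], which is exp(3) - exp(-3). The sketch below wraps the same steps in a function; the name `gauss_legendre` and the choice of 20 points are illustrative assumptions, not part of the original script.

```python
import math

import numpy as np


def gauss_legendre(f, a, b, order):
    """Approximate the integral of f over [a, b] with an order-point rule."""
    # Nodes and weights on the default interval [-1, 1]
    x, w = np.polynomial.legendre.leggauss(order)
    # Map the nodes from [-1, 1] to [a, b]
    t = 0.5 * (x + 1) * (b - a) + a
    # Weighted sum, rescaled by half the interval length
    return np.sum(w * f(t)) * 0.5 * (b - a)


# Closed form of the integral of exp(x) from -3 to 3
exact = math.exp(3.0) - math.exp(-3.0)
approx = gauss_legendre(np.exp, -3.0, 3.0, order=20)
print("absolute error:", abs(approx - exact))
```

A check like this only confirms that the numbers are right; it says nothing about which distribution runs faster.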

kouatchou__jules · ‎04-30-2018

Preethi,

I am sorry that I provided the wrong link in my previous message. Here is the right one:

https://www.infoworld.com/article/3187484/software/how-does-a-20x-speed-up-in-python-grab-you.html

Regards,

Jules

Oleksandr_P_Intel · ‎05-02-2018

Hi Jules,

Intel is striving to enable as many Python developers/users as possible to utilize Intel hardware to its fullest.

Intel (R) Distribution for Python* was created to make fast delivery of these optimizations to the community possible, but the ultimate goal was to accomplish even wider adoption through upstreaming and partnership with Python distributors.

Anaconda recently adopted our patches, see https://github.com/AnacondaRecipes/numpy-feedstock/tree/master/recipe, and thus the performance of NumPy-based Python code run in Intel Distribution for Python* and in default Anaconda is comparable.

Consider three conda environments:

conda create -n idp -c intel ipython numpy scipy python=3 --yes
conda create -n anac5 ipython numpy scipy python=3 --yes
conda create -n anac5-nomkl ipython nomkl numpy scipy python=3 --yes
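
To see which kind of NumPy build a given environment actually carries, one option is to inspect the build information that `np.show_config()` prints. The sketch below is a rough heuristic under the assumption that an MKL-linked build mentions "mkl" somewhere in that output; the exact format of the output varies between NumPy versions, and the helper names `numpy_config_text` and `mentions_mkl` are made up for illustration.

```python
import contextlib
import io

import numpy as np


def numpy_config_text():
    """Capture the text that np.show_config() prints to stdout."""
    buf = io.StringIO()
    with contextlib.redirect_stdout(buf):
        np.show_config()
    return buf.getvalue()


def mentions_mkl(config_text):
    """Heuristic: True if the build information mentions MKL anywhere."""
    return "mkl" in config_text.lower()


print("NumPy version:", np.__version__)
print("MKL mentioned in build config:", mentions_mkl(numpy_config_text()))
```

Running this inside each activated environment gives a quick indication of whether the timings being compared come from the same kind of build.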

I use the following snippet for performance comparison:

import numpy as np
import datetime as dt
import sys

dim = 2000
x = np.random.randn(dim, dim) + 1j * np.random.randn(dim, dim)

if len(sys.argv) < 2:
        print('Usage:')
        print('     ./fft.py N')
        print('Please specify the number of iterations.')
        sys.exit()

N = int(sys.argv[1])

begTime = dt.datetime.now()
for __ in range(N):
        y = np.fft.fft2(x)
endTime = dt.datetime.now()

diffTime = endTime - begTime
print('Time for 2D FFT calculations (',N,'):', diffTime.total_seconds(),'s')

With the following results:

(anac5) [20:01:02 skl-ubuntu perfQ]$ python fft.py 100
Time for 2D FFT calculations ( 100 ): 0.558279 s

(anac5) [20:01:05 skl-ubuntu perfQ]$ . activate idp
(idp) [07:11:36 skl-ubuntu perfQ]$ python fft.py 100
Time for 2D FFT calculations ( 100 ): 0.56407 s
(idp) [07:11:39 skl-ubuntu perfQ]$ python fft.py 100
Time for 2D FFT calculations ( 100 ): 0.482773 s

(idp) [07:11:48 skl-ubuntu perfQ]$ . activate anac5-nomkl
(anac5-nomkl) [07:11:58 skl-ubuntu perfQ]$ python fft.py 100
Time for 2D FFT calculations ( 100 ): 21.026044 s
(anac5-nomkl) [07:12:22 skl-ubuntu perfQ]$ . activate bare

(bare) [07:12:41 skl-ubuntu perfQ]$ python fft.py 100
Time for 2D FFT calculations ( 100 ): 21.188223 s

Here the environment bare is Anaconda's CPython interpreter with pip-installed numpy and scipy. As you can see, the nomkl build of NumPy by Anaconda performs on par with NumPy distributed through PyPI, while the MKL-optimized NumPy in Anaconda performs on par with IDP.
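
Note that the two consecutive idp runs above differ by about 0.08 s, so single runs are somewhat noisy. A minimal sketch of a repeat-and-take-the-minimum timing helper follows, using `time.perf_counter` from the standard library; `best_of` and `workload` are hypothetical names, and `workload` is only a stand-in to be replaced by the code being measured (for example the FFT loop).

```python
import time


def best_of(func, repeats=5):
    """Return the smallest wall-clock time in seconds over `repeats` calls."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        timings.append(time.perf_counter() - start)
    return min(timings)


def workload():
    # Hypothetical stand-in; replace with the computation to be timed
    sum(i * i for i in range(100_000))


print(f"best of 5 runs: {best_of(workload):.6f} s")
```

Taking the minimum over several repeats reduces the influence of one-off effects such as cold caches or other activity on the machine.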

Sincerely,
Oleksandr

kouatchou__jules · ‎05-02-2018

Oleksandr,

Thank you for your response. You clearly answered my question. I now understand that IDP, Anaconda and IDP derived from Anaconda should display comparable performance.

Regards,

Jules

abarb · ‎05-31-2018

After the latest update I can't run the TBB module or this simple test:

import time
import dask.array as da

t0 = time.time()

x = da.random.random((10000, 10000), chunks=(4096, 4096))
x.dot(x.T).sum().compute()

print(time.time() - t0)

https://software.intel.com/pt-br/node/779746

Edited: Solved by Todd (Intel), thanks