Numpy + MKL only using 1 core of cpu on Ubuntu 16.04

xiao__yuanzheng · ‎05-05-2018

Hi,

I am using Ubuntu 16.04 on my pc with "Intel® Core™ i7-7700K CPU @ 4.20GHz × 8".

My numpy and scipy builds use only one CPU core when I do element-wise calculations on a numpy ndarray.

Something like:

numpy.power(matrix, 1.5)

I compiled numpy and scipy following the guide at https://software.intel.com/en-us/articles/numpyscipy-with-intel-mkl?page=1

The numpy configuration is as follows.

blas_opt_info:
    include_dirs = ['/opt/intel/compilers_and_libraries_2018/linux/mkl/include']
    library_dirs = ['/opt/intel/compilers_and_libraries_2018/linux/mkl/lib/intel64']
    define_macros = [('SCIPY_MKL_H', None), ('HAVE_CBLAS', None)]
    libraries = ['mkl_rt', 'pthread']

lapack_opt_info:
    include_dirs = ['/opt/intel/compilers_and_libraries_2018/linux/mkl/include']
    library_dirs = ['/opt/intel/compilers_and_libraries_2018/linux/mkl/lib/intel64']
    define_macros = [('SCIPY_MKL_H', None), ('HAVE_CBLAS', None)]
    libraries = ['mkl_rt', 'pthread']

blas_mkl_info:
    include_dirs = ['/opt/intel/compilers_and_libraries_2018/linux/mkl/include']
    library_dirs = ['/opt/intel/compilers_and_libraries_2018/linux/mkl/lib/intel64']
    define_macros = [('SCIPY_MKL_H', None), ('HAVE_CBLAS', None)]
    libraries = ['mkl_rt', 'pthread']

lapack_mkl_info:
    include_dirs = ['/opt/intel/compilers_and_libraries_2018/linux/mkl/include']
    library_dirs = ['/opt/intel/compilers_and_libraries_2018/linux/mkl/lib/intel64']
    define_macros = [('SCIPY_MKL_H', None), ('HAVE_CBLAS', None)]
    libraries = ['mkl_rt', 'pthread']

I tried changing environment variables such as MKL_NUM_THREADS, OMP_NUM_THREADS, MKL_DOMAIN_NUM_THREADS, and MKL_DYNAMIC, but they have no effect in my case.

Thanks for your help.
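
For readers following along, here is a minimal sketch of one common way to set these variables from inside a script. It assumes (this is not stated in the post) that the variables are set before numpy is first imported, since threading runtimes typically read them when they initialize; the value "4" is only an illustrative choice. These variables can only influence routines that actually call into the threaded library.

```python
import os

# Assumption: set the thread-count variables before numpy is imported,
# because the threading runtime usually reads them at initialization.
os.environ["MKL_NUM_THREADS"] = "4"   # illustrative value, not from the post
os.environ["OMP_NUM_THREADS"] = "4"   # illustrative value, not from the post

import numpy as np  # imported after the environment is prepared

matrix = np.random.rand(500, 500)

# A matrix product is handed to the BLAS library numpy was linked against.
product = matrix @ matrix

# An element-wise operation runs through numpy's own loop instead.
powered = np.power(matrix, 1.5)

print(product.shape, powered.shape)
```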

xiao__yuanzheng · ‎05-05-2018

Let me add some more information.

The precompiled Intel Python runs well on my computer and uses multiple cores.

The numpy I built can also use multiple cores for the matrix product, but for element-wise operations it uses only one core (Intel Python uses at least 3).

Any suggestions? Thanks.
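
The difference described above can be checked with a rough sketch like the following, which compares CPU time with wall-clock time for each operation. The helper name `cpu_to_wall_ratio` and the array size are made up for illustration. A ratio well above 1 suggests several cores were busy; a ratio near 1 suggests a single core. Actual numbers depend on the machine and the build.

```python
import time

import numpy as np


def cpu_to_wall_ratio(fn, repeats=3):
    """Return (CPU time) / (wall time) for running fn() `repeats` times.

    time.process_time() sums CPU time over all threads of the process,
    so a ratio well above 1 hints that more than one core was in use.
    """
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    for _ in range(repeats):
        fn()
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    return cpu / wall if wall > 0 else float("nan")


rng = np.random.default_rng(0)
matrix = rng.random((1000, 1000))  # illustrative size

print("matrix product:", cpu_to_wall_ratio(lambda: matrix @ matrix))
print("element-wise  :", cpu_to_wall_ratio(lambda: np.power(matrix, 1.5)))
```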

Ying_H_Intel · ‎05-07-2018

Hi Yuanzheng,

Your observation gives a hint to the answer to your question :)
Your build of numpy and scipy with MKL is based on this configuration:

define_macros = [('SCIPY_MKL_H', None), ('HAVE_CBLAS', None)]

So the build uses only part of MKL, namely the CBLAS interface (http://www.netlib.org/blas/), which provides only linear algebra routines such as the matrix product (gemm) and the matrix-vector product. It does not include element-wise functions, so you will not see MKL acceleration or multi-core execution for such operations in your build.

The Intel Distribution for Python includes more optimizations, among them MKL, DAAL, vector math, and multithreading, so you will see more performance benefits there.

Best Regards
Ying

xiao__yuanzheng · ‎05-09-2018

Hi Ying,

Do you have any suggestions for how to build numpy/scipy with those optimization functions?

Thanks.

Ying_H_Intel · ‎05-09-2018

Hi Yuanzheng,

Here is what I can suggest.

The first, and probably easiest, way is to use the Intel Distribution for Python, which is free and compatible with other conda packages.

The second way is to change the numpy/scipy source code manually, for example by replacing their vector implementations with MKL VML functions, as the Intel Python developers did.

The third way is specific to your project: identify the hot functions you use, then replace the numpy/scipy implementations of those functions with your own manually optimized versions.
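
As one illustration of the third option, a hot element-wise call such as numpy.power could be split across threads by hand. This is only a sketch under stated assumptions, not what Intel Python does internally: the function name `parallel_power` is hypothetical, the input is converted to float64, and it relies on numpy releasing the GIL inside its element-wise loops for numeric dtypes so that the threads can run concurrently.

```python
import os
from concurrent.futures import ThreadPoolExecutor

import numpy as np


def parallel_power(arr, exponent, workers=None):
    """Compute arr ** exponent element-wise, one chunk per worker thread.

    Hypothetical helper for illustration; not part of numpy or MKL.
    """
    workers = workers or os.cpu_count() or 1
    # Work on a flat, contiguous float64 view or copy of the input.
    flat = np.ascontiguousarray(arr, dtype=np.float64).reshape(-1)
    out = np.empty_like(flat)
    # workers + 1 evenly spaced chunk boundaries over the flat array.
    bounds = np.linspace(0, flat.size, workers + 1, dtype=np.intp)

    def work(i):
        lo, hi = bounds[i], bounds[i + 1]
        # Each thread writes into its own slice of the shared output.
        np.power(flat[lo:hi], exponent, out=out[lo:hi])

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() forces completion and re-raises any worker exception.
        list(pool.map(work, range(workers)))
    return out.reshape(np.shape(arr))


matrix = np.random.rand(2000, 500)
result = parallel_power(matrix, 1.5)
print(np.allclose(result, np.power(matrix, 1.5)))
```

Whether this is faster than a plain numpy.power call depends on the array size and on memory bandwidth, so it is worth timing on the real workload before adopting it.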

Best Regards,
Ying