Intel Python 2017.0.018 does not auto-offload to the Xeon Phi coprocessor

Andrew_D_3 · 06-16-2016

I have been trying to get Anton's simple benchmark running on the Phi 31xx card. I was told that the latest Intel Python distribution would auto-offload, but it does not.

This user and shell environment can run micperf, and micsmc shows that the Phi is operating. I assume that if micperf runs for a given user and shell environment, there is hope of getting Python running as well. Details about the environment are below. Running with -m TBB does give a speedup because the host processor runs flat out, but micsmc shows zero activity on the Phi card.

Your help in getting this working would be greatly appreciated.

Using l_python27_bu_2017.0.018
adasys@smarties:~$ which python
/home/adasys/intel/intelpython27/bin/python
adasys@smarties:~$ env |sort
<snip> 
LD_LIBRARY_PATH=/home/adasys/compilers_and_libraries_2016.3.210/linux/compiler/lib/intel64_lin:/home/adasys/compilers_and_libraries_2016.3.210/linux/mpirt/lib/intel64_lin
LIBRARY_PATH=/home/adasys/compilers_and_libraries_2016.3.210/linux/compiler/lib/intel64_lin


MIC_ENV_PREFIX=MIC
MIC_LD_LIBRARY_PATH=/home/adasys/compilers_and_libraries_2016.3.210/linux/compiler/lib/intel64_lin_mic
MIC_OMP_NUM_THREADS=120
MKL_MIC_ENABLE=1
NLSPATH=/home/adasys/compilers_and_libraries_2016.3.210/linux/compiler/lib/intel64_lin/locale/%l_%t/%N
OFFLOAD_REPORT=2

PATH=/usr/src/micperf/micp/micp/scripts:/home/adasys/bin:/home/adasys/compilers_and_libraries_2016.3.210/linux/mpirt/bin/intel64_lin:/usr/src/micperf/micp/micp/scripts:/home/adasys/intel/intelpython27/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games

PYTHONPATH=/usr/src/micperf/micp:


<snip>
adasys@smarties:~$ python /home/python/dask/bench.py 
21.0987641811
adasys@smarties:~$ python -m TBB /home/python/dask/bench.py                   
18.8681139946

Anton_M_Intel · 06-16-2016

Hi Andrew,

Thank you for the question. TBB does not do any offloading; it manages software threads on the same CPUs where it is enabled.

Auto-offloading is a separate matter: it is a feature of Intel MKL, which powers NumPy/SciPy and other dependent Python packages.

Please see this article for the functions that support automatic offload in Intel MKL: https://software.intel.com/en-us/articles/intel-mkl-automatic-offload-enabled-functions-for-intel-xeon-phi-coprocessors

And please refer here for how to enable automatic offloading: https://software.intel.com/en-us/node/528599

However, for the Intel Distribution for Python, you also need to install the mkl_mic_rt package (for Beta releases) or the mkl_mic package (for the Gold release) from anaconda.org/intel.

Regards,
Anton
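
As a rough way to check the points above before running a benchmark, here is a minimal sketch in Python. It only inspects the variables that appear in the environment listing earlier in this thread; the EXPECTED table and the helper names check_offload_env and time_dgemm are hypothetical, not part of MKL or the Intel distribution. The time_dgemm helper assumes NumPy is linked against MKL and uses a large square matrix product, on the assumption that MKL keeps small problems on the host.

```python
import os
import time

# Variables taken from the environment listing earlier in this thread.
# The expected values are assumptions for illustration, not an official checklist.
EXPECTED = {
    "MKL_MIC_ENABLE": "1",        # turns on MKL automatic offload
    "OFFLOAD_REPORT": None,       # any value; reports offload activity when set
    "MIC_ENV_PREFIX": None,       # any value
    "MIC_LD_LIBRARY_PATH": None,  # any value
}


def check_offload_env(env=None):
    """Return {name: problem} for variables that are unset or unexpected."""
    env = os.environ if env is None else env
    problems = {}
    for name, wanted in EXPECTED.items():
        value = env.get(name)
        if value is None:
            problems[name] = "not set"
        elif wanted is not None and value != wanted:
            problems[name] = "expected %r, found %r" % (wanted, value)
    return problems


def time_dgemm(n=4096):
    """Time one n x n double-precision matrix product (NumPy imported lazily)."""
    import numpy as np
    a = np.random.rand(n, n)
    b = np.random.rand(n, n)
    start = time.time()
    np.dot(a, b)
    return time.time() - start


if __name__ == "__main__":
    # Print only the variables that look wrong; no output means all checks passed.
    for name, problem in sorted(check_offload_env().items()):
        print("%s: %s" % (name, problem))
```

If the check prints nothing, calling time_dgemm() with OFFLOAD_REPORT set while watching micsmc should show whether any work reaches the card. If it still shows no activity, the missing mkl_mic_rt or mkl_mic package mentioned above is the first thing to rule out.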

Randy_M_1 · 08-24-2017

Hi Anton,

Thank you for the excellent info. My question is about the newer Intel Xeon Phi x200 (KNL) 64-core CPU, not the x100 coprocessor. I am using a KNL x200 DAP workstation with Anaconda Intel Python 3_full installed. What environment variables do I need to set to enable automatic offload from Python 3 scripts?

The environment variables I currently set are:

ACCEPT_INTEL_PYTHON_EULA=yes
MKL_MIC_ENABLE=1
MKL_DYNAMIC=true

Are there any other environment variables that you would recommend setting? Should I set OMP_NUM_THREADS or MIC_OMP_NUM_THREADS values, or can they be discovered automatically?

Thank you!