CLi37 · ‎11-22-2019

Please provide an introduction on how to configure DevCloud to run TensorFlow 2.0 as fast as possible.

My troubles today:

1) pip install tensorflow in Jyphon Notebook not work.

2) confused to use qsub and conda.

3) no speedup show up in my python/tensorflow testing

4) can FPGA run tensorflow?

5) no GPU available

6) need one-page shell introduction to test oneAPI for AI

Thanks

ArunJ_Intel · ‎11-25-2019

Hey Chang,

Thanks for reaching out to us. Please find below our responses to your queries:

1) pip install tensorflow in Jyphon Notebook not work.

We assume you mean Jupyter Notebook. In DevCloud, to install TensorFlow into your base environment from a Jupyter notebook, use the command below:

!pip install tensorflow --user

Also restart your kernel once the installation is complete so that the newly installed package is picked up.
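
After the restart, a quick check like the one below can confirm that the package is visible to the kernel. This is a minimal sketch and not part of the original answer: the helper name is_installed is hypothetical, and it only checks that Python can locate the module, not that the import succeeds.

```python
import importlib.util
import site


def is_installed(module_name: str) -> bool:
    """Return True if Python can locate a top-level module with this name."""
    return importlib.util.find_spec(module_name) is not None


# Directory where `pip install --user` places packages for this interpreter.
print("user site-packages:", site.getusersitepackages())
print("tensorflow found:", is_installed("tensorflow"))
```

If the second line prints False after a restart, the notebook kernel is probably using a different interpreter than the one pip installed into.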

2) confused to use qsub and conda.

qsub and conda are two command-line tools with entirely different purposes.

qsub is a Unix command for submitting jobs to a job scheduler, usually in cluster or grid computing. In DevCloud, when you connect via SSH you land on a login (landing) node, which is not meant for memory- or compute-intensive tasks. For such tasks you need a compute node, which has high-performance Xeon processors. qsub is used to submit jobs to the compute nodes in DevCloud. You can find more details about submitting jobs via qsub at the link below:
https://devcloud.intel.com/datacenter/learn/advanced-queue/about-the-job-queue

Usage example:
To get an interactive session on a compute node from the login/landing node, use the command below:

qsub -I

Conda is an open-source package and environment management system for installing multiple versions of software packages and their dependencies and switching easily between them. See the link below to learn more about working with conda environments:

https://docs.conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html

Usage example:
To create a conda environment and install TensorFlow, use the commands below:

conda create --name envname
conda activate envname
conda install tensorflow
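
To confirm from Python which environment a script or notebook is actually using, something like the following can help. It is only a sketch: the helper name active_conda_env is hypothetical, and it relies on the CONDA_DEFAULT_ENV variable that conda sets when an environment is activated.

```python
import os
import sys


def active_conda_env(environ=None) -> str:
    """Return the active conda environment name, or "" if none is active."""
    environ = os.environ if environ is None else environ
    # conda sets CONDA_DEFAULT_ENV when an environment is activated.
    return environ.get("CONDA_DEFAULT_ENV", "")


print("conda env:", active_conda_env() or "(none)")
print("interpreter:", sys.executable)
```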

3) no speedup show up in my python/tensorflow testing

Could you clarify what you are comparing in order to check for a speedup? If you can share the scripts/code you are executing, we can look into the performance issue in detail.
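
For a like-for-like comparison, it helps to time the same workload the same way on each machine. The snippet below is a rough sketch, not a benchmark methodology from this thread: median_seconds and toy_workload are hypothetical names, and the toy loop should be replaced with one training step or epoch of the real model.

```python
import statistics
import time


def median_seconds(workload, repeats: int = 5) -> float:
    """Run workload() several times and return the median wall-clock time."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        workload()
        timings.append(time.perf_counter() - start)
    return statistics.median(timings)


def toy_workload():
    # Placeholder: replace with one training step or one epoch of your model.
    sum(i * i for i in range(200_000))


print(f"median: {median_seconds(toy_workload):.4f} s")
```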

4) can FPGA run tensorflow

Are you trying to run TensorFlow on an FPGA in DevCloud?
Could you please elaborate on this?

5) no GPU available

DevCloud does not have dedicated GPUs, but some nodes have an integrated GPU (iGPU). From the login node, use the command below to access the nodes with an iGPU:

qsub -X -I -l nodes=1:gpu:ppn=2

6) need one-page shell introduction to test oneAPI for AI

The oneAPI toolkits are installed in the /opt/intel/inteloneapi/ folder. To set up the environment, source the setvars script and then activate one of the three environments available with the AI toolkit (root, tensorflow, pytorch). Then carry on with your TensorFlow, PyTorch, or Python workloads.

eg:

source /opt/intel/inteloneapi/setvars.sh
source activate tensorflow

ArunJ_Intel · ‎11-27-2019

Hey Chang,

Could you let us know whether the solution provided helped?

Arun

CLi37 · ‎11-27-2019

My replies follow each numbered item below.

1) pip install tensorflow in Jyphon Notebook not work.

In "!pip install tensorflow --user", is --user my own user name or just the word "user"?

2) confused to use qsub and conda.

After qsub -I, when I run "python my.py", does it run on my host (the landing node) or on the assigned node? Do the host and console become the assigned node?

I saw somewhere that conda is going to be deprecated. Is that true?

3) no speedup show up in my python/tensorflow testing

I mean a 10x+ speedup. When I run my code on an Nvidia GPU-enabled cloud, it is more than 10x faster than on my i7 laptop.
On the Intel cloud I did not see this happen on the Xeon CPU. I am doing machine learning training, so speed is critical. I expected the Intel cloud to speed up ML training with AI accelerators, like Google's TPU. So far I have not seen this.

4) can FPGA run tensorflow

Yes, I am trying to run TensorFlow on an FPGA in DevCloud, and I hope it can give 10x more performance for ML training than a Xeon CPU. Otherwise it does not make sense for my current applications.

5) no GPU available

I tried "qsub -X -I -l nodes=1:gpu:ppn=2" but did not see a GPU available in TensorFlow. Is there no way to try it?

6) need one-page shell introduction to test oneAPI for AI

To be honest, I still do not know what oneAPI really is. In your video and introduction, when you talked about oneAPI you also talked about qsub, toolkits, and something else. Are these related to oneAPI?
Is it a single API for calling everything? That is what I assumed. Or is it just a marketing term?

I am just sharing my experience of using DevCloud; this is not a complaint.

Best

ArunJ_Intel · ‎11-27-2019

Hey Chang,

Please find below our responses.

1) pip install tensorflow in Jyphon Notebook not work.

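Question: Is --user my own user name or just the word "user"?
Response: It is just the literal flag --user; do not substitute your user name. It tells pip to install into your per-user site-packages directory.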

2) confused to use qsub and conda.

Question: After qsub -I, when I run "python my.py", does it run on my host (the landing node) or on the assigned node? Do the host and console become the assigned node?
Response: With qsub -I you reach the compute node (the assigned node). And yes, your console becomes the assigned node.
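
One way to check this from inside a script is to look at the host name and the scheduler's environment variables. The sketch below assumes the DevCloud scheduler sets the standard PBS/Torque variables PBS_JOBID and PBS_ENVIRONMENT inside a job, which this thread does not confirm; describe_session is a hypothetical helper name.

```python
import os
import socket


def describe_session(environ=None) -> str:
    """Report whether this process appears to run inside a PBS job."""
    environ = os.environ if environ is None else environ
    job_id = environ.get("PBS_JOBID")
    if job_id is None:
        return "no PBS job detected (probably the login node)"
    # PBS_ENVIRONMENT distinguishes interactive from batch jobs when it is set.
    mode = environ.get("PBS_ENVIRONMENT", "unknown")
    return f"inside PBS job {job_id} ({mode})"


print(socket.gethostname(), "->", describe_session())
```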

Question: I saw somewhere that conda is going to be deprecated. Is that true?
Response: Conda is very much in use, and there is no current news of it being deprecated. The deprecation warning you saw relates to the conda.compat module and can be resolved by updating conda. You can read more at the link below.
https://superuser.com/questions/1422008/conda-install-packagename-gives-deprecation-warning

3) no speedup show up in my python/tensorflow testing

Question: I mean a 10x+ speedup. When I run my code on an Nvidia GPU-enabled cloud, it is more than 10x faster than on my i7 laptop.
On the Intel cloud I did not see this happen on the Xeon CPU. I am doing machine learning training, so speed is critical. I expected the Intel cloud to speed up ML training with AI accelerators, like Google's TPU. So far I have not seen this.

Response: When running machine learning and deep learning workloads on a CPU, there are several ways to improve speed/performance. Some options for speeding up training on Xeon CPUs are:
(i) Use Intel-optimized packages, e.g. Intel Python and Intel TensorFlow. These packages have been optimized to provide the best performance/speed on CPUs.
(ii) Tweak the OMP/KMP parameters to improve performance. You can find a set of recommended options in the articles below:

https://software.intel.com/en-us/articles/tips-to-improve-performance-for-popular-deep-learning-frameworks-on-multi-core-cpus
https://software.intel.com/en-us/articles/maximize-tensorflow-performance-on-cpu-considerations-and-recommendations-for-inference
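
As a rough illustration of option (ii), the OpenMP settings can be applied from Python before TensorFlow is imported. The values below are assumed starting points, not recommendations from this thread, and should be tuned per workload; cpu_threading_env is a hypothetical helper, and the core count of 4 is a placeholder for the node's real number of physical cores.

```python
import os


def cpu_threading_env(physical_cores: int) -> dict:
    """Build a starting set of OpenMP settings for CPU-bound TensorFlow runs."""
    if physical_cores < 1:
        raise ValueError("physical_cores must be at least 1")
    return {
        "OMP_NUM_THREADS": str(physical_cores),  # one thread per physical core
        "KMP_BLOCKTIME": "1",  # ms a thread waits before going to sleep
        "KMP_AFFINITY": "granularity=fine,compact,1,0",  # pin threads to cores
    }


# Apply before importing TensorFlow so the OpenMP runtime sees the values.
os.environ.update(cpu_threading_env(physical_cores=4))  # 4 is an assumed core count
print({k: os.environ[k] for k in ("OMP_NUM_THREADS", "KMP_BLOCKTIME", "KMP_AFFINITY")})
```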

4) can FPGA run tensorflow

Question:YES. And I hope it can have 10x more performance for ML training than Xeon CPU. Otherwise does not make sense for my current applications.

Response: In the datacenter version of DevCloud, FPGAs are not available on all nodes. You should use the FPGA version of DevCloud for this. You can register for the FPGA DevCloud using the link below.

https://software.intel.com/en-us/devcloud/FPGA

For any FPGA-related queries, you can post questions in the forum below.

https://forums.intel.com/s/topic/0TO0P0000001AUUWA2/intel-high-level-design

5) no GPU available

Question: I tried this but did not see a GPU available in TensorFlow. Is there no way to try it?

Response: An iGPU (integrated GPU) is different from a dedicated GPU and is not currently supported by TensorFlow.
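
To see what TensorFlow itself detects on a node, a check along these lines can be run in the job. This is a sketch under assumptions: visible_gpus is a hypothetical helper, and it falls back to the tf.config.experimental location of the call for TensorFlow 2.0.

```python
def visible_gpus():
    """Return the GPUs TensorFlow can see, or None if TensorFlow is missing."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    # TF 2.0 exposes this under tf.config.experimental; later 2.x releases
    # also provide tf.config.list_physical_devices.
    lister = getattr(tf.config, "list_physical_devices", None)
    if lister is None:
        lister = tf.config.experimental.list_physical_devices
    return lister("GPU")


gpus = visible_gpus()
if gpus is None:
    print("TensorFlow is not installed in this environment")
else:
    print(f"TensorFlow sees {len(gpus)} GPU(s): {gpus}")
```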

We will ask our experts/SMEs to respond to your queries 5 and 6 in detail.

CLi37 · ‎11-28-2019

Thanks for your quick answers.

I am wondering: after I install packages with, for example, pip install intel-tensorflow --user,
are they installed on the host node or the compute node? If I then use qsub -I to switch to a compute node,
will all my code run on the compute node with the installed packages?

I do not want to switch between two nodes. Is there any way to enter the compute node directly when I launch SSH?
That is, can I execute qsub -I right after I log in with PuTTY?

ArunJ_Intel · ‎11-28-2019

Hey Chang,

When you install a package using pip or conda, you are installing it into your environment. Your file system and environments are accessible from both the login and compute nodes (because the home folder is NFS-shared between them), so to answer your question: your packages are available on both the login and compute nodes.

You can execute the "qsub -I" command right after you log in with PuTTY and then continue with all your tasks, such as code execution and installation, on the assigned compute node.

Arun

Preethi_V_Intel · ‎12-16-2019

@chang-li

Regarding your Q6 (need one-page shell introduction to test oneAPI for AI):

oneAPI is not an "API" per se, but a product comprising several toolkits that support HPC and AI workloads with minimal installation and initial setup. Additionally, oneAPI is primarily designed to offload CPU workloads to a range of hardware such as GPUs and FPGAs. The components in each oneAPI toolkit are, or will be, written in DPC++, which incorporates SYCL, to support other hardware.

The product page, if you have not seen it already, is here: https://software.intel.com/en-us/oneapi

Louie_T_Intel · ‎12-16-2019

Hi Chang,

Thanks for trying out the oneAPI AI Analytics Toolkit.

The Beta release is CPU-only, so the current oneAPI TensorFlow and PyTorch releases have no GPU support.

Moreover, Intel DevCloud is a shared computing resource intended for user trials; it is not meant for performance measurement.

If you want to tune performance with the Intel oneAPI AI toolkit, please use your own dedicated machines or a public cloud service such as AWS.

For performance on CPU, we can try to help you tune your workload if you share more details about it.

I have also linked an introductory article on testing oneAPI for AI below.

https://software.intel.com/en-us/get-started-with-intel-oneapi-linux-get-started-with-the-intel-ai-analytics-toolkit

Hope this helps.

ArunJ_Intel · ‎07-10-2020

We are closing this case as there have been no updates for a long period. If you have further queries, please feel free to raise a new thread.

How to setup fastest running for Tensorflow?