
Getting started with classical Machine Learning Frameworks using Google Colaboratory


Posted on behalf of Ramya Ravi, AI Software Marketing Engineer, Intel

Data scientists and developers use Machine Learning (ML) frameworks to build and deploy ML models easily and quickly. These frameworks expose simple Application Programming Interfaces (APIs) in languages such as Python and R that hide the complex mathematical computations and numerical analysis needed to train machine learning models. All major machine learning frameworks have been optimized using oneAPI libraries, which deliver top performance across Intel® architectures. These Intel software optimizations provide the best performance gains over stock implementations of the same frameworks. The frameworks can be installed in many ways; this blog introduces an uncomplicated way to install the major machine learning frameworks optimized by Intel on Google Colaboratory (Colab).

Getting Started with Google Colab

Colab lets anyone write and execute arbitrary Python code through the browser. It is a free Jupyter notebook service that requires no setup to use and provides free access to computing resources, including GPUs (Graphics Processing Units). If you have used Jupyter Notebooks before, you can easily adapt to Google Colab. Colab is built on Google Drive, so remember to log in to your Google Drive account, as notebooks are stored in Drive. Open the link to get started with Colab.

Creating New Notebook:

There are two ways to create a new notebook on Google Colab.

Method 1:

  1. Open Colab.
  2. Click New Notebook.

Method 2:

  1. Open Google Drive.
  2. Click New -> More -> Google Colaboratory.

Adjusting Hardware Accelerator:

The default hardware on Google Colab is the CPU (Central Processing Unit). It can be changed to either a GPU or a TPU (Tensor Processing Unit) using any of the methods below; a quick check to verify the change is shown after the methods.

Method 1:

  1. Click Edit -> Notebook settings -> Hardware Accelerator -> GPU or TPU.

Method 2:

  1. Click Runtime -> Change runtime type -> Hardware Accelerator -> GPU or TPU.

Method 3:

  1. Click Connect -> View Resources -> Change runtime type -> Hardware Accelerator -> GPU or TPU.
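
After changing the runtime type, you can quickly verify which accelerator is actually attached. One simple check (not part of the original steps) is to run the following in a notebook cell on a GPU runtime:

!nvidia-smi  # Prints details of the attached NVIDIA GPU; fails on a CPU-only or TPU runtime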

Installing Libraries:

Most of the common Python libraries are pre-installed, but new libraries or packages can be installed in either of two ways:

!pip install [package name]

OR

!apt-get install [package name]

Mounting Google Drive:

The code below can be used for mounting Google Drive.

from google.colab import drive
drive.mount('/content/drive')
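
Once the drive is mounted, its contents appear under /content/drive and can be accessed like any other directory, for example:

!ls "/content/drive/MyDrive"  # Lists the files in the top-level folder of the mounted drive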

Invoking System Commands:

System commands can be run by including ‘!’ before the command.

  • Command to clone a git repository: !git clone [git clone url]
  • Syntax for using directory commands: !ls, !mkdir
  • Command to download from a web URL: !wget [url] -P drive/[Folder Name]

These basics will help you get started with Google Colab efficiently.

Machine Learning Frameworks

Intel® Extension for Scikit-learn*: Scikit-learn is a simple and efficient Python package for predictive data analysis and machine learning. Intel® Extension for Scikit-learn* improves the performance of many scikit-learn algorithms on Intel CPUs and GPUs, attaining the best performance for machine learning algorithms on Intel® architectures, on both single and multiple nodes. The most recent version of Intel® Extension for Scikit-learn* is also included as part of the Intel® AI Analytics Toolkit.

The command to install Intel® Extension for Scikit-learn* on Google Colab is:

!pip install scikit-learn-intelex

Please open the notebook to get started with Intel® Extension for Scikit-learn* code. Before executing the example, install Intel® Extension for Scikit-learn* in the notebook using the command above.
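
As an illustrative sketch of how the extension is typically used (the KMeans example below is hypothetical and not the linked notebook), call patch_sklearn() before importing any scikit-learn estimators so that the optimized implementations are picked up:

from sklearnex import patch_sklearn
patch_sklearn()  # Patch scikit-learn before importing its estimators

from sklearn.cluster import KMeans
import numpy as np

X = np.random.rand(1000, 10)  # Random toy data for illustration
kmeans = KMeans(n_clusters=5, random_state=0).fit(X)
print(kmeans.inertia_)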

Intel® Distribution of Modin*: Modin* is a drop-in replacement for pandas that enables data scientists to focus on data analysis without having to change their API code. This distribution adds optimizations that accelerate processing on Intel hardware. The most recent version of Intel® Distribution of Modin* is also included as part of the Intel® AI Analytics Toolkit.

Intel® Distribution of Modin* can be installed with pip on Google Colab:

!pip install modin[all] #Install Modin with all Modin's currently supported engines (Recommended).

To install Modin with a specific engine, use one of the following commands:

!pip install modin[ray] #Install Modin dependencies and Ray.

!pip install modin[dask] #Install Modin dependencies and Dask.

This link provides a beginner's code sample for Intel® Distribution of Modin*. First install Intel® Distribution of Modin* in the Colab notebook using one of the commands above, and then execute the code.
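
In typical use, the only change to existing pandas code is the import line; the rest of the API stays the same. A minimal illustrative sketch (not the linked sample):

import modin.pandas as pd  # Drop-in replacement for the usual pandas import

df = pd.DataFrame({"a": range(10), "b": range(10)})  # Toy dataframe for illustration
print(df.describe())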

XGBoost Optimized by Intel: XGBoost (Extreme Gradient Boosting) is an open-source machine learning library that implements scalable, distributed gradient-boosted decision trees. It is mainly used for regression, classification, and ranking problems. Intel has optimized XGBoost to boost performance for model training and to increase accuracy through better predictions. The most recent version of XGBoost Optimized by Intel is also included as part of the Intel® AI Analytics Toolkit.

XGBoost Optimized by Intel can be installed on Google Colab with pip:

!pip install xgboost==1.4.2

The link shows an introductory code sample for XGBoost Optimized by Intel. Before running the code in the notebook, install XGBoost Optimized by Intel using the command above.
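
As a minimal illustrative sketch (a toy classifier on a built-in scikit-learn dataset, not the linked sample), XGBoost can be used through its scikit-learn-style API:

import xgboost as xgb
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

# Toy binary classification problem for illustration
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = xgb.XGBClassifier(n_estimators=100, max_depth=4, eval_metric="logloss")
model.fit(X_train, y_train)
print("Test accuracy:", model.score(X_test, y_test))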

What’s next?

We encourage you to learn more about Intel’s other AI/ML framework optimizations and end-to-end portfolio of tools and to incorporate them into your AI workflow. Also, visit the AI & ML page to learn about Intel’s AI software development resources for preparing, building, deploying, and scaling AI solutions.

Useful resources

We would like to thank Urszula Zofia Guminska for reviewing.

About the Author
Intel AI Publishing