Accelerate Deep Learning with Intel® Optimization for TensorFlow*

Jack_Erickson · 08-11-2022

TensorFlow is an end-to-end open-source machine learning platform. Intel and Google* have been collaborating to deliver optimized implementations of some of its most compute-intensive operations, such as convolution filters, which require large matrix multiplications. Intel® oneAPI Deep Neural Network Library (oneDNN) is an open-source, cross-platform library that implements deep learning building blocks behind a single API for CPUs, GPUs, or both.

In this session:

Penporn Koanantakool of Google explains some of the optimizations that have been implemented, such as operation fusion, primitive caching, and vectorization of int8 and bfloat16 data types.
A live demo of Intel® Neural Compressor automatically quantizing a network to improve performance by 4x with a 0.06% accuracy loss.
An overview of the PluggableDevice mechanism in TensorFlow, co-architected by Intel and Google to deliver a scalable way for developers to add new device support as plug-in packages.
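
To make the idea of int8 quantization mentioned above more concrete, here is a minimal sketch of symmetric per-tensor quantization in plain Python. It is an illustration only and is not the algorithm that Intel® Neural Compressor uses; the function names `quantize_int8` and `dequantize_int8` and the sample weights are hypothetical.

```python
def quantize_int8(values):
    """Map a non-empty list of floats to int8 codes in [-127, 127] plus a scale."""
    max_abs = max(abs(v) for v in values)
    # Guard against an all-zero input, where any scale would work.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest code and clamp to the symmetric int8 range.
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale


def dequantize_int8(codes, scale):
    """Recover approximate float values from int8 codes."""
    return [c * scale for c in codes]


# Hypothetical weights, chosen only for illustration.
weights = [0.5, -1.27, 0.003, 1.0]
codes, scale = quantize_int8(weights)
restored = dequantize_int8(codes, scale)

# The round-trip error per value is bounded by half a quantization step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
```

Each value is stored in 8 bits instead of 32, and the rounding error stays within half of `scale`. A real quantization tool also has to choose scales per layer or per channel and check the effect on model accuracy, which this sketch does not attempt.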

Note that this presentation was current as of TensorFlow 2.8. Starting with TensorFlow 2.9, the oneDNN optimizations are on by default, so setting the environment variable TF_ENABLE_ONEDNN_OPTS=1 is no longer required.
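
For scripts that may run on either side of that version boundary, the following sketch sets the variable only when the TensorFlow version is older than 2.9. The helper names `parse_major_minor` and `configure_onednn` are hypothetical, and the sketch assumes the version string is already known (for example, from your environment's package metadata), because the variable should be set before TensorFlow is imported.

```python
import os


def parse_major_minor(version):
    """Turn a version string such as '2.10.1' into the tuple (2, 10)."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def configure_onednn(tf_version, env=os.environ):
    """Set TF_ENABLE_ONEDNN_OPTS=1 for TensorFlow versions older than 2.9.

    Returns True when the version needs the variable, False otherwise.
    An existing value in `env` is left untouched.
    """
    # Compare as integer tuples so that 2.10 sorts after 2.9.
    if parse_major_minor(tf_version) < (2, 9):
        env.setdefault("TF_ENABLE_ONEDNN_OPTS", "1")
        return True
    return False
```

Calling `configure_onednn("2.8.0")` before `import tensorflow` would then enable the optimizations on 2.8, while the same call with a 2.9 or later version string leaves the environment unchanged. Using `setdefault` keeps any value the user has already exported, including an explicit `0`.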

Get the Software

Accelerate end-to-end machine learning and data science pipelines with optimized deep learning frameworks and high-performing Python* libraries in Intel® oneAPI AI Analytics Toolkit.