TensorFlow is an end-to-end, open-source machine learning platform. Intel and Google* have been collaborating to deliver optimized implementations of its most compute-intensive operations, such as the large matrix multiplications behind convolution filters. Intel® oneAPI Deep Neural Network Library (oneDNN) is an open-source, cross-platform library that provides deep learning building blocks through a single API for CPUs, GPUs, or both.
In this session:
- Penporn Koanantakool of Google explains some of the optimizations that have been implemented, such as operation fusion, primitive caching, and vectorization for the int8 and bfloat16 data types (see the first sketch after this list).
- A live demo of Intel® Neural Compressor automatically quantizing a network, improving performance by 4x with only a 0.06% accuracy loss (second sketch below).
- An overview of the PluggableDevice mechanism in TensorFlow, co-architected by Intel and Google to give developers a scalable way to add support for new devices as plug-in packages (third sketch below).
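To get a feel for the bfloat16 work, the sketch below uses Keras mixed precision so that compute-heavy layers run in bfloat16. The model is purely illustrative, and whether the fused, vectorized oneDNN kernels actually kick in depends on your CPU and TensorFlow build:

```python
import tensorflow as tf

# Keras mixed precision: compute in bfloat16, keep variables in float32.
tf.keras.mixed_precision.set_global_policy("mixed_bfloat16")

# Illustrative model only. Conv2D + BiasAdd + ReLU is the kind of pattern
# the oneDNN build can fuse into a single primitive, which is then cached
# and reused on subsequent calls (primitive caching).
model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, 3, activation="relu", input_shape=(224, 224, 3)),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(10),
])

_ = model(tf.random.uniform([1, 224, 224, 3]))  # one forward pass
```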
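The quantization demo follows roughly the shape below. This is a hedged sketch assuming the Neural Compressor 2.x Python API; `fp32_model`, `calib_loader`, and `evaluate` are placeholders you would supply:

```python
from neural_compressor import PostTrainingQuantConfig, quantization

config = PostTrainingQuantConfig()  # post-training int8 quantization defaults

# fit() auto-tunes the quantization, trying int8 configurations and falling
# back to higher precision per op until the accuracy criterion is met.
q_model = quantization.fit(
    model=fp32_model,               # placeholder: SavedModel path or Keras model
    conf=config,
    calib_dataloader=calib_loader,  # placeholder: a few representative batches
    eval_func=evaluate,             # placeholder: returns an accuracy score
)
q_model.save("./quantized_model")
```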
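Because device support arrives as a plug-in package, no TensorFlow source changes are needed. The sketch below assumes the Intel® Extension for TensorFlow* plug-in, which registers an "XPU" device type, as the example; any PluggableDevice package is discovered the same way:

```python
# Assumes a PluggableDevice package is installed alongside stock TensorFlow,
# e.g. `pip install intel-extension-for-tensorflow[xpu]`.
import tensorflow as tf

# The plug-in registers its device type at import time; it appears next to
# the built-in CPU/GPU devices.
print(tf.config.list_physical_devices())

# Ops are then placed on the plug-in device like any built-in one.
with tf.device("/XPU:0"):
    x = tf.random.uniform([1024, 1024])
    y = tf.matmul(x, x)
```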
Note that this presentation was current as of TensorFlow 2.8. Starting with TensorFlow 2.9, the oneDNN optimizations are on by default and no longer require setting the TF_ENABLE_ONEDNN_OPTS=1 environment variable.
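For TensorFlow 2.8 and earlier, a minimal sketch of enabling the optimizations; the variable must be set before TensorFlow is imported:

```python
import os

# Must be set before `import tensorflow`; in TF 2.9+ this is the default.
os.environ["TF_ENABLE_ONEDNN_OPTS"] = "1"

import tensorflow as tf

# When the oneDNN kernels are active, TensorFlow prints a startup log line
# noting that the binary is optimized with oneDNN.
print(tf.__version__)
```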
Get the Software
Accelerate end-to-end machine learning and data science pipelines with optimized deep learning frameworks and high-performing Python* libraries in Intel® oneAPI AI Analytics Toolkit.