
Intel Labs Enables AI Innovation with Hardware-Aware Automated Machine-Learning Tools

Nilesh_Jain
Employee

Published February 1, 2022

Nilesh Jain is a Principal Engineer leading intelligent infrastructure and systems research at Intel Labs with a focus on visual/AI applications.

 

Highlights:

  • Automated machine learning (AutoML) enables the automation of AI algorithm design, reducing design time for efficient AI and accelerating time to market.
  • Intel Labs' hardware-aware AI-based automation tools (AutoX) enable rapid adoption of AI models across different deployments to address productivity challenges.
  • Intel Labs is developing a roadmap of AutoX technologies, including an automated quantization tool for AI models (AutoQ), an automated hardware-aware model optimization tool using neural architecture search (BootstrapNAS), and more, to rapidly scale efficient AI development and deployment.

The exponential growth of AI in every industry, from social media to drug discovery, has created a scaling challenge. Specifically, AI algorithms must be designed to map onto disparate underlying AI platforms so that they deploy efficiently and operate optimally. The adoption of automated machine learning (AutoML) is gaining momentum as major industry players implement automated solutions for every stage of development through deployment. However, current AutoML technology addresses only half of the problem because it focuses solely on automating the design of AI algorithms.

In addition, these types of scaling challenges require specialized developer skills, and there is currently an industry-wide shortage. More data scientists are needed to design algorithms, and more AI programmers are needed to optimize algorithms for deployment. Adoption of AI-based automation techniques will not only help democratize access to ML but can also help address AI scaling challenges. 

Hardware-aware AI automation enables rapid adoption of AI models across deployments from mobile devices to client to cloud computing, simply by changing the target platform given as input to the automated model design and optimization. Intel Labs' hardware-aware AI-based automation tools (AutoX) help improve productivity by significantly cutting down data-to-deployment time and improving the efficiency of AI models. Our methods bring optimization closer to data scientists, and as a result, the iterative process of design and optimization is shortened significantly, as shown in Figure 1.
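To make the "target platform as input" idea concrete, here is a minimal, hypothetical sketch. The target names, budgets, and function are invented for illustration; this is not the actual AutoX interface:

```python
# Hypothetical AutoX-style entry point (illustrative only): the target
# platform is just another input, so retargeting a model for a different
# deployment means changing a single argument.
TARGETS = {
    "mobile": {"max_params_m": 5,   "precisions": (4, 8)},
    "client": {"max_params_m": 25,  "precisions": (8,)},
    "cloud":  {"max_params_m": 200, "precisions": (8, 16)},
}

def optimize_for(model_params_m, target):
    """Return the compression plan a hardware-aware tool might emit."""
    spec = TARGETS[target]
    shrink = max(1.0, model_params_m / spec["max_params_m"])
    return {
        "target": target,
        "compression_ratio": round(shrink, 2),      # how much to shrink the model
        "precision_bits": min(spec["precisions"]),  # lowest precision the target supports
    }
```

The point of the sketch is the workflow, not the numbers: the same model flows through the same call, and only the target argument changes between a mobile and a cloud deployment.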


Figure 1. Hardware-aware AI automation transforms AI design and optimizations

 

We believe these kinds of automation technologies will open opportunities to accelerate the discovery of new AI software and hardware features and improve the performance of AI. Rapid development of efficient tiny AI models will further accelerate the adoption of AI in Internet of Things (IoT), edge, and embedded applications, where compute resources are often limited.

Intel Labs has several focused efforts underway to realize this vision. Specifically, we have developed capabilities such as AutoQ, an automated mixed-precision quantization tool available today, and the upcoming automated hardware-aware model optimization tool called BootstrapNAS.

AutoQ Automated Mixed-Precision Quantization Tool

There is a range of optimizations that can be performed on an AI model to improve performance, such as compute parallelism, locality, compute placement, and bandwidth (BW) optimizations. One such optimization is quantization, which reduces model size and BW requirements and improves compute efficiency. Mixed-precision quantization is a promising method for achieving better power efficiency and performance, along with a significant reduction in model size and memory footprint compared to uniform 8-bit quantization.

To address these challenges, Intel Labs released AutoQ as part of the Intel® Distribution of OpenVINO™ toolkit. AutoQ employs AutoML algorithms to assign precisions to different layers of a deep neural network (DNN) to minimize accuracy degradation, maximize performance, significantly improve productivity, and reduce reliance on programming experts.
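The per-layer precision assignment that AutoQ automates can be illustrated with a deliberately simplified sketch. The greedy loop below is plain Python, not the actual AutoQ/NNCF API (which uses AutoML search algorithms rather than this greedy rule): it lowers the precision of whichever layer is hurt least until an average bit-width budget is met.

```python
def quantize(values, bits):
    """Uniform symmetric quantization of a list of floats at a given bit-width."""
    qmax = 2 ** (bits - 1) - 1
    scale = (max(abs(v) for v in values) / qmax) or 1.0  # avoid zero scale
    return [round(v / scale) * scale for v in values]

def quant_error(values, bits):
    """Mean squared error introduced by quantizing at the given bit-width."""
    q = quantize(values, bits)
    return sum((v - w) ** 2 for v, w in zip(values, q)) / len(values)

def assign_precisions(layers, budget_bits, choices=(2, 4, 8)):
    """Greedy mixed-precision search: start every layer at the highest
    precision, then repeatedly downgrade the layer whose downgrade adds the
    least error, until the average bit-width fits the budget."""
    assign = {name: max(choices) for name in layers}
    while sum(assign.values()) / len(assign) > budget_bits:
        best = None
        for name, weights in layers.items():
            lower = [b for b in choices if b < assign[name]]
            if not lower:
                continue  # this layer is already at the lowest precision
            nxt = max(lower)
            delta = quant_error(weights, nxt) - quant_error(weights, assign[name])
            if best is None or delta < best[0]:
                best = (delta, name, nxt)
        if best is None:  # nothing left to downgrade
            break
        assign[best[1]] = best[2]
    return assign
```

Sensitive layers (large quantization error at low precision) keep more bits, while robust layers absorb the downgrades, which is the intuition behind mixed precision beating uniform 8-bit quantization on size and bandwidth.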

AutoQ is open-sourced and available as part of OpenVINO/NNCF, a model compression framework. More results can be found in the blogs "Automated Mixed-Precision Quantization for Next-Generation Inference Hardware" and "SigOpt featuring AutoQ."

BootstrapNAS Automated Hardware-Aware Model Optimization Tool

As part of a greater effort to simplify the optimization of AI models on Intel hardware—including Intel® Xeon® Scalable processors, which deliver built-in AI acceleration and flexibility—we are developing an automated hardware-aware model optimization tool called BootstrapNAS. The tool will provide significant time savings in finding an optimal model design for a given AI platform and markedly improve model performance.

Neural Architecture Search (NAS) solutions are algorithms that automate the design of artificial neural networks (ANNs) by training super-networks and extracting optimal subnetworks. However, designing a super-network for an arbitrary AI model is challenging. The BootstrapNAS tool takes a pre-trained model from a popular architecture (for example, ResNet50) or from a custom design, automatically creates a super-network, and then uses state-of-the-art NAS techniques to train it. The resulting subnetworks significantly outperform the given pre-trained model.
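The extraction step can be shown with a toy sketch, not BootstrapNAS itself: treat each stage of an invented super-network as a set of channel-width choices and search for the best subnetwork that fits a parameter budget. BootstrapNAS additionally builds the super-network from a pre-trained model and trains it; the sketch below shows only the search over subnetwork configurations, with capacity standing in for accuracy:

```python
import itertools

# Toy "super-network": each stage exposes several channel-width choices
# (purely illustrative; real search spaces also vary depth, kernel size, etc.).
SEARCH_SPACE = {
    "stage1": (16, 32, 64),
    "stage2": (32, 64, 128),
    "stage3": (64, 128, 256),
}

def param_count(config):
    """Rough parameter count of a subnetwork built from 3x3 convolutions."""
    widths = [3] + [config[s] for s in SEARCH_SPACE]  # 3 input (RGB) channels
    return sum(cin * cout * 9 for cin, cout in zip(widths, widths[1:]))

def extract_subnetwork(max_params):
    """Enumerate subnetworks and keep the largest one under the budget,
    using capacity as a crude stand-in for predicted accuracy."""
    best = None
    for combo in itertools.product(*SEARCH_SPACE.values()):
        config = dict(zip(SEARCH_SPACE, combo))
        p = param_count(config)
        if p <= max_params and (best is None or p > best[1]):
            best = (config, p)
    return best
```

Real NAS replaces both the exhaustive loop (the space is far too large) and the capacity heuristic (subnetworks inherit trained weights and are scored on data), but the shape of the problem, picking one option per stage under a hardware budget, is the same.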

With BootstrapNAS, virtually any AI model can be automatically optimized for an Intel Xeon processor-based platform. Our approach helps scale NAS techniques to a wide variety of models, as developers don't need to invent a design space for each AI model. The BootstrapNAS tool will be open-sourced soon!

Future Research Direction

As part of our broader vision, researchers at Intel Labs are focusing on improving the efficiency of AI automation methods with techniques that make AI models more efficient. The goals are to develop automation tools, develop algorithmic techniques such as sample-efficient, multi-objective search, and extend the capabilities of joint AI accelerator-algorithm optimization. For example, our recent work on Zero-Shot NAS introduces neural architecture scoring metrics (NASMs) to identify good NN designs without training them.

While applying Zero-Shot NAS requires no training resources, Intel Labs has identified a lack of NASMs that generalize well across neural architecture design spaces. Techniques like Zero-Shot NAS will help reduce the search space of efficient solutions and significantly cut down the time required to find an efficient AI model.
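The idea of a neural architecture scoring metric can be shown with a deliberately crude, made-up proxy (not one of the NASMs studied in the work above): rank candidate networks by a training-free score and keep only a shortlist, so that expensive training is spent on the most promising few.

```python
import math

def zero_shot_score(layer_widths):
    """Training-free score for a candidate architecture. This toy proxy
    (sum of log layer widths, a rough capacity measure) stands in for a
    real NASM; it is computed without training the network."""
    return sum(math.log(w) for w in layer_widths)

def shortlist(candidates, keep):
    """Rank candidates by score and keep only the top few for full training."""
    return sorted(candidates, key=zero_shot_score, reverse=True)[:keep]
```

The generalization problem noted above shows up exactly here: a proxy like this may rank one design space well and another poorly, which is why finding NASMs that transfer across spaces is an open research question.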

We believe that automated approaches for co-designing NN architectures and neural accelerators are needed to maximize AI system efficiency and address productivity challenges. There has been growing interest in differentiable NN-hardware co-design to enable the joint optimization of NNs and neural accelerators. To enable efficient and realizable co-design of configurable hardware accelerators with arbitrary NN search spaces, Intel Labs has created realizable hardware and neural architecture search (RHNAS), a method that combines reinforcement learning for hardware optimization with differentiable NAS.
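The co-design objective behind approaches like RHNAS can be sketched with an exhaustive toy search. RHNAS itself uses reinforcement learning for the hardware side and differentiable NAS for the network side; the search spaces, accuracy proxy, and latency model below are invented purely for illustration:

```python
# Toy joint search space: network hyperparameters x accelerator parameters.
ARCHS = [{"depth": d, "width": w} for d in (2, 4, 8) for w in (32, 64, 128)]
ACCELS = [{"pe": p, "sram_kb": s} for p in (64, 256) for s in (128, 512)]

def accuracy_proxy(arch):
    """Invented stand-in for predicted accuracy (bigger network, higher score)."""
    return 0.5 + 0.04 * arch["depth"] + 0.001 * arch["width"]

def latency(arch, accel):
    """Invented analytical latency model: work divided by processing elements."""
    return arch["depth"] * arch["width"] / accel["pe"]

def co_design(latency_weight=0.05):
    """Exhaustively score every (network, accelerator) pair and return the
    best trade-off between the accuracy proxy and a latency penalty."""
    best = None
    for arch in ARCHS:
        for accel in ACCELS:
            score = accuracy_proxy(arch) - latency_weight * latency(arch, accel)
            if best is None or score > best[0]:
                best = (score, arch, accel)
    return best
```

Even in this toy, the winning network depends on the accelerator it is paired with, which is the core argument for joint optimization: searching the two spaces separately can miss the best combined design.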

Conclusion

AutoML tools allow for more efficient and accurate development and application of ML models by eliminating the need for specialized data scientists to hand-code models. Hardware-aware AutoML solutions, including AutoQ and BootstrapNAS, can dramatically reduce the implementation time of AI systems and make it possible to scale quickly and operate efficiently across a wide variety of platforms.

In addition, approaches such as Zero-Shot NAS and RHNAS enable faster and more efficient integration and training of NNs. Intel Labs is using these technologies to reduce design time for efficient AI, accelerate time to market, and ultimately fuel future AI innovation.

About the Author
Nilesh is a Principal Engineer at Intel Labs, where he leads the Emerging Visual/AI Systems Research Lab. He focuses on developing innovative technologies for edge/cloud systems for emerging workloads. His current research interests include visual computing and hardware-aware AutoML systems. He received an M.Sc. degree from the Oregon Graduate Institute/OHSU. He is a Senior IEEE member, has published over 15 papers, and holds over 20 patents.