Reduce Binary Size with New Conditional Compilation in Intel® Distribution of OpenVINO™ toolkit

MaryT_Intel
Community Manager

Introduction

Deep Learning technology is evolving at an unprecedented pace: more and more algorithms appear and achieve state-of-the-art results on both well-known and new benchmarks. The main drivers are new training techniques and model architectures produced by research engineers. As a result, the set of operations required to represent a wide spectrum of topologies keeps growing. The latest operation set from the OpenVINO™ toolkit supports more than 150 operations, and we continue to extend it to cover a broad set of models.

Meanwhile, an individual application typically requires only a few models to implement its logic. In a simple Computer Vision pipeline, this could be object detection plus classification (e.g., MobileNet-SSD plus MobileNet, respectively), which needs 19 operations. For applications powered by natural language processing, it could be just one model (e.g., BERT), requiring 22 operations. In summary, where application size matters (for example, because of distribution mechanism restrictions), it does not make sense to ship all operations, and the desire to exclude unused ones is natural.
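
As a rough illustration of how small that subset is, the sketch below computes the share of the operation set that the two example scenarios would use. It assumes exactly 150 operations as a stand-in for "more than 150", so the printed shares are upper bounds. The function name is illustrative only.

```python
# Assumption: 150 is used as a lower bound for the size of the operation set
# ("more than 150" in the text), so the shares below are upper bounds.
TOTAL_OPS = 150


def share_used(ops_needed, total_ops=TOTAL_OPS):
    """Return the fraction of the operation set that a scenario needs."""
    return ops_needed / total_ops


cv_share = share_used(19)   # MobileNet-SSD plus MobileNet
nlp_share = share_used(22)  # BERT

print(f"Computer Vision pipeline: {cv_share:.1%}")  # 12.7%
print(f"NLP application: {nlp_share:.1%}")          # 14.7%
```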

To help with that, the 2021.3 release of Intel® Distribution of OpenVINO™ toolkit provides the capability to compile binaries from source code for the specific models used within an application. The resulting build excludes parts of the functionality, but it allows for substantial savings in binary size. Depending on the scenario, a three- to fourfold reduction in binary size is possible.

Let’s walk through how the functionality works and what the potential benefits are.
 

Collecting Profiling Information and Compiling a Tuned Version

The OpenVINO™ toolkit code is analyzed using Instrumentation and Tracing Technology (ITT) counters, which provide detailed profiling data for developers who are interested in execution insights. This information is lower-level and more detailed than, for example, the per-layer statistics provided by Workbench. It also shows which primitives are used throughout the application lifecycle and which can be removed.

Conditional Compilation consists of two steps:

  1. Collect information about the blocks of code that are used (kernels, primitives, optimization passes, etc.) by running any application under the ITT data collection process or configuration. This step generates one or more .csv files.
  2. Build a custom runtime using an auto-generated header. This header file is created from the collected .csv files and enables only the blocks of code needed to perform inference for the scenarios exercised in the previous step.
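
The idea behind the second step can be sketched in a few lines of Python. This is a simplified illustration, not the toolkit's actual generator: it assumes that each statistics file lists one used code-block name per row in its first column, and that the header enables a block with a `#define <name> 1` line. The block names (`Convolution_fp32`, etc.) and function names are hypothetical.

```python
import csv
import io


def collect_used_blocks(csv_texts):
    """Union the code-block names found across several statistics files.

    Assumption: the first column of each row holds a block name.
    """
    used = set()
    for text in csv_texts:
        for row in csv.reader(io.StringIO(text)):
            if row and row[0].strip():
                used.add(row[0].strip())
    return used


def render_header(used_blocks):
    """Render a C/C++ header that enables only the blocks that were seen."""
    lines = ["#pragma once", ""]
    lines += [f"#define {name} 1" for name in sorted(used_blocks)]
    return "\n".join(lines) + "\n"


# Hypothetical statistics from two profiling runs (one per model).
run_a = "Convolution_fp32\nReLU_fp32\n"
run_b = "ReLU_fp32\nSoftmax_fp32\n"

header = render_header(collect_used_blocks([run_a, run_b]))
print(header)
```

Because the names are collected into a set, a block used by several models is enabled only once, and any block that never appears in the statistics is simply left undefined and can be compiled out.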
     

More technical information about how to use the Conditional Compilation feature can be found on the Wiki page.
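
For orientation, the two build configurations might look like the sketch below, which only assembles the CMake command lines and does not run them. The option names `ENABLE_PROFILING_ITT`, `SELECTIVE_BUILD`, and `SELECTIVE_BUILD_STAT` are not given in this article. They are assumed here from the OpenVINO build options and should be confirmed on the Wiki page for the release in use. The paths are hypothetical, and the profiling run between the two builds is omitted.

```python
def collect_configure_args(source_dir):
    """Step 1: configure a build that records which code blocks are used.

    Assumption: ENABLE_PROFILING_ITT and SELECTIVE_BUILD=COLLECT are the
    relevant CMake options; verify them for your OpenVINO release.
    """
    return [
        "cmake",
        "-DENABLE_PROFILING_ITT=ON",
        "-DSELECTIVE_BUILD=COLLECT",
        source_dir,
    ]


def selective_configure_args(source_dir, stat_files):
    """Step 2: configure a custom runtime from the collected .csv files.

    Assumption: SELECTIVE_BUILD_STAT points at the statistics files.
    """
    return [
        "cmake",
        "-DSELECTIVE_BUILD=ON",
        f"-DSELECTIVE_BUILD_STAT={stat_files}",
        source_dir,
    ]


# Hypothetical paths, for illustration only.
step1 = collect_configure_args("..")
step2 = selective_configure_args("..", "/path/to/stats/*.csv")

print(" ".join(step1))
print(" ".join(step2))
```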
 

Some examples of Conditional Compilation benefits

Binary sizes of custom OpenVINO™ runtimes built for a few models and their combinations, compared with the full OpenVINO™ runtime for the CPU device (38.16 MB):
 

[Image: openvino chart]

The table above was prepared using a release build with LTO flags and the gcc 5.5.0 compiler. The OpenVINO™ toolkit runtime for the CPU device includes the inference_engine, inference_engine_legacy, inference_engine_lp_transformations, inference_engine_transformations, ngraph and MKLDNNPlugin libraries.
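
To put these numbers in perspective, here is a small calculation. It assumes that the three- to fourfold reduction mentioned in the introduction applies to the 38.16 MB full CPU runtime. The results are illustrative arithmetic, not measured sizes from the table.

```python
FULL_RUNTIME_MB = 38.16  # full OpenVINO runtime for the CPU device


def reduced_size(full_mb, factor):
    """Return the binary size after an N-fold reduction."""
    return full_mb / factor


size_3x = reduced_size(FULL_RUNTIME_MB, 3)
size_4x = reduced_size(FULL_RUNTIME_MB, 4)

print(f"3x reduction: {size_3x:.2f} MB")  # 12.72 MB
print(f"4x reduction: {size_4x:.2f} MB")  # 9.54 MB
```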
 

OpenVINO™ toolkit components which support Conditional Compilation

In the 2021.3 release, the Conditional Compilation feature is enabled for the following OpenVINO™ components:

  • nGraph Library
  • Inference Engine transformations library
  • CPU plugin

This means that currently (in the 2021.3 release) most of the benefits of Conditional Compilation are available for CPU execution. Support for more platforms is coming soon, in addition to our regular work to reduce the binary size of the generic OpenVINO™ release.
 

Conclusions

The OpenVINO™ toolkit provides Conditional Compilation functionality that allows developers to strip unused parts of the runtime and reduce the binary footprint by compiling the deep learning libraries for the individual needs of an application, which is critical for certain scenarios.

 

Notices and Disclaimers

Intel technologies may require enabled hardware, software or service activation.
No product or component can be absolutely secure. 
Your costs and results may vary. 
© Intel Corporation.  Intel, the Intel logo, and other Intel marks are trademarks of Intel Corporation or its subsidiaries.  Other names and brands may be claimed as the property of others.   
 

About the Author
Mary is the Community Manager for this site. She likes to bike, and do college and career coaching for high school students in her spare time.