The No-Code Approach to Deploying Deep Learning Models on Intel® Hardware: A three-part series on OpenVINO™ Deep Learning Workbench

MaryT_Intel · ‎09-14-2021

About the Series

Learn how to convert, fine-tune, and package an inference-ready TensorFlow model, optimized for Intel® hardware, using nothing but a web browser. Every step happens in the cloud using OpenVINO™ Deep Learning Workbench and Intel® DevCloud for the Edge.
Part One: We show you the Intel® deep learning inference tools and the basics of how they work.
Part Two (you’re here!): We go through each step of importing, converting, and benchmarking a TensorFlow model using OpenVINO Deep Learning Workbench.
Part Three: We show you how to change precision levels for increased performance and export a ready-to-run inference package with just a browser using OpenVINO Deep Learning Workbench.

Part Two: Import, Convert, and Benchmark a TensorFlow Model on Intel Hardware with OpenVINO Deep Learning Workbench

In Part One, we explained the basics of deep learning inference and how it works on Intel hardware. Then we showed you how to get a free Intel® DevCloud for the Edge account so you could start converting and optimizing inference models online using the OpenVINO Deep Learning Workbench.

Here in Part Two, we’re going to walk you through creating an optimized inference model for Intel hardware from a TensorFlow model and running your first benchmark.

If you haven’t already, please sign up for DevCloud for the Edge. It’s free, and it only takes a few minutes. Once you have an account, sign in to DevCloud and launch OpenVINO Deep Learning Workbench.

Step one: Create a configuration and import a TensorFlow model

To start, create a new configuration by clicking Create. You can create as many configurations as you want—with different models, optimizations, and target hardware.

To start, click Create on the Deep Learning Workbench start page.

Next, we’ll import our model. You can upload your own model or get one from our Open Model Zoo. For this demo, let’s import a COCO-based object detection model from the Open Model Zoo.

You can import your own model or use one from our Open Model Zoo.

Step two: Convert a TensorFlow model to an Intel® IR file

OpenVINO Deep Learning Workbench (using the Model Optimizer) prunes and merges layers, then converts models from multiple frameworks into Intermediate Representation (IR) files that can be used with any Intel hardware. An IR consists of an .xml file with the model’s topology and a binary (.bin) file with the weights and biases.
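
If you download the converted files and want a quick look at them outside the browser, something like the following Python sketch could help. It checks that the .xml topology file has a .bin file next to it and counts the layers it finds. It assumes layers appear in the .xml as `<layer>` elements with a `type` attribute; the `describe_ir` helper and the tiny stand-in model are hypothetical and exist only so the sketch runs on its own.

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path


def describe_ir(xml_path):
    """Summarize an IR pair: look for the .bin file and count layers in the .xml."""
    xml_path = Path(xml_path)
    # The weights file is expected to sit next to the topology file.
    bin_path = xml_path.with_suffix(".bin")

    root = ET.parse(xml_path).getroot()
    # Assumption: each layer is a <layer> element carrying a "type" attribute.
    layer_types = [layer.get("type", "unknown") for layer in root.iter("layer")]

    return {
        "model_name": root.get("name"),
        "layer_count": len(layer_types),
        "layer_types": sorted(set(layer_types)),
        "has_weights_file": bin_path.is_file(),
    }


if __name__ == "__main__":
    # Build a tiny stand-in IR pair so the sketch runs without a real model.
    with tempfile.TemporaryDirectory() as tmp:
        xml_file = Path(tmp) / "demo_model.xml"
        xml_file.write_text(
            '<net name="demo_model"><layers>'
            '<layer id="0" name="input" type="Parameter"/>'
            '<layer id="1" name="conv1" type="Convolution"/>'
            '<layer id="2" name="output" type="Result"/>'
            "</layers></net>"
        )
        xml_file.with_suffix(".bin").write_bytes(b"\x00" * 16)
        print(describe_ir(xml_file))
```

With a real model, you would point `describe_ir` at the .xml file you exported instead of the stand-in.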

To convert, all we have to do is choose our precision level: A GPU can run FP32 or FP16. Intel® Movidius™ VPUs run at FP16. CPUs run at FP32. To keep it simple, we’re going to run this demo on a CPU, so we’ll choose FP32.

Choose FP32 or FP16 and convert.
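
The precision rule of thumb above can be written down as a small lookup. This is only a sketch of the guidance in this post, not part of any OpenVINO API: the `SUPPORTED_PRECISIONS` table and the `choose_precision` helper are hypothetical names, and the table covers only the FP32/FP16 choices offered at conversion time.

```python
# Precisions per device type, as described in this post (conversion step only).
SUPPORTED_PRECISIONS = {
    "CPU": ["FP32"],
    "GPU": ["FP32", "FP16"],
    "VPU": ["FP16"],
}


def choose_precision(device, preferred="FP32"):
    """Return `preferred` if the device supports it, else the device's first listed precision."""
    try:
        supported = SUPPORTED_PRECISIONS[device.upper()]
    except KeyError:
        raise ValueError(f"Unknown device type: {device!r}") from None
    return preferred if preferred in supported else supported[0]


if __name__ == "__main__":
    for device in ("CPU", "GPU", "VPU"):
        print(device, "->", choose_precision(device))
```

For the CPU demo in this post, `choose_precision("CPU")` returns FP32, which matches the choice we make in the Workbench UI.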

Step three: Pick a target device in the DevCloud

Since we’re running the workbench through DevCloud, we have access to the full range of Intel® processor architectures: Intel Atom®, Intel® Core™, Intel® Xeon®, and Intel® Movidius™ VPUs.

Experiment with Intel® CPUs, integrated GPUs, VPUs, and FPGAs.

Let’s choose an Intel® Xeon® 6258R (aka Cascade Lake). It has Intel® Deep Learning Boost, which will allow us to run our model in FP32 and INT8 for even faster inference processing.

Intel® Xeon® 6258R has Intel® Deep Learning Boost so we can test INT8 performance.

Step four: Get a data set

Now that we have a model that’s ready to run, we need a data set for benchmarking. You can upload your own data set or generate a random data set. The data set can either be annotated or not annotated.

If measuring model accuracy is important, you’ll need an annotated data set. If you care more about the performance or you want to perform default calibration, then you can work without annotation. For this example, we will use the COCO data set from GitHub without annotations.

Be sure to check the “Validation Dataset Tip.” It tells you how to structure the files and archive them for upload.

Note: You can also run Deep Learning Workbench on a local machine. The local and DevCloud versions have slightly different capabilities. You can learn about the differences in the OpenVINO™ toolkit documentation.

Be sure to follow the directory structure outlined in Validation Dataset Tip.

Step five: Benchmark the model

Now we start to see where Deep Learning Workbench really shines. Our first test run created a single benchmark with tons of information. We can see the nGraph and the runtime graph and unpack execution times for each layer.

Each test run creates a benchmark with extensive layer-by-layer performance data.

We can run more tests, including group inference with different batch sizes and streams, and get a broader picture of how our model performs.

Deep Learning Workbench benchmarks provide extensive performance data.

Each test you run creates a new benchmark. Drag the performance curve to your target benchmark and Workbench will tune the model to hit it.
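
To make the idea of a benchmark sweep concrete, here is a minimal Python sketch of what a test run measures: average latency per inference call and throughput in images per second, repeated over several batch sizes. It does not use OpenVINO at all. `fake_infer` is a hypothetical stand-in for a real inference call, the sweep does not model parallel streams, and the numbers it prints say nothing about real model performance.

```python
import time


def fake_infer(batch):
    """Hypothetical stand-in for a real inference call; does a little CPU work per image."""
    return [sum(i * i for i in range(2000)) for _ in batch]


def benchmark(infer, batch_size, iterations=20):
    """Time `iterations` calls of `infer` on one batch and report latency and throughput."""
    batch = [None] * batch_size
    start = time.perf_counter()
    for _ in range(iterations):
        infer(batch)
    elapsed = time.perf_counter() - start

    latency_ms = elapsed / iterations * 1000      # average time per inference call
    throughput = batch_size * iterations / elapsed  # images processed per second
    return {"batch_size": batch_size, "latency_ms": latency_ms, "throughput": throughput}


if __name__ == "__main__":
    for size in (1, 2, 4, 8):
        r = benchmark(fake_infer, size)
        print(f"batch={r['batch_size']:>2}  latency={r['latency_ms']:.2f} ms  "
              f"throughput={r['throughput']:.1f} img/s")
```

Each printed row corresponds to one point on a latency-versus-throughput graph like the one Workbench draws for you.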

Step six: Pick your performance point

Each run creates a benchmark point on the performance graph, which plots latency (the time it takes to run inference) against throughput (in this case, the number of images analyzed per second).

We can pick any balance of throughput and latency; all we have to do is point and click. Workbench will automatically create an inferencing profile that hits that mark.
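
One way to think about picking a performance point is as a simple selection: among the benchmarks you have run, take the one with the highest throughput that still fits your latency budget. The Python sketch below illustrates that idea only. It is not how Workbench selects a profile internally, `pick_profile` is a hypothetical helper, and the benchmark numbers are made up for illustration rather than measured.

```python
def pick_profile(benchmarks, max_latency_ms):
    """Return the benchmark with the highest throughput whose latency is within the cap."""
    eligible = [b for b in benchmarks if b["latency_ms"] <= max_latency_ms]
    if not eligible:
        return None  # no run meets the latency budget
    return max(eligible, key=lambda b: b["throughput"])


# Made-up numbers for illustration only; these are not measured results.
runs = [
    {"batch_size": 1, "streams": 1, "latency_ms": 12.0, "throughput": 83.0},
    {"batch_size": 2, "streams": 2, "latency_ms": 21.0, "throughput": 190.0},
    {"batch_size": 4, "streams": 4, "latency_ms": 45.0, "throughput": 355.0},
]

if __name__ == "__main__":
    print(pick_profile(runs, max_latency_ms=25.0))
```

With a 25 ms budget, the sketch selects the second run: it has the best throughput among the runs that stay under the cap.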

Next up - Part Three: Analyze, Optimize, and Package a TensorFlow Model for Deployment on Intel Architecture

In our next post, we’ll explore some of the advanced tools in OpenVINO Deep Learning Workbench, convert a model to INT8, and show you how easy it is to package a production-ready application.

Learn more:

Sign up for DevCloud

Notices & Disclaimers

Performance varies by use, configuration and other factors. Learn more at www.Intel.com/PerformanceIndex.

Performance results are based on testing as of dates shown in configurations and may not reflect all publicly available updates. See backup for configuration details. No product or component can be absolutely secure.

Your costs and results may vary.

Intel technologies may require enabled hardware, software or service activation.

Intel does not control or audit third-party data. You should consult other sources to evaluate accuracy.

UPDATED 04/27/2022 - Updated several links