TX__Vlad
Beginner

please clarify status of pytorch support


Greetings,

Elsewhere in this forum I see claims that PyTorch is supported only insofar as PyTorch models can be exported to ONNX. PyTorch is also not listed in the supported frameworks list.

However, the recent (March 2020) preprint of Neural Network Compression Framework for fast model inference appears to give the impression that the NNCF framework (which seems to be part of the OpenVINO kit) supports PyTorch as "a part of OpenVINO Training Extension". The preprint says

NNCF is built on top of the popular PyTorch framework

...

As we show in Appendix A any existing training pipeline written on PyTorch can be easily adopted to support model compression using NNCF.

Indeed, the framework repo has an entire folder dedicated to PyTorch (https://github.com/opencv/openvino_training_extensions/tree/develop/pytorch_toolkit).

Could someone clarify the exact status of PyTorch support in OpenVINO? It is OK if the input to the Model Optimizer must be ONNX -- what matters to me is that the starting point can be a "native" PyTorch model.

Thank you.

2 Replies
Munesh_Intel
Moderator

Hi Vlad,

OpenVINO™ toolkit officially supports public PyTorch models (from torchvision 0.2.1 and pretrainedmodels 0.7.4 packages) via ONNX conversion. The list of supported topologies is available at the following page:

https://docs.openvinotoolkit.org/latest/_docs_MO_DG_prepare_model_convert_model_Convert_Model_From_O...
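
For illustration, a minimal sketch of that flow with a public torchvision model is shown below; the specific model, file names, and Model Optimizer invocation are assumptions for the example and should be checked against the documentation for your OpenVINO version.

import torch
import torchvision

# Load a public torchvision classification model (ResNet-50 used purely as an example).
model = torchvision.models.resnet50(pretrained=True)
model.eval()

# Export the PyTorch model to ONNX using a dummy input of the expected shape.
dummy_input = torch.randn(1, 3, 224, 224)
torch.onnx.export(model, dummy_input, "resnet50.onnx")

# The resulting ONNX file can then be converted to IR with Model Optimizer, e.g.
# (exact script path and flags depend on the OpenVINO installation):
#   python mo_onnx.py --input_model resnet50.onnx --input_shape [1,3,224,224]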

 

On the other hand, OpenVINO™ Training Extensions (OTE) are intended to streamline the development of deep learning models and accelerate time to inference. Neural Network Compression Framework (NNCF), which is part of OTE, is a PyTorch-based framework that supports a wide range of deep learning models for various use cases. It also implements Quantization-Aware Training (QAT), supporting different quantization modes and settings.

NNCF supports various compression algorithms, including Quantization, Binarization, Sparsity, and Filter Pruning, which are applied during model fine-tuning to achieve the best trade-off between compression and accuracy. When fine-tuning finishes, the optimized model can be exported to ONNX format, which Model Optimizer can then convert into Intermediate Representation (IR) files for inference with the OpenVINO™ Inference Engine.
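
As a rough illustration of how NNCF plugs into an existing PyTorch pipeline, the sketch below applies int8 quantization-aware training to a torchvision model and exports the result to ONNX. The import paths, config format, and return values follow later standalone NNCF releases and are assumptions here; they may differ in the training-extensions version discussed above, so verify them against the NNCF documentation.

import torchvision
from nncf import NNCFConfig
from nncf.torch import create_compressed_model

# Start from an ordinary PyTorch model, as in any existing training pipeline.
model = torchvision.models.resnet18(pretrained=True)

# Describe the input shape and choose a compression algorithm (int8 quantization here).
nncf_config = NNCFConfig.from_dict({
    "input_info": {"sample_size": [1, 3, 224, 224]},
    "compression": {"algorithm": "quantization"},
})

# Wrap the model; NNCF inserts fake-quantization operations so QAT can proceed.
compression_ctrl, compressed_model = create_compressed_model(model, nncf_config)

# ... run the usual fine-tuning loop on compressed_model here ...

# Export the fine-tuned model to ONNX for Model Optimizer / Inference Engine.
compression_ctrl.export_model("resnet18_int8.onnx")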

 

The following article, ‘Enhanced Low-Precision Pipeline to Accelerate Inference with OpenVINO Toolkit’, provides further relevant information and is available at the following link:

https://www.intel.com/content/www/us/en/artificial-intelligence/posts/open-vino-low-precision-pipeli...

 

Additional information regarding the common optimization flow with OpenVINO and related tools is available at the following page, ‘Low Precision Optimization Guide’, under the section ‘Model Optimization Flow’.

https://docs.openvinotoolkit.org/latest/_docs_LowPrecisionOptimizationGuide.html#model_optimization_...

 

Regards,

Munesh

 


TX__Vlad
Beginner

Thank you, Munesh. I was not aware of the relationship between OpenVINO and OTE before, but things are much clearer now.
