I'm checking quantization-aware training in OpenVINO, and I found two tutorials:
1) Post-Training Quantization of PyTorch models with NNCF
2) Quantization Aware Training with NNCF, using PyTorch framework
As for the 2nd one, I thought that training is done by sandwiching layers with "Quantize" and "DeQuantize" layers, as PyTorch does.
But it seems that the QAT mentioned in 2) is actually fine-tuning
(just tuning the parameters after quantization).
So, OpenVINO does not have QAT with Quantize and DeQuantize layers during training?
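By "sandwiching" I mean PyTorch's eager-mode QAT flow, roughly like this (a minimal sketch; the model and layer sizes are just placeholders):

```python
import torch
import torch.nn as nn

class QATModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.quant = torch.quantization.QuantStub()      # "Quantize" layer
        self.conv = nn.Conv2d(3, 16, kernel_size=3)
        self.relu = nn.ReLU()
        self.dequant = torch.quantization.DeQuantStub()  # "DeQuantize" layer

    def forward(self, x):
        x = self.quant(x)       # fake-quantize the input
        x = self.relu(self.conv(x))
        return self.dequant(x)  # back to float at the output

model = QATModel()
model.qconfig = torch.quantization.get_default_qat_qconfig("fbgemm")
torch.quantization.prepare_qat(model, inplace=True)

# ... the training loop runs here with fake-quant observers active ...

model.eval()
int8_model = torch.quantization.convert(model)
```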
Hi Hep77to,
Thanks for reaching out to us.
For your information, the goal of Quantization Aware Training with NNCF, using PyTorch framework is to demonstrate how to use Neural Network Compression Framework (NNCF) 8-bit quantization to optimize a PyTorch model for inference with the OpenVINO™ toolkit. The optimization process contains the following steps (a short code sketch follows the list):
· Transform the original FP32 model to INT8
· Use fine-tuning to restore accuracy
· Export the optimized and original models to ONNX and then to OpenVINO IR
· Measure and compare the performance of the models
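Here is a rough sketch of that flow, assuming the legacy NNCF PyTorch API (create_compressed_model with register_default_init_args); the ResNet-18 model and random data below are placeholders, not the tutorial's actual pipeline:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset
from torchvision.models import resnet18
from nncf import NNCFConfig
from nncf.torch import create_compressed_model, register_default_init_args

# Toy stand-ins for the tutorial's model and training data (assumptions).
model = resnet18(num_classes=10)
dummy_data = TensorDataset(torch.randn(32, 3, 224, 224),
                           torch.randint(0, 10, (32,)))
train_loader = DataLoader(dummy_data, batch_size=8)

# Describe the input shape and request 8-bit quantization.
nncf_config = NNCFConfig.from_dict({
    "input_info": {"sample_size": [1, 3, 224, 224]},
    "compression": {"algorithm": "quantization"},
})
# Initialize quantizer ranges from a few batches of training data.
nncf_config = register_default_init_args(nncf_config, train_loader)

# Insert fake-quantize operations; the wrapped model stays trainable,
# so a normal fine-tuning loop can run on it to restore accuracy.
compression_ctrl, model = create_compressed_model(model, nncf_config)

# ... fine-tune `model` here with the usual optimizer/criterion loop ...

# Export the tuned INT8 model to ONNX for conversion to OpenVINO IR.
compression_ctrl.export_model("model_int8.onnx")
```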
On another note, Post-Training Quantization of PyTorch models with NNCF demonstrates how to use NNCF 8-bit quantization in post-training mode, without a fine-tuning pipeline, to optimize a PyTorch model for high-speed inference via the OpenVINO™ toolkit. The optimization process contains the following steps (a sketch follows the list):
· Evaluate the original model
· Transform the original model to a quantized one
· Export the optimized and original models to ONNX and then to OpenVINO IR
· Compare the performance of the obtained FP32 and INT8 models
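A rough sketch of this post-training flow, assuming the nncf.quantize API (again with placeholder model and data):

```python
import torch
import nncf
from torch.utils.data import DataLoader, TensorDataset
from torchvision.models import resnet18

# Toy stand-ins for the tutorial's model and calibration data (assumptions).
model = resnet18(num_classes=10).eval()
calib_data = TensorDataset(torch.randn(32, 3, 224, 224),
                           torch.randint(0, 10, (32,)))
calib_loader = DataLoader(calib_data, batch_size=8)

# Tell NNCF how to pull a model input out of each data-loader item.
def transform_fn(data_item):
    images, _labels = data_item
    return images

calibration_dataset = nncf.Dataset(calib_loader, transform_fn)

# Quantize in post-training mode: statistics are collected on the
# calibration set, and no fine-tuning pass is involved.
quantized_model = nncf.quantize(model, calibration_dataset)

# Export to ONNX, then convert to OpenVINO IR for benchmarking.
torch.onnx.export(quantized_model, torch.randn(1, 3, 224, 224),
                  "model_int8.onnx")
```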
Regards,
Wan
Thanks for the comments, and sorry for my late reply.
I understand there are three kinds of quantization:
- Post-training dynamic range quantization
- Post-training integer (static) quantization = your 2nd example
- Quantization-aware training = fine-tuning, your 1st example
Are there any methods to do the 1st one, post-training dynamic range quantization, in OpenVINO?
Hi Timosy,
For your information, post-training dynamic range quantization is only available in TensorFlow; it is not available in the OpenVINO™ toolkit. The optimization options offered by the OpenVINO™ toolkit are listed here. This thread will no longer be monitored since we have provided suggestions. If you need any additional information from Intel, please submit a new question.
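For reference, in TensorFlow that mode is driven through the TFLite converter, roughly like this (a minimal sketch; the SavedModel path is a placeholder):

```python
import tensorflow as tf

# Load a trained model from a SavedModel directory (placeholder path).
converter = tf.lite.TFLiteConverter.from_saved_model("path/to/saved_model")

# With Optimize.DEFAULT and no representative dataset, the converter
# applies dynamic range quantization: weights are stored as INT8 and
# activations are quantized on the fly at inference time.
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_quant_model = converter.convert()

with open("model_dynamic_range.tflite", "wb") as f:
    f.write(tflite_quant_model)
```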
Regards,
Wan