Hi, I tried compressing a model using POT with default quantization and got the error above. I understand that INT8 is not yet supported by VPU devices. I have also tried other methods such as converting the model to FP16 and reducing the input shape.
Is there any other alternative to compress the model?
Regards,
Simardeep Singh Sethi
Hi Simardeep,
Thanks for reaching out.
Firstly, the "unsupported layer type "FakeQuantize" is obviously because VPU plugin doesn't support INT8 model format. The relevant information is available in Supported Model Format documentation.
Meanwhile, in OpenVINO there are two ways to enhance performance:
- During development: Post-training Optimization Tool (POT), Neural Network Compression Framework (NNCF), Model Optimizer
- During deployment: tuning inference parameters and optimizing model execution (see the sketch after this list)
It's also possible to combine both approaches.
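As a rough illustration of deployment-time tuning, here is a minimal sketch assuming the openvino.runtime Python API and a hypothetical IR path; whether a particular property such as PERFORMANCE_HINT is honoured depends on the plugin, so please check the documentation for your target device:

```python
from openvino.runtime import Core

core = Core()
model = core.read_model("model_fp16.xml")  # hypothetical IR path

# Ask the plugin to configure itself for a throughput-oriented workload;
# asynchronous infer requests can then be used to keep the device busy.
compiled = core.compile_model(model, "MYRIAD", {"PERFORMANCE_HINT": "THROUGHPUT"})
request = compiled.create_infer_request()
```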
Since you have already tried POT, an alternative is to retrain the model with NNCF. Refer to Introducing a Training Add-on for OpenVINO toolkit: Neural Network Compression Framework for the steps to apply the NNCF optimization methods, either using the supported training samples or through integration into your custom training code.
The training samples are available at this GitHub repository.
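For reference, the typical NNCF integration pattern for a PyTorch training script looks roughly like the sketch below. This is an assumption of the usual workflow, not code from the linked article or samples; names such as MyModel and train_loader are hypothetical, and the exact config keys should be checked against the NNCF documentation:

```python
from nncf import NNCFConfig
from nncf.torch import create_compressed_model, register_default_init_args

# The "compression" section selects the algorithm (e.g. quantization, sparsity,
# filter pruning); adjust input_info to the model's real input shape.
nncf_config = NNCFConfig.from_dict({
    "input_info": {"sample_size": [1, 3, 224, 224]},
    "compression": {"algorithm": "quantization"},
})
nncf_config = register_default_init_args(nncf_config, train_loader)  # train_loader: your DataLoader

model = MyModel()  # hypothetical torch.nn.Module
compression_ctrl, compressed_model = create_compressed_model(model, nncf_config)

# ...fine-tune compressed_model with the existing training loop...

# Export the compressed model so it can be converted to OpenVINO IR.
compression_ctrl.export_model("compressed_model.onnx")
```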
Regards,
Aznie
Hi Simardeep,
This thread will no longer be monitored since we have provided the information. If you need any additional information from Intel, please submit a new question.
Regards,
Aznie
