I'm trying to use Model Optimizer to convert my float32 ONNX model to OpenVINO IR. My end goal is to get an INT8 quantized model.
The problem is that I don't want to obtain the quantization parameters from OpenVINO QAT or PTQ. I want to set my own quantization parameters, which I obtained from another framework.
Is it possible to do this and how can I accomplish it?
- 新着としてマーク
- ブックマーク
- 購読
- ミュート
- RSS フィードを購読する
- ハイライト
- 印刷
- 不適切なコンテンツを報告
Hi Karlo1,
Thanks for reaching out.
OpenVINO offers two types of quantization: Post-training Quantization with POT, and Post-training Quantization with NNCF (new). Post-training Quantization with NNCF supports a PTQ API, while POT supports the uniform integer quantization method. Models from any framework can be quantized with POT as long as they are in the OpenVINO Intermediate Representation (IR) format. To get there, you need to convert your model to IR format using Model Optimizer. You may refer to these examples to utilize POT.
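For illustration, a minimal sketch of the ONNX-to-IR conversion step, assuming the openvino.tools.mo.convert_model Python API available in recent OpenVINO releases; the model and output paths are placeholders:

```python
# A minimal sketch: convert an FP32 ONNX model to OpenVINO IR in memory
# and write it to disk. Assumes OpenVINO dev tools are installed;
# "model.onnx" and the output paths are placeholder names.
from openvino.tools.mo import convert_model
from openvino.runtime import serialize

ov_model = convert_model("model.onnx")         # returns an ov.Model
serialize(ov_model, "model.xml", "model.bin")  # write the IR to disk
```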
Regards,
Aznie
Hi Aznie!
Thank you for your answer, but I don't think it addresses my question.
I am aware of the types of quantization OpenVINO offers. If I am not mistaken, it offers QAT as well.
As I said, I don't want my quantization parameters to be obtained from OpenVINO PTQ or QAT.
I already have a set of quantization parameters (scales and zero points for weights and activations) that I want to use, and I was wondering whether it is possible to tell OpenVINO to quantize the model with these custom parameters.
The ideal solution to my problem is a command line option that allows me to specify the quantization parameters in some predefined format. One example would be to use the AIMET specification.
I am hoping to use one of the following workflows:
- Start with the ONNX model in fp32.
- Use Model Optimizer to get an OpenVINO IR in int8 using my quantization parameters.
or
- Start with the ONNX model in fp32.
- Use Model Optimizer to get an OpenVINO IR in fp32.
- Use some other tool (perhaps POT) to get an OpenVINO IR in int8 using my quantization parameters.
I am also open to other suggestions that involve using predefined quantization parameters.
Thank you for your help!
Hi Karlo1,
Thank you for your patience.
For your information, OpenVINO doesn't support importing custom quantization parameters in either Model Optimizer or POT.
We would recommend two possible workarounds:
First:
- Convert from ONNX to IR
- Use POT or NNCF to quantize the model
- Write a script that reads the IR, iterates over the FakeQuantize nodes, and changes their parameters (a minimal sketch follows below)
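A minimal sketch of that last step, assuming the openvino.runtime Python API; the IR path, the scale/zero-point values, and the exact Constant-rewiring calls are placeholder assumptions to verify against your OpenVINO version:

```python
# A minimal sketch, not a tested implementation: iterate over the
# FakeQuantize nodes of a quantized IR and rewire their range inputs to
# Constants built from your own scales/zero points.
import numpy as np
from openvino.runtime import Core, serialize
from openvino.runtime import opset8 as ops

core = Core()
model = core.read_model("model_quantized.xml")  # placeholder IR path

for node in model.get_ops():
    if node.get_type_name() != "FakeQuantize":
        continue
    # FakeQuantize inputs: 0 = data, 1 = input_low, 2 = input_high,
    # 3 = output_low, 4 = output_high.
    # For unsigned 8-bit quantization, ranges follow from a scale s and
    # zero point z as low = s * (0 - z), high = s * (255 - z).
    s, z = 0.02, 128  # placeholder parameters; take these per-node from your framework
    low = np.array([s * (0 - z)], dtype=np.float32)
    high = np.array([s * (255 - z)], dtype=np.float32)
    for idx, values in ((1, low), (2, high), (3, low), (4, high)):
        const = ops.constant(values)
        # Reconnect this FakeQuantize input to the new Constant.
        node.input(idx).replace_source_output(const.output(0))

serialize(model, "model_custom.xml", "model_custom.bin")
```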
Second:
- Use NNCF to quantize the ONNX model
- Write a script that reads the quantized ONNX model and changes the parameters of the Q/DQ ops (see the sketch after this list)
- Convert the model to IR using the MO CLI or the convert_model API
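And a minimal sketch of the Q/DQ editing step, assuming the onnx Python package and a model quantized in QDQ mode (QuantizeLinear/DequantizeLinear pairs); paths and values are placeholders:

```python
# A minimal sketch, not a tested implementation: overwrite the scale and
# zero-point initializers feeding QuantizeLinear/DequantizeLinear nodes
# in a QDQ-quantized ONNX model. Paths and values are placeholders.
import numpy as np
import onnx
from onnx import numpy_helper

model = onnx.load("model_quantized.onnx")
inits = {init.name: init for init in model.graph.initializer}

for node in model.graph.node:
    if node.op_type not in ("QuantizeLinear", "DequantizeLinear"):
        continue
    scale_name = node.input[1]
    if scale_name in inits:
        new_scale = np.array(0.02, dtype=np.float32)  # placeholder scale
        inits[scale_name].CopyFrom(numpy_helper.from_array(new_scale, scale_name))
    if len(node.input) > 2 and node.input[2] in inits:  # zero point is optional
        zp_name = node.input[2]
        old_zp = numpy_helper.to_array(inits[zp_name])
        new_zp = np.zeros_like(old_zp)  # placeholder zero point, same dtype/shape
        inits[zp_name].CopyFrom(numpy_helper.from_array(new_zp, zp_name))

onnx.save(model, "model_custom.onnx")
```

The edited model can then be converted to IR with the MO CLI or convert_model, as sketched earlier in the thread.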
Regards,
Aznie
Hi Karlo1,
This thread will no longer be monitored since we have provided a solution. If you need any additional information from Intel, please submit a new question.
Regards,
Aznie