Intel® Distribution of OpenVINO™ Toolkit
Community assistance for the Intel® Distribution of OpenVINO™ toolkit, OpenCV, and all aspects of computer vision on Intel® platforms.

Using Custom Quantization Parameters

karlo1
Novice

I'm trying to use the model optimizer to convert my float32 ONNX model to OpenVINO IR. My end goal is to get an INT8 quantized model.

The problem is that I don't want to obtain the quantization parameters from the OpenVINO QAT or PTQ. I want to set my own quantization parameters which I obtained from another framework.

Is it possible to do this and how can I accomplish it?

1 Solution
IntelSupport
Community Manager

Hi Karlo1,

Thank you for your patience.

For your information, OpenVINO doesn't currently offer a way to import externally obtained quantization parameters in either Model Optimizer or POT.

We would recommend two possible workarounds:

First:
  • Convert from ONNX to IR
  • Use POT or NNCF to quantize the model
  • Write a script that reads the IR, iterates over the FakeQuantize nodes, and replaces their parameters

Second:

  • Use NNCF to quantize the ONNX model
  • Write a script that reads the quantized ONNX model and changes the parameters of the Q/DQ ops
  • Convert the model to IR using MO or the convert_model API

Regards,

Aznie


4 Replies
IntelSupport
Community Manager

Hi Karlo1,

Thanks for reaching out.

OpenVINO offers two types of post-training quantization: Post-training Quantization with POT and Post-training Quantization with NNCF (new). The NNCF workflow supports a PTQ API, while POT supports the uniform integer quantization method. POT can quantize models from any framework, as long as the model is first converted to the OpenVINO Intermediate Representation (IR) format with Model Optimizer. You may refer to these examples to utilize the POT.

Regards,

Aznie


karlo1
Novice

Hi Aznie!

Thank you for your answer, but I don't think it addresses my question.

I am aware of the types of quantization OpenVINO offers. If I am not mistaken, it offers QAT as well.

As I said, I don't want my quantization parameters to be obtained from OpenVINO PTQ or QAT.

I already have a set of quantization parameters (scales and zero points for weights and activations), and I was wondering whether it is possible to tell OpenVINO to quantize the model using these custom parameters.

The ideal solution to my problem would be a command-line option that lets me specify the quantization parameters in some predefined format. One example would be the AIMET specification.

I am hoping to use one of the following workflows:

  1. Start with the ONNX model in fp32.
  2. Use Model Optimizer to get an OpenVINO IR in int8 using my quantization parameters.

or

  1. Start with the ONNX model in fp32.
  2. Use Model Optimizer to get an OpenVINO IR in fp32.
  3. Use some other tool (perhaps POT) to get an OpenVINO IR in int8 using my quantization parameters.

I am also open to other suggestions that involve using predefined quantization parameters.

Thank you for your help!

Aznie_Intel
Moderator

Hi Karlo1,


This thread will no longer be monitored since we have provided a solution. If you need any additional information from Intel, please submit a new question.



Regards,

Aznie


