Question

I have a trained ONNX model that needs to be quantized to INT8, but I want the last fully connected layers to stay in FP32 or FP16. How can I choose specific layers to quantize (or not to quantize)?

PS: When I was working with NNCF, I just used the ignored_scopes parameter. Is there something similar in Workbench?

score 0 · Answer 1


Hi Anvarganiev07,

Thank you for reaching out to us.

According to the optional parameters of DefaultQuantization, you can use the "ignored" parameter to exclude specific nodes or operation types from optimization. For example, you may refer to the following links:
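As a rough illustration of the "ignored" parameter, here is a minimal sketch of how the DefaultQuantization section of a Post-training Optimization Tool configuration might look when built from Python. It assumes the "ignored" block takes a "scope" list of node names, as described in the POT documentation; the node names below are hypothetical placeholders and need to be replaced with the actual names of the fully connected (MatMul) nodes in your converted model. The other values (target device, preset, subset size) are only example settings, and whether Workbench itself exposes this option in its interface is not confirmed here.

```python
import json

# Hypothetical node names, for illustration only. Replace them with the
# real names of the last fully connected (MatMul) nodes in your model's IR.
IGNORED_NODES = ["fc_penultimate/MatMul", "fc_final/MatMul"]


def build_algorithms(ignored_nodes, stat_subset_size=300):
    """Build the "algorithms" section of a POT config as a plain Python list.

    Nodes listed under "ignored" -> "scope" are excluded from quantization
    and keep their original floating-point precision.
    """
    return [
        {
            "name": "DefaultQuantization",
            "params": {
                "target_device": "CPU",      # example value
                "preset": "performance",     # example value
                "stat_subset_size": stat_subset_size,
                "ignored": {
                    # Node names to leave unquantized
                    "scope": list(ignored_nodes),
                },
            },
        }
    ]


algorithms = build_algorithms(IGNORED_NODES)

# Print the section as JSON so it can be pasted into a POT config file.
print(json.dumps(algorithms, indent=2))
```

The printed JSON is meant to go into the "compression" -> "algorithms" part of a POT configuration file. This plays a role similar to ignored_scopes in NNCF.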

Hope it helps.

Regards,

Wan


score 0 · Answer 2


Hi Anvarganiev07,

Thanks for your question.

This thread will no longer be monitored since we have provided a suggestion.

If you need any additional information from Intel, please submit a new question.

Regards,

Wan


How to quantize specific layers at OpenVINO Workbench?