Hi, I tried compressing a model using POT with default quantization and got the error above. I understand that INT8 is not yet supported by VPU devices. I have also tried other methods such as converting the model to FP16 and reducing the input shape.
Is there any other alternative to compress the model?
Regards,
Simardeep Singh Sethi
Hi Simardeep,
Thanks for reaching out.
Firstly, the "unsupported layer type "FakeQuantize" is obviously because VPU plugin doesn't support INT8 model format. The relevant information is available in Supported Model Format documentation.
Meanwhile, in OpenVINO there are two ways to enhance performance:
- During development: Post-training Optimization Tool (POT), Neural Network Compression Framework (NNCF), Model Optimizer
- During deployment: tuning inference parameters and optimizing model execution (see the sketch after this list)
It's also possible to combine both approaches.
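As a rough illustration of deployment-time tuning, here is a minimal sketch assuming the openvino.runtime Python API and a hypothetical IR path; whether a particular property such as PERFORMANCE_HINT is honoured depends on the plugin, so please check the documentation for your target device:

```python
from openvino.runtime import Core

core = Core()
model = core.read_model("model_fp16.xml")  # hypothetical IR path

# Ask the plugin to configure itself for a throughput-oriented workload;
# asynchronous infer requests can then be used to keep the device busy.
compiled = core.compile_model(model, "MYRIAD", {"PERFORMANCE_HINT": "THROUGHPUT"})
request = compiled.create_infer_request()
```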
Since you have already tried POT, an alternative is to retrain the model with NNCF. Refer to Introducing a Training Add-on for OpenVINO toolkit: Neural Network Compression Framework for the steps to apply the NNCF optimization methods, either using the supported training samples or through integration into your custom training code.
The training samples are available at this GitHub repository.
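For reference, the typical NNCF integration pattern for a PyTorch training script looks roughly like the sketch below. This is an assumption of the usual workflow, not code from the linked article or samples; names such as MyModel and train_loader are hypothetical, and the exact config keys should be checked against the NNCF documentation:

```python
from nncf import NNCFConfig
from nncf.torch import create_compressed_model, register_default_init_args

# The "compression" section selects the algorithm (e.g. quantization, sparsity,
# filter pruning); adjust input_info to the model's real input shape.
nncf_config = NNCFConfig.from_dict({
    "input_info": {"sample_size": [1, 3, 224, 224]},
    "compression": {"algorithm": "quantization"},
})
nncf_config = register_default_init_args(nncf_config, train_loader)  # train_loader: your DataLoader

model = MyModel()  # hypothetical torch.nn.Module
compression_ctrl, compressed_model = create_compressed_model(model, nncf_config)

# ...fine-tune compressed_model with the existing training loop...

# Export the compressed model so it can be converted to OpenVINO IR.
compression_ctrl.export_model("compressed_model.onnx")
```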
Regards,
Aznie
Hi Simardeep,
This thread will no longer be monitored since we have provided the information. If you need any additional information from Intel, please submit a new question.
Regards,
Aznie
