I have a quantized ONNX model (quantized with the OpenVINO framework).
In this case, should data_type be INT8?
But I cannot find it among the --data_type options, though there is "half" in the choices according to --help...
Please tell me which one is the proper one.
By the way, the normal usage is FP32. If I set FP16 for an FP32 model, what happens?
Hi timosy,
For your information, you can load an ONNX model directly into the OpenVINO™ toolkit Inference Engine without converting it into Intermediate Representation (IR).
If you want to convert the INT8 ONNX model into IR, just convert without specifying the data_type. An INT8 ONNX model differs from an FP32 ONNX model only by the additional nodes specifying quantization in the model, so no additional Model Optimizer parameters are required to handle such models. The INT8 IR will be produced automatically if you supply an INT8 ONNX model as input.
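To make that concrete, here is a minimal sketch of both routes (the file name quantized_model.onnx is a placeholder for your own model):

```sh
# Option 1: skip Model Optimizer and load the quantized ONNX directly,
# e.g. with the benchmark_app tool that ships with OpenVINO:
benchmark_app -m quantized_model.onnx -d CPU

# Option 2: produce an IR without specifying --data_type; the quantization
# nodes already present in the ONNX model yield an INT8 IR automatically:
mo --input_model quantized_model.onnx
```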
Regards,
Peh
Hi timosy,
Model Optimizer can convert all floating-point weights to FP16 data type with the option --data_type FP16.
The resulting FP16 model will occupy about half the space in the file system, but it may show some accuracy drop, although for the majority of models the degradation is negligible.
If the original model is FP16, it will have FP16 precision in the IR as well. Using --data_type FP32 will have no effect and will not force FP32 precision in the model.
For the model's data type to be INT8, you have to convert the FP32 or FP16 model into INT8 using the OpenVINO Post-training Optimization Tool (POT).
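For example (file names are placeholders, and the POT invocation assumes you have prepared a quantization configuration file):

```sh
# FP32 ONNX -> FP16 IR: weights and biases are compressed to FP16,
# roughly halving the file size:
mo --input_model model_fp32.onnx --data_type FP16

# INT8 cannot be requested through --data_type; quantize first with POT,
# and the resulting IR is already INT8:
pot -c quantization_config.json
```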
Regards,
Peh
Thanks for the explanation of the usage of --data_type.
>>For the data type of the model to be INT8, you have to ...
Yes, I quantized my model using the NNCF framework, and I now have a quantized .onnx file.
Then I have to convert it to an IR model using "mo" so that I can use it in the OpenVINO Inference Engine.
When I convert it, which value should I use for --data_type? --help tells me:
```
--data_type {FP16,FP32,half,float}
        Data type for all intermediate tensors and weights. If original model
        is in FP32 and --data_type=FP16 is specified, all model weights and
        biases are compressed to FP16.
```
I cannot see an INT8 option here...
Hi timosy,
For your information, you can load an ONNX model directly into the OpenVINO™ toolkit Inference Engine without converting it into Intermediate Representation (IR).
If you want to convert the INT8 ONNX model into IR, just convert without specifying the data_type. An INT8 ONNX model differs from an FP32 ONNX model only by the additional nodes specifying quantization in the model, so no additional Model Optimizer parameters are required to handle such models. The INT8 IR will be produced automatically if you supply an INT8 ONNX model as input.
Regards,
Peh
>>If you want to convert the INT8 ONNX model into IR, just convert without specifying the data_type.
Oh, I see. I was converting the INT8 model with the data_type option "FP16", which might cause some problems...
I'll test a few configurations to see what happens to accuracy and inference speed, and close this post after I understand the behaviour (commands sketched below):
- FP32 model w/ data_type FP32
- FP32 model w/ data_type FP16
- INT8 model w/ data_type FP16
- INT8 model w/ data_type not specified
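Roughly these commands (file names are placeholders for my models):

```sh
mo --input_model model_fp32.onnx --data_type FP32
mo --input_model model_fp32.onnx --data_type FP16
mo --input_model model_int8.onnx --data_type FP16   # what I had been doing
mo --input_model model_int8.onnx                    # no --data_type, as suggested
```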
Hi timosy,
This thread will no longer be monitored since this issue has been resolved. If you need any additional information from Intel, please submit a new question.
Regards,
Peh
