Intel® Distribution of OpenVINO™ Toolkit
Community assistance about the Intel® Distribution of OpenVINO™ toolkit, OpenCV, and all aspects of computer vision on Intel® platforms.

Option of mo.py "--data_type FP16"

timosy
New Contributor I

I have a quantized ONNX model (quantized with the OpenVINO framework).

 

In this case, should the data_type be INT8?

But I cannot find it among the --data_type options, though there is "half" in the choices according to --help...

Please tell me which one is the proper one.

 

By the way, the normal usage is FP32. If I set FP16 for an FP32 model, what happens?

5 Replies
Peh_Intel
Moderator

Hi timosy,

 

Model Optimizer can convert all floating-point weights to FP16 data type with the option --data_type FP16.
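For example, assuming an FP32 ONNX model file named model_fp32.onnx (the file name and output directory below are just placeholders), the conversion would look roughly like this:

     mo --input_model model_fp32.onnx --data_type FP16 --output_dir ir_fp16

(mo is the command-line entry point installed with the openvino-dev package and behaves the same as calling mo.py directly.)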

 

The resulting FP16 model will occupy about half the space in the file system, but it may have some accuracy drop, although for the majority of models the accuracy degradation is negligible.

 

If the original model was already FP16, it will have FP16 precision in the IR as well. Using --data_type FP32 will have no effect and will not force FP32 precision in the model.

 

For the data type of the model to be INT8, you have to convert the FP32 or FP16 model into INT8 by using the OpenVINO Post-training Optimization Tool (POT).
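For reference, a minimal sketch of a POT run in Simplified mode, which only needs a folder of sample images for calibration. The paths below are placeholders, and the exact flags can differ between POT releases, so please verify them with pot -h:

     pot -q default -m model_fp32.xml -w model_fp32.bin --engine simplified --data-source ./calibration_images --output-dir ./model_int8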

 

 

Regards,

Peh

 

timosy
New Contributor I

Thanks for the explanation of the usage of --data_type.

 

>>For the data type of the model to be INT8, you have to ...

Yes, I converted my model using the NNCF framework, and I have a .onnx file, which is quantized.

Then I have to convert it to an IR model using "mo" so that I can use it in the OpenVINO Inference Engine.

When I convert it, which one do I have to use for --data_type, since --help tells me:

 

--data_type {FP16,FP32,half,float}
     Data type for all intermediate tensors and weights. If original model is in FP32
     and --data_type=FP16 is specified, all model weights and biases are compressed to
     FP16.

I cannot see an INT8 option here...

Peh_Intel
Moderator

Hi timosy,

 

For your information, you can directly load an ONNX model into the OpenVINO™ toolkit Inference Engine without converting it into Intermediate Representation (IR).

 

If you want to convert the INT8 ONNX model into IR, just convert without specifying the data_type. The INT8 ONNX model differs from an FP32 ONNX model by the additional nodes specifying quantization in the model. Hence, no additional Model Optimizer parameters are required to handle such models. The INT8 IR will be produced automatically if you supply an INT8 ONNX model as input.
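For instance, the two routes would look roughly like the sketch below. The file names, output directory and device are placeholders, and the Python snippet assumes a 2022.x release where the openvino.runtime API is available:

     mo --input_model model_int8.onnx --output_dir ir_int8

or, loading the quantized ONNX model directly without conversion:

     from openvino.runtime import Core

     core = Core()
     model = core.read_model("model_int8.onnx")         # reads the quantized ONNX as-is
     compiled_model = core.compile_model(model, "CPU")  # quantized operations run in low precision where supported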

 

 

Regards,

Peh

 

timosy
New Contributor I

>>If you want to convert the INT8 ONNX model into IR, just convert without specifying the data_type.

Oh, I see. I was converting the INT8 model with the data_type option FP16, which might lead to some problems...

 

I'll test some configurations to see what happens to accuracy and inference speed (roughly via the commands sketched below), and close this post after I understand the behaviour:

FP32 model w/ data_type FP32
FP32 model w/ data_type FP16
INT8 model w/ data_type FP16
INT8 model w/ data_type not specified
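Roughly, as commands (file names and output directories are placeholders):

     mo --input_model model_fp32.onnx --data_type FP32 --output_dir ir_fp32
     mo --input_model model_fp32.onnx --data_type FP16 --output_dir ir_fp32_as_fp16
     mo --input_model model_int8.onnx --data_type FP16 --output_dir ir_int8_fp16
     mo --input_model model_int8.onnx --output_dir ir_int8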

 

 

Peh_Intel
Moderator

Hi timosy,


This thread will no longer be monitored since this issue has been resolved. If you need any additional information from Intel, please submit a new question. 



Regards,

Peh

