I have a quantized ONNX model (quantized with the OpenVINO framework).
In this case, should data_type be INT8?
But I cannot find it among the --data_type options, though there is "half" in the choices according to --help...
Please tell me which one is the proper one.
By the way, the normal usage is FP32. If I set FP16 for an FP32 model, what happens?
Hi timosy,
For your information, you can load an ONNX model directly into the OpenVINO™ toolkit Inference Engine without converting it into Intermediate Representation (IR).
If you want to convert the INT8 ONNX model into IR, just convert without specifying the data_type. An INT8 ONNX model differs from an FP32 ONNX model only by the additional nodes specifying quantization in the model, so no additional Model Optimizer parameters are required to handle such models. The INT8 IR will be produced automatically if you supply an INT8 ONNX model as input.
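To make that concrete, here is a minimal sketch of both routes (the file name quantized_model.onnx is a placeholder for your own model):

```sh
# Option 1: skip Model Optimizer and load the quantized ONNX directly,
# e.g. with the benchmark_app tool that ships with OpenVINO:
benchmark_app -m quantized_model.onnx -d CPU

# Option 2: produce an IR without specifying --data_type; the quantization
# nodes already present in the ONNX model yield an INT8 IR automatically:
mo --input_model quantized_model.onnx
```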
Regards,
Peh
Hi timosy,
Model Optimizer can convert all floating-point weights to FP16 data type with the option --data_type FP16.
The resulting FP16 model will occupy about half the space in the file system, but it may show some accuracy drop, although for the majority of models the degradation is negligible.
If the original model is FP16, it will have FP16 precision in the IR as well. Using --data_type FP32 will have no effect and will not force FP32 precision in the model.
For the model's data type to be INT8, you have to convert the FP32 or FP16 model into INT8 using the OpenVINO Post-training Optimization Tool (POT).
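For example (file names are placeholders, and the POT invocation assumes you have prepared a quantization configuration file):

```sh
# FP32 ONNX -> FP16 IR: weights and biases are compressed to FP16,
# roughly halving the file size:
mo --input_model model_fp32.onnx --data_type FP16

# INT8 cannot be requested through --data_type; quantize first with POT,
# and the resulting IR is already INT8:
pot -c quantization_config.json
```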
Regards,
Peh
Thanks for the explanation of the usage of --data_type.
>>For the data type of the model to be INT8, you have to ...
Yes, I quantized my model using the NNCF framework, and I now have a quantized .onnx file.
Then I have to convert it to an IR model using "mo" so that I can use it in the OpenVINO Inference Engine.
When I convert it, which value should I use for --data_type? --help tells me:
```
--data_type {FP16,FP32,half,float}
        Data type for all intermediate tensors and weights. If original model
        is in FP32 and --data_type=FP16 is specified, all model weights and
        biases are compressed to FP16.
```
I cannot see an INT8 option here...
Hi timosy,
For your information, you can load an ONNX model directly into the OpenVINO™ toolkit Inference Engine without converting it into Intermediate Representation (IR).
If you want to convert the INT8 ONNX model into IR, just convert without specifying the data_type. An INT8 ONNX model differs from an FP32 ONNX model only by the additional nodes specifying quantization in the model, so no additional Model Optimizer parameters are required to handle such models. The INT8 IR will be produced automatically if you supply an INT8 ONNX model as input.
Regards,
Peh
>>If you want to convert the INT8 ONNX model into IR, just convert without specifying the data_type.
Oh, I see. I was converting the INT8 model with the data_type option "FP16", which might cause some problems...
I'll test a few configurations to see what happens to accuracy and inference speed, and close this post after I understand the behaviour (commands sketched below):
- FP32 model w/ data_type FP32
- FP32 model w/ data_type FP16
- INT8 model w/ data_type FP16
- INT8 model w/ data_type not specified
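Roughly these commands (file names are placeholders for my models):

```sh
mo --input_model model_fp32.onnx --data_type FP32
mo --input_model model_fp32.onnx --data_type FP16
mo --input_model model_int8.onnx --data_type FP16   # what I had been doing
mo --input_model model_int8.onnx                    # no --data_type, as suggested
```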
Hi timosy,
This thread will no longer be monitored since this issue has been resolved. If you need any additional information from Intel, please submit a new question.
Regards,
Peh
