Intel® oneAPI AI Analytics Toolkit
Find answers to your toolkit installation, configuration, and get-started questions.

Failed to convert Neural Compressor quantized INT8 TF model to onnx

Anand_Viswanath
Employee

Hi,

I have quantized a frozen FP32 model to an INT8 model using Intel Neural Compressor. I am trying to convert these models to ONNX. I am able to convert the FP32 model to ONNX using tf2onnx.convert, but the conversion fails for the quantized INT8 model. Any help would be much appreciated. Thank you.

 

Error Trace:

2021-09-26 10:46:32,113 - INFO - Using tensorflow=2.7.0, onnx=1.10.2, tf2onnx=1.9.3/1190aa
2021-09-26 10:46:32,114 - INFO - Using opset <onnx, 9>
Traceback (most recent call last):
File "/root/anaconda3/lib/python3.8/runpy.py", line 194, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/root/anaconda3/lib/python3.8/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/root/anaconda3/lib/python3.8/site-packages/tf2onnx/convert.py", line 633, in <module>
main()
File "/root/anaconda3/lib/python3.8/site-packages/tf2onnx/convert.py", line 264, in main
model_proto, _ = _convert_common(
File "/root/anaconda3/lib/python3.8/site-packages/tf2onnx/convert.py", line 162, in _convert_common
g = process_tf_graph(tf_graph, const_node_values=const_node_values,
File "/root/anaconda3/lib/python3.8/site-packages/tf2onnx/tfonnx.py", line 433, in process_tf_graph
main_g, subgraphs = graphs_from_tf(tf_graph, input_names, output_names, shape_override, const_node_values,
File "/root/anaconda3/lib/python3.8/site-packages/tf2onnx/tfonnx.py", line 448, in graphs_from_tf
ordered_func = resolve_functions(tf_graph)
File "/root/anaconda3/lib/python3.8/site-packages/tf2onnx/tf_loader.py", line 759, in resolve_functions
_, _, _, _, _, functions = tflist_to_onnx(tf_graph, {})
File "/root/anaconda3/lib/python3.8/site-packages/tf2onnx/tf_utils.py", line 416, in tflist_to_onnx
dtypes[out.name] = map_tf_dtype(out.dtype)
File "/root/anaconda3/lib/python3.8/site-packages/tf2onnx/tf_utils.py", line 112, in map_tf_dtype
dtype = TF_TO_ONNX_DTYPE[dtype]
KeyError: tf.qint8
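The KeyError at the bottom of the trace points at the root cause: tf2onnx looks each tensor dtype up in a fixed TF-to-ONNX mapping, and the quantized dtype tf.qint8 has no entry there. A minimal stand-in (the dictionary contents below are illustrative, not tf2onnx's actual table) reproduces the failure mode:

```python
# Simplified stand-in for tf2onnx's map_tf_dtype; the real table lives in
# tf2onnx/tf_utils.py. The entries here are illustrative only.
TF_TO_ONNX_DTYPE = {
    "tf.float32": "FLOAT",
    "tf.int32": "INT32",
    # no entry for quantized dtypes such as tf.qint8
}

def map_tf_dtype(dtype):
    # Raises KeyError for any dtype missing from the table, which is
    # exactly what the trace above shows for tf.qint8.
    return TF_TO_ONNX_DTYPE[dtype]

try:
    map_tf_dtype("tf.qint8")
except KeyError as err:
    print("unsupported dtype:", err)  # prints: unsupported dtype: 'tf.qint8'
```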

4 Replies
Rahila_T_Intel
Moderator

Hi,

 

Thank you for posting in Intel Communities.

 

TF2ONNX was built to translate TensorFlow models to ONNX. To convert a TensorFlow model (frozen graph *.pb, SavedModel, etc.) to ONNX, you can try tf2onnx.
However, TF2ONNX does not support quantized TensorFlow models; this limitation is discussed in the GitHub issue below.
https://github.com/onnx/tensorflow-onnx/issues/686
As a workaround, you can convert your FP32 TensorFlow model to ONNX first and then quantize the ONNX model to INT8.
Alternatively, you can use TFLite2ONNX, which was created to convert TFLite models to ONNX. As of v0.3, TFLite2ONNX supports TensorFlow 2.0 and quantization conversion.

 

To install via pip: pip install tflite2onnx.
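If you take the TFLite2ONNX route, the conversion itself is a single call. A minimal sketch (the file names are placeholders, not from your model):

```python
# Hedged sketch of the TFLite2ONNX path suggested above; file names are placeholders.
def convert_tflite_to_onnx(tflite_path: str, onnx_path: str) -> None:
    # Imported inside the function so the sketch loads even where
    # tflite2onnx is not installed.
    import tflite2onnx
    tflite2onnx.convert(tflite_path, onnx_path)
```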

 


We hope this clarifies your doubts!

 


Thanks

Anand_Viswanath
Employee

Hi,

 

Thank you for your response. I was able to convert the TF FP32 model to ONNX using tf2onnx and then quantize the model using ONNX quantization and Intel Neural Compressor.
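The two-step workflow described above can be sketched as follows. This is a hedged outline, not the exact commands used here: the file names, tensor names, and the choice of dynamic quantization are all assumptions.

```python
# Sketch of the workaround described above: convert the FP32 TensorFlow model
# to ONNX with tf2onnx, then quantize the ONNX model to INT8 with ONNX Runtime.
# File names and tensor names are placeholders.
import subprocess

def tf_fp32_to_onnx(frozen_pb: str, onnx_path: str) -> None:
    # Step 1: FP32 TensorFlow frozen graph -> FP32 ONNX via the tf2onnx CLI.
    subprocess.run(
        ["python", "-m", "tf2onnx.convert",
         "--graphdef", frozen_pb,
         "--inputs", "input:0", "--outputs", "output:0",
         "--output", onnx_path],
        check=True,
    )

def onnx_fp32_to_int8(fp32_onnx: str, int8_onnx: str) -> None:
    # Step 2: FP32 ONNX -> INT8 ONNX (dynamic quantization shown here).
    # Imported inside the function so the sketch loads without onnxruntime.
    from onnxruntime.quantization import quantize_dynamic, QuantType
    quantize_dynamic(fp32_onnx, int8_onnx, weight_type=QuantType.QInt8)
```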

 

Regards,

Anand 

Abhishek81
Black Belt

Can you share more in-depth details on how the issue was resolved? It might be useful for me too.

Rahila_T_Intel
Moderator

Hi,


Glad to know that your issue is resolved. If you need any additional information, please post a new question, as this thread will no longer be monitored by Intel.


Thanks

Rahila T

