Intel® Distribution of OpenVINO™ Toolkit
Community assistance about the Intel® Distribution of OpenVINO™ toolkit, OpenCV, and all aspects of computer vision on Intel® platforms.

Does the GPU support INT8 inference?

rongrong__wang
Beginner

I see this document 

https://docs.openvinotoolkit.org/latest/_inference_engine_tools_calibration_tool_README.html

I used Simplified Mode to convert my own FP32 IR model to INT8, generating an INT8 IR model for each target device (CPU and GPU). When I run inference with the CPU INT8 IR model on the CPU, the inference time decreases. However, when I run inference with the GPU INT8 IR model on the GPU, the inference time does not change.

I see in https://docs.openvinotoolkit.org/latest/_docs_IE_DG_Int8Inference.html that the GPU does not support INT8 IR models. So, does the GPU really support INT8 inference?

In addition, since I used Simplified Mode to generate the INT8 IR model: does Simplified Mode only affect inference accuracy? Does the inference time of an IR model generated in Simplified Mode differ greatly from that of an IR model generated by steps 1-4?
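To check whether INT8 actually helps on a given device, the usual approach is to time warmed-up inference runs for each model/device pair. Below is a minimal pure-Python timing harness as a sketch; `infer_fn` stands in for a real call such as `exec_net.infer(...)` from the Inference Engine Python API, and the `dummy` workload here is purely hypothetical:

```python
import time

def time_inference(infer_fn, inputs, warmup=5, iters=50):
    """Return the average latency of infer_fn in milliseconds.

    Warm-up runs are discarded so one-time costs (kernel compilation,
    caching) do not skew the measurement -- this matters especially on
    GPU, where the first inference can be much slower than the rest.
    """
    for _ in range(warmup):
        infer_fn(inputs)
    start = time.perf_counter()
    for _ in range(iters):
        infer_fn(inputs)
    return (time.perf_counter() - start) / iters * 1000.0

# Stand-in workload; in practice infer_fn would wrap the FP32 or INT8
# ExecutableNetwork loaded on the device under test.
dummy = lambda x: [v * 2 for v in x]
fp32_ms = time_inference(dummy, [1.0, 2.0, 3.0])
```

The toolkit also ships a benchmark_app sample that performs this kind of measurement for you.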

Thanks.

4 Replies
HemanthKum_G_Intel

Hi Wang,

Low-Precision 8-bit Integer Inference is a "preview feature" and optimized for CPU.

rongrong__wang
Beginner

Hemanth Kumar G. (Intel) wrote:

Hi Wang,

Low-Precision 8-bit Integer Inference is a "preview feature" and optimized for CPU.

Thank you.

Does the INT8 IR model generated by Simplified Mode only affect inference accuracy, without affecting inference time?

Shubha_R_Intel
Employee

Dear rongrong, wang,

The calibration tools allow conversion to INT8 using a loss of accuracy which you can live with. It's really up to you, though of course there are recommended guidelines. The idea behind INT8 is that the model may detect perfectly well even with this loss of accuracy. And yes, INT8 is supposed to improve performance. There is no reason to run an FP32 model if INT8 does the job, for INT8 will likely run faster. Keep in mind though that INT8 is still somewhat restrictive - not all layers can be converted to INT8. The INT8 reference documentation provides detailed info.
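The accuracy/speed trade-off described above comes from mapping floating-point weights onto only 256 integer levels. A toy sketch of the quantize/dequantize round trip follows (illustrative only; the symmetric scheme and the `quantize`/`dequantize` names are assumptions, not the toolkit's actual calibration algorithm):

```python
def quantize(values, scale):
    """Symmetric per-tensor quantization to signed 8-bit integers."""
    return [max(-128, min(127, round(v / scale))) for v in values]

def dequantize(qvalues, scale):
    """Map the 8-bit codes back to floating point."""
    return [q * scale for q in qvalues]

weights = [0.42, -1.337, 0.0071, 2.5]
scale = max(abs(w) for w in weights) / 127   # calibration chooses this range
q = quantize(weights, scale)
restored = dequantize(q, scale)
errors = [abs(a - b) for a, b in zip(weights, restored)]
# Each restored weight is off by at most scale/2 -- the bounded
# "loss of accuracy which you can live with".
```

The smaller integer representation is what enables faster arithmetic, provided the target device's plugin actually has INT8 kernels for the layers involved.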

Thanks,

Shubha

rongrong__wang
Beginner

Shubha R. (Intel) wrote:

Dear rongrong, wang,

The calibration tools allow conversion to INT8 using a loss of accuracy which you can live with. It's really up to you, though of course there are recommended guidelines. The idea behind INT8 is that the model may detect perfectly well even with this loss of accuracy. And yes, INT8 is supposed to improve performance. There is no reason to run an FP32 model if INT8 does the job, for INT8 will likely run faster. Keep in mind though that INT8 is still somewhat restrictive - not all layers can be converted to INT8. The INT8 reference documentation provides detailed info.

Thanks,

Shubha

Thank you very much! I understand.
