OpenVINO: model performance drops after converting from FP32 to INT8

yujf · ‎08-12-2022

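According to the performance comparison published on the official website, converting a model from FP32 to INT8 should make it about twice as fast. On my model, however, inference became roughly twice as slow after the conversion: the FP32 model took 25 ms per image, while the INT8 model takes 55 ms per image.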

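From what I have read, converting from FP32 to INT8 adds some extra layers to the model. If convolution does not account for most of the inference time, the INT8 model can end up slower rather than faster. How can I solve this problem? Or is my model simply not suited to INT8 acceleration?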
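To make this reasoning concrete, here is a rough back-of-the-envelope sketch in Python. It is not an OpenVINO API and not a measurement. It assumes that only the convolution share of the FP32 time is accelerated and that the extra layers add a fixed overhead. The function names, the 2x convolution speedup, and the 5 ms overhead are all hypothetical values chosen for illustration; only the 25 ms FP32 latency comes from my measurement.

```python
def estimate_int8_latency_ms(fp32_total_ms, conv_fraction, conv_speedup, overhead_ms):
    """Amdahl-style estimate of INT8 latency (illustrative model only).

    Only the convolution share of the FP32 time is divided by conv_speedup;
    the remaining layers keep their FP32 cost, and the layers inserted by
    quantization contribute a fixed overhead_ms.
    """
    conv_ms = fp32_total_ms * conv_fraction
    other_ms = fp32_total_ms - conv_ms
    return conv_ms / conv_speedup + other_ms + overhead_ms


def break_even_overhead_ms(fp32_total_ms, conv_fraction, conv_speedup):
    """Overhead at which the INT8 model is exactly as fast as the FP32 model."""
    conv_ms = fp32_total_ms * conv_fraction
    return conv_ms - conv_ms / conv_speedup


if __name__ == "__main__":
    fp32_ms = 25.0        # measured FP32 latency from this post
    conv_speedup = 2.0    # assumption: convolutions run 2x faster in INT8
    overhead_ms = 5.0     # assumption: cost of the extra inserted layers
    for conv_fraction in (0.9, 0.5, 0.2):  # hypothetical convolution shares
        est = estimate_int8_latency_ms(fp32_ms, conv_fraction, conv_speedup, overhead_ms)
        limit = break_even_overhead_ms(fp32_ms, conv_fraction, conv_speedup)
        print(f"conv share {conv_fraction:.0%}: estimated INT8 {est:.2f} ms, "
              f"break-even overhead {limit:.2f} ms")
```

Under this simple model, INT8 latency equals FP32 latency minus the convolution saving plus the overhead, so reaching 55 ms from 25 ms would require at least 30 ms of overhead no matter how large the convolution share is. That seems like more than a few extra layers should cost, so per-layer timings would be needed to see where the time actually goes.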

Wan_Intel · ‎08-15-2022

Hi Yujf,

Thanks for reaching out to us. May I know which device plugin you are using when you run inference with both model formats? For your information, the article "Choose FP16, FP32 or int8 for Deep Learning Models" explores these floating point representations in more detail and answers questions such as which precisions are compatible with different hardware.

On another note, may I know which method you are using to convert your model from FP32 to INT8? Could you please share the detailed steps you followed for the conversion?

Regards,

Wan

Wan_Intel · ‎08-22-2022

Hi Yujf,

We noticed that you posted a similar thread here:

https://community.intel.com/t5/forums/forumtopicpage/board-id/distribution-openvino-toolkit/message-id/28150

We would like to let you know that we will continue our conversation in the thread above.

Regards,

Wan