According to the performance comparison published on the official website, converting a model from FP32 to INT8 should make it roughly twice as fast. On my model, however, performance actually dropped by about half after converting to INT8: the FP32 model takes 25 ms to infer one image, while the INT8 model takes 55 ms.
After looking into it, I found that converting FP32 to INT8 inserts some extra layers into the model. If convolution does not account for most of the inference time, converting to INT8 can actually increase inference time instead of accelerating it. How can this problem be solved, or is my model simply not a good fit for INT8 acceleration?
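For anyone reproducing this comparison, a minimal latency measurement along the following lines can help rule out measurement noise. This is only a sketch, assuming the OpenVINO >= 2022.1 Python API; the file names `model_fp32.xml` and `model_int8.xml` are placeholders, not the original poster's actual files.

```python
import time
import numpy as np
from openvino.runtime import Core  # OpenVINO >= 2022.1 Python API

def time_model(xml_path, device="CPU", runs=100):
    core = Core()
    compiled = core.compile_model(core.read_model(xml_path), device)
    input_port = compiled.input(0)
    # Dummy input shaped like the model's first input; swap in a real
    # preprocessed image for a representative measurement.
    data = np.random.rand(*input_port.shape).astype(np.float32)
    request = compiled.create_infer_request()
    request.infer({input_port: data})  # warm-up run, excluded from timing
    start = time.perf_counter()
    for _ in range(runs):
        request.infer({input_port: data})
    return (time.perf_counter() - start) / runs * 1000.0  # ms per image

for path in ("model_fp32.xml", "model_int8.xml"):  # placeholder paths
    print(f"{path}: {time_model(path):.1f} ms per image")
```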
Hi Yujf,
Thanks for reaching out to us. May I know which device plugin you are using when you run inference with both model formats? For your information, the Choose FP16, FP32 or INT8 for Deep Learning Models article explores these floating-point representations in more detail and answers questions such as which precisions are compatible with different hardware.
On another note, may I know which method you used to convert your model from FP32 to INT8? Could you please share the conversion steps in detail?
Regards,
Wan
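For context on why the conversion method matters: the INT8 overhead the original poster describes depends heavily on how the quantized IR was produced. Purely as an illustration of one common post-training quantization flow, the sketch below uses NNCF's default quantization, which inserts the FakeQuantize layers mentioned in the question. All paths, the input shape, and the random calibration data are placeholder assumptions, not the poster's actual workflow.

```python
import numpy as np
import nncf                      # pip install nncf
import openvino.runtime as ov

core = ov.Core()
model = core.read_model("model_fp32.xml")  # placeholder path

# Calibration set of ~300 representative inputs. Random tensors are a
# placeholder only; real preprocessed images should be used in practice,
# and the (1, 3, 224, 224) shape is an assumption about the model input.
calib_items = [np.random.rand(1, 3, 224, 224).astype(np.float32)
               for _ in range(300)]
calib_dataset = nncf.Dataset(calib_items, lambda x: x)

# Default post-training quantization; inserts FakeQuantize layers into
# the graph, which is the extra-layer overhead discussed above.
quantized = nncf.quantize(model, calib_dataset)
ov.serialize(quantized, "model_int8.xml")
```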
Hi Yujf,
We noticed that you posted a similar thread here:
Please be informed that we will continue our conversation in the thread above.
Regards,
Wan