Thanks for reaching out to us. May I know which device plugin you are using when you run inference with both model formats? For your information, the Choose FP16, FP32 or int8 for Deep Learning Models article explores these numeric representations in more detail and answers questions such as which precisions are compatible with different hardware.
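For reference, the device plugin is selected when compiling the model. Below is a minimal sketch using the OpenVINO Python API (2023+); the model path and the "CPU" device name are placeholders for your own setup:

```python
import openvino as ov

core = ov.Core()
print(core.available_devices)  # list the device plugins available on this machine

# Read the IR model (placeholder path) and compile it for a specific device plugin
model = core.read_model("model.xml")
compiled_model = core.compile_model(model, device_name="CPU")  # e.g. "CPU", "GPU"
```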
On another note, may I know which method you are using to convert your model from FP32 to INT8? Could you please share your conversion steps in detail?
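For reference, one common way to convert an FP32 OpenVINO IR model to INT8 is post-training quantization with NNCF. Below is a minimal sketch, assuming NNCF (`pip install nncf`) and OpenVINO 2023+ are installed; the model paths, the dummy calibration data, and `transform_fn` are placeholders you would replace with your own model and a representative dataset:

```python
import numpy as np
import nncf
import openvino as ov

core = ov.Core()
model = core.read_model("model_fp32.xml")  # placeholder path to the FP32 IR

# Placeholder calibration samples; in practice, use a representative dataset
data_source = [np.zeros((1, 3, 224, 224), dtype=np.float32)]

# transform_fn maps each data item to the model's expected input format
def transform_fn(data_item):
    return data_item  # placeholder: adapt to your model's input layout

calibration_dataset = nncf.Dataset(data_source, transform_fn)

# Run default post-training INT8 quantization and save the result
quantized_model = nncf.quantize(model, calibration_dataset)
ov.save_model(quantized_model, "model_int8.xml")
```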
We noticed that you posted a similar thread here:
Please note that we will continue our conversation in the thread above.