Intel® Distribution of OpenVINO™ Toolkit

Increased RAM usage with DefaultQuantization

jv19
Beginner

Hello,

I'm testing OpenVINO on a Mac with the ssd_inception_v2 architecture. I quantized the model using DefaultQuantization, targeting the CPU. The benchmark_app script shows a good inference speed-up (50 FPS for the base model vs. 80 FPS quantized). However, memory usage has increased from 350 MB to 485 MB. Is this expected behavior? If not, what are some potential causes of the increase? Thanks!

Munesh_Intel
Moderator

Hi Justin,

Thanks for reaching out to us.

Which OpenVINO version are you using? If possible, please share your quantized model for us to reproduce your issue.


Regards,

Munesh


jv19
Beginner

Hi Munesh,

I am using OpenVINO 2021.1.110. I have attached the quantized .xml. I can't attach the quantized .bin, however, because the forum tells me "the file type (.bin) is not supported." Here's a MediaFire link to the quantized .xml/.bin: http://www.mediafire.com/folder/064ray5ygzssi/ssd_inception_v2

Best,

Justin

Munesh_Intel
Moderator

Hi Justin,


Thanks for sharing the files. We are investigating the issue and will get back to you as soon as possible.


Regards,

Munesh


Munesh_Intel
Moderator

Hi Justin,

We were able to replicate your issue and also observed higher memory consumption. Ideally, this is not expected behavior. However, we don't have any memory consumption targets for quantized models. Our target parameters for quantization are accuracy, throughput (FPS), and latency.
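
For anyone who wants to compare memory consumption of the FP32 and quantized models themselves, here is a minimal sketch. It assumes a Unix-like system (macOS or Linux) and measures the peak resident set size of whatever command you pass in. The helper name `peak_rss_mb` and the script name `peak_rss.py` are made up for illustration and are not part of OpenVINO.

```python
import resource
import subprocess
import sys


def peak_rss_mb(cmd):
    """Run cmd to completion and return the peak resident set size (MB) of child processes.

    RUSAGE_CHILDREN reports the largest child waited for so far, so call this
    once per script invocation when comparing two different commands.
    """
    subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    peak = resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss
    # ru_maxrss is reported in bytes on macOS and in kilobytes on Linux.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python peak_rss.py <command> [args...]
    print(f"peak RSS: {peak_rss_mb(sys.argv[1:]):.1f} MB")
```

Run it once with your benchmark command for the original model and once, in a separate invocation, for the quantized model, keeping all other benchmark settings identical so the two numbers are comparable.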

 

On top of that, SSD Inception v2 is not a validated quantized topology, as per https://docs.openvinotoolkit.org/latest/openvino_docs_IE_DG_Int8Inference.html

 

Having said that, I would suggest you try applying the optimization methods described on the following page: https://docs.openvinotoolkit.org/latest/pot_docs_BestPractices.html

 

You can also try the AccuracyAwareQuantization algorithm.

https://docs.openvinotoolkit.org/latest/pot_compression_algorithms_quantization_accuracy_aware_READM...
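
As a rough illustration of switching algorithms, the sketch below builds a Post-training Optimization Tool configuration that selects AccuracyAwareQuantization and prints it as JSON. The key names follow the POT configuration layout as commonly documented, so please check them against the documentation for your installed version. The file paths, the `build_pot_config` helper, and the parameter values (`stat_subset_size`, `maximal_drop`) are placeholders I have assumed, not tuned recommendations.

```python
import json


def build_pot_config(model_xml, weights_bin, ac_config, maximal_drop=0.01):
    """Return a POT-style config dict that selects AccuracyAwareQuantization.

    All argument values are supplied by the caller; nothing here is validated
    against an actual OpenVINO installation.
    """
    return {
        "model": {
            "model_name": "ssd_inception_v2",
            "model": model_xml,
            "weights": weights_bin,
        },
        # The accuracy-aware algorithm needs a metric, so the engine points
        # at an Accuracy Checker configuration file.
        "engine": {"config": ac_config},
        "compression": {
            "target_device": "CPU",
            "algorithms": [
                {
                    "name": "AccuracyAwareQuantization",
                    "params": {
                        "preset": "performance",
                        "stat_subset_size": 300,
                        # Largest accuracy drop (in metric units) to tolerate.
                        "maximal_drop": maximal_drop,
                    },
                }
            ],
        },
    }


if __name__ == "__main__":
    # Hypothetical paths; replace with your own files.
    config = build_pot_config(
        "ssd_inception_v2.xml", "ssd_inception_v2.bin", "accuracy_checker.yml"
    )
    print(json.dumps(config, indent=4))
```

Saving the printed JSON to a file gives you a starting point to pass to the tool. Whether this changes memory consumption is untested, since the algorithm targets accuracy rather than memory.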

 

Regards,

Munesh

 

Munesh_Intel
Moderator

Hi Justin,


This thread will no longer be monitored since we have provided an explanation and suggestions. If you need any additional information from Intel, please submit a new question.


Regards,

Munesh

