Intel® Distribution of OpenVINO™ Toolkit

Increased RAM usage with DefaultQuantization

jv19
Beginner

Hello,

I'm testing OpenVINO on a Mac with the ssd_inception_v2 architecture. I quantized the model using DefaultQuantization on CPU. The benchmark_app script shows a good inference speed-up (50 FPS base vs. 80 FPS quantized), but memory usage has increased from 350 MB to 485 MB. Is this expected behavior? If not, what are some potential causes of the increase? Thanks!
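For anyone who wants to reproduce a comparison like this, here is a minimal sketch of one way to read the peak resident memory of a benchmark run from Python, using only the standard library (Unix only). The helper names are made up for illustration, and this is not necessarily how the figures above were obtained. One detail matters on a Mac: `ru_maxrss` is reported in bytes on macOS but in kilobytes on Linux.

```python
import resource
import subprocess
import sys


def maxrss_to_mb(ru_maxrss, platform=sys.platform):
    """Convert ru_maxrss to MB (macOS reports bytes, Linux reports kilobytes)."""
    bytes_used = ru_maxrss if platform == "darwin" else ru_maxrss * 1024
    return bytes_used / (1024 * 1024)


def peak_child_rss_mb(cmd):
    """Run cmd to completion, then return the peak RSS of waited-for children.

    RUSAGE_CHILDREN is a high-water mark over every child this process has
    waited for, so measure each model from a fresh Python process.
    """
    subprocess.run(cmd, check=True)
    usage = resource.getrusage(resource.RUSAGE_CHILDREN)
    return maxrss_to_mb(usage.ru_maxrss)


def relative_increase(before_mb, after_mb):
    """Percentage growth from before_mb to after_mb."""
    return (after_mb - before_mb) / before_mb * 100.0


if __name__ == "__main__":
    # Figures reported in this post: 350 MB base vs. 485 MB quantized.
    print(f"increase: {relative_increase(350, 485):.1f}%")
```

To measure a real run, `peak_child_rss_mb` would be given the full benchmark command as a list of arguments, once per model and each time from a new interpreter.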

5 Replies
Munesh_Intel
Moderator

Hi Justin,

Thanks for reaching out to us.

Which OpenVINO version are you using? If possible, please share your quantized model for us to reproduce your issue.


Regards,

Munesh


jv19
Beginner

Hi Munesh,

I am using OpenVINO 2021.1.110. I have attached the quantized .xml. I can't attach the quantized .bin, however, because the forum tells me that "the file type (.bin) is not supported." Here's a MediaFire link to the quantized .xml/.bin: http://www.mediafire.com/folder/064ray5ygzssi/ssd_inception_v2

Best,

Justin

Munesh_Intel
Moderator

Hi Justin,


Thanks for sharing the files. We are investigating the issue and will get back to you as soon as possible.


Regards,

Munesh


Munesh_Intel
Moderator

Hi Justin,

We were able to replicate your issue and observed higher memory consumption as well. Ideally, this would not happen. However, we do not have any memory-consumption targets for quantized models; our target parameters for quantization are accuracy, throughput (FPS), and latency.

 

On top of that, SSD Inception v2 is not a validated quantized topology, as per https://docs.openvinotoolkit.org/latest/openvino_docs_IE_DG_Int8Inference.html

 

Having said that, I suggest trying the optimization methods described on the following page: https://docs.openvinotoolkit.org/latest/pot_docs_BestPractices.html

 

You can also try the AccuracyAwareQuantization algorithm:

https://docs.openvinotoolkit.org/latest/pot_compression_algorithms_quantization_accuracy_aware_README.html
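As a rough illustration of that suggestion, the sketch below builds a Post-training Optimization Tool configuration that selects AccuracyAwareQuantization and writes it out as JSON. The file paths, the accuracy-checker config name, and the `maximal_drop` and `stat_subset_size` values are placeholders, and the key names follow the POT configuration layout as I understand it from the documentation, so please verify them against the version you have installed. Note that this algorithm targets accuracy, so it is not guaranteed to change memory consumption.

```python
import json


def build_pot_config(model_xml, model_bin, ac_config, maximal_drop=0.01):
    """Return a POT-style config dict selecting AccuracyAwareQuantization.

    All arguments are placeholders supplied by the caller; maximal_drop is
    the largest accuracy drop to tolerate (illustrative default).
    """
    return {
        "model": {
            "model_name": "ssd_inception_v2",
            "model": model_xml,
            "weights": model_bin,
        },
        # Assumed: an Accuracy Checker config describing dataset and metric.
        "engine": {"config": ac_config},
        "compression": {
            "target_device": "CPU",
            "algorithms": [
                {
                    "name": "AccuracyAwareQuantization",
                    "params": {
                        "preset": "performance",
                        "stat_subset_size": 300,
                        "maximal_drop": maximal_drop,
                    },
                }
            ],
        },
    }


def write_pot_config(config, path):
    """Serialize the config dict to a JSON file at path."""
    with open(path, "w") as fh:
        json.dump(config, fh, indent=4)


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    cfg = build_pot_config("model.xml", "model.bin", "accuracy_checker.yml")
    print(json.dumps(cfg, indent=4))
```

The resulting JSON file would then be passed to the POT command-line tool as its configuration file; check the tool's help output for the exact option in your release.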

 

Regards,

Munesh

 

Munesh_Intel
Moderator

Hi Justin,


This thread will no longer be monitored since we have provided an explanation and suggestions. If you need any additional information from Intel, please submit a new question.


Regards,

Munesh

