Intel® Distribution of OpenVINO™ Toolkit
Community assistance for the Intel® Distribution of OpenVINO™ toolkit, OpenCV, and all aspects of computer vision on Intel® platforms.

High memory consumption using Python

wsla1
Beginner

I have a small neural network converted to OpenVINO format (a 158 KB .bin file plus a 16 KB .xml file). When I load it on the CPU using Python on Windows and run a single inference, the process consumes over 7 GB of memory. An FP16-compressed version of the model seems to behave the same way.

What can I do to reduce memory consumption? I went through the manual hoping to find something about batch size or number of threads, but I couldn't find anything useful. I want to run inference on AWS Lambda, so I need to lower the memory consumption.

Megat_Intel
Moderator

Hi Wsla1,

Thank you for reaching out to us.

 

For memory usage optimization, you can refer to the "Optimizing memory usage" page of the OpenVINO™ toolkit documentation. You might also want to check the "Advanced Throughput Options: Streams and Batching" page for details on OpenVINO™ batching and streams.

 

In addition, please refer to the OpenVINO™ Python tutorials on configuring inference threads.
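
As a rough illustration of the thread and stream settings mentioned above, here is a minimal sketch, not a verified fix for this case. It assumes OpenVINO 2023.1 or newer (for `import openvino as ov`) and uses the CPU plugin property names `NUM_STREAMS` and `INFERENCE_NUM_THREADS`. The helper names `low_memory_cpu_config` and `compile_for_cpu` and the path `model.xml` are hypothetical, and whether these settings lower memory use for a particular model has to be measured.

```python
def low_memory_cpu_config(num_threads=1, num_streams=1):
    """Build a CPU plugin config with few streams and threads.

    Hypothetical helper: fewer streams and threads generally mean fewer
    per-stream resources, at the cost of throughput.
    """
    if num_threads < 1 or num_streams < 1:
        raise ValueError("num_threads and num_streams must be >= 1")
    # Values are passed as strings, which older and newer releases accept.
    return {
        "NUM_STREAMS": str(num_streams),
        "INFERENCE_NUM_THREADS": str(num_threads),
    }


def compile_for_cpu(model_xml, config):
    """Read an IR model and compile it for the CPU device with `config`."""
    # Imported lazily so the config helper works without OpenVINO installed.
    import openvino as ov  # assumption: OpenVINO 2023.1 or newer

    core = ov.Core()
    model = core.read_model(model_xml)
    return core.compile_model(model, "CPU", config)
```

A call such as `compile_for_cpu("model.xml", low_memory_cpu_config())` would then compile the model with a single stream and a single inference thread.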

 

On another note, I ran a Python benchmark on the FP16 face-detection-retail-0005 model (1,994 KB bin + 220 KB xml), and it used only 130.4 MB of memory. Could you please provide more details (OpenVINO™ version and CPU name), along with the model you used, so that we can investigate further?
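
To narrow down which step accounts for the memory before reporting back, peak memory can be sampled around each stage (reading the model, compiling it, running the first inference). The sketch below is one possible way to do that, using Python's standard `resource` module, which is available on Linux (including AWS Lambda) and macOS but not on Windows. On Windows a third-party package such as psutil would be needed instead. The function names `peak_rss_mb` and `measure_peak_growth` are hypothetical.

```python
import resource
import sys


def peak_rss_mb():
    """Return the peak resident set size of this process in megabytes."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


def measure_peak_growth(step):
    """Run `step()` and return (result, growth of peak RSS in MB)."""
    before = peak_rss_mb()
    result = step()
    return result, peak_rss_mb() - before
```

For example, wrapping the model compilation call in `measure_peak_growth(lambda: ...)` shows how much the peak grew during that stage alone. Because this tracks the process-wide peak, the reported growth is never negative and does not reflect memory that was freed afterwards.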

 

 

Regards,

Megat

 


Megat_Intel
Moderator

Hi Wsla1,

Thank you for your question. This thread will no longer be monitored since we have provided a suggestion. If you need any additional information from Intel, please submit a new question.

 

 

Regards,

Megat

