I use Python 3.8 on an Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz (2 processors).
I have a model with dynamic input, saved as ONNX.
I run OpenVINO 2023.3 to accelerate performance on CPU.
The inference time is improved, but the memory consumption is very high and keeps climbing with each inference, up to a limit of 25GB.
Is there a way to improve this, clear the memory after each inference, or control the limit?
It seems that when it reaches this limit, the memory is cleared.
Hi Yael1,
Thanks for reaching out to us.
You may refer to the OpenVINO™ Toolkit Optimizing Memory Usage guide for optimizing memory usage during inference.
Regards,
Wan
The issue you're encountering (high and steadily increasing memory consumption during inference with OpenVINO 2023.3 on your CPU) can be approached from a few angles. A typical model-loading setup with the legacy Inference Engine API looks like this:
from openvino.inference_engine import IECore  # legacy API 1.0, deprecated since 2022.1
ie = IECore()
# read_network can also load an ONNX file directly via model="model.onnx"
net = ie.read_network(model="model.xml", weights="model.bin")
exec_net = ie.load_network(network=net, device_name="CPU", num_requests=1)
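Note that openvino.inference_engine is the deprecated API 1.0. In 2023.3 the recommended route is the API 2.0 in openvino.runtime; a minimal equivalent sketch (the model path is a placeholder for your file):
from openvino.runtime import Core
core = Core()
model = core.read_model("model.onnx")  # ONNX can be read directly, no IR conversion needed
compiled_model = core.compile_model(model, "CPU")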
You can reduce the memory used for weights by compressing them to FP16. With the CPU plugin this is done at model-conversion time rather than through a load-time config key (see the Model Optimizer step below).
Use the OpenVINO Model Optimizer:
The OpenVINO Model Optimizer converts models to the OpenVINO IR format, which can reduce memory usage. Since 2022.1 it is installed as the mo entry point rather than a mo.py script under deployment_tools:
mo --input_model your_model.onnx
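To get the FP16 weight compression mentioned above, pass --compress_to_fp16 (this is the default in recent releases; shown explicitly for clarity):
mo --input_model your_model.onnx --compress_to_fp16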
Try using OpenVINO's AUTO device mode, which selects the best available device and applies performance-oriented defaults. This can help balance memory overhead and performance; see the example below.
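For example, with the API 2.0 (a minimal sketch reusing the core and model objects from above; the LATENCY hint is an assumption about a latency-oriented use case):
compiled_model = core.compile_model(model, "AUTO", {"PERFORMANCE_HINT": "LATENCY"})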
Control Memory Limits
You can influence memory usage through plugin configuration. Note that OpenVINO does not read environment variables such as OPENVINO_INFERENCE_THREADS or OPENVINO_CPU_BIND; threading and core binding are set through config keys passed at load time:
- CPU_THREADS_NUM (API 2.0: INFERENCE_NUM_THREADS) limits the number of threads used during inference, which can indirectly bound memory consumption.
- CPU_BIND_THREAD (API 2.0: AFFINITY) controls CPU core binding, which could also indirectly affect memory usage.
For example, with the legacy API:
exec_net = ie.load_network(network=net, device_name="CPU", config={"CPU_THREADS_NUM": "4", "CPU_BIND_THREAD": "YES"})
These settings may help fine-tune memory usage by managing the number of active threads and how they are pinned to cores.
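The API 2.0 equivalent (a minimal sketch, reusing the core and model objects from above):
compiled_model = core.compile_model(model, "CPU", {"INFERENCE_NUM_THREADS": 4})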
Use OpenVINO's "Inference Engine" Memory Optimizations
If you're using the Inference Engine to execute models, OpenVINO lets you tune how memory is used during inference. You can specify execution parameters to ensure that resources are allocated efficiently.
For instance, you can create the ExecutableNetwork with a fixed number of inference requests and reuse them (asynchronous inference) to avoid per-call allocations and reduce memory spikes.
exec_net = ie.load_network(network=net, device_name="CPU", num_requests=2)  # fixed pool of 2 reusable requests
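In the API 2.0, the same pattern is covered by AsyncInferQueue; a minimal self-contained sketch (the model path, input shape, and loop are placeholder assumptions for your workload):
import numpy as np
from openvino.runtime import Core, AsyncInferQueue
core = Core()
model = core.read_model("model.onnx")  # placeholder path
compiled_model = core.compile_model(model, "CPU")
# Fixed pool of 2 infer requests; their buffers are reused across calls
# instead of being reallocated on every inference.
infer_queue = AsyncInferQueue(compiled_model, 2)
infer_queue.set_callback(lambda request, userdata: None)  # collect outputs here
for _ in range(10):  # placeholder loop over your data
    batch = np.random.rand(1, 3, 224, 224).astype(np.float32)  # assumed input shape
    infer_queue.start_async({0: batch})
infer_queue.wait_all()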
Hi dusktilldawn,
Thanks for sharing in the OpenVINO™ community!
Hi Yael1,
Thanks for your question.
The suggestions provided by dusktilldawn above, such as using the OpenVINO™ Model Optimizer and the OpenVINO™ Inference Engine APIs, can be useful if you are using the OpenVINO™ toolkit.
Please refer to the following links for the tutorials on using the latest OpenVINO™ Model Optimizer and OpenVINO™ Inference Engine APIs:
- https://docs.openvino.ai/2024/openvino-workflow/model-preparation/convert-model-onnx.html
- https://docs.openvino.ai/2024/openvino-workflow/running-inference/integrate-openvino-with-your-application.html
If you need additional information from Intel, please submit a new question as this thread will no longer be monitored.
Regards,
Wan
