How to free memory when the model is no longer in use

Piyush_Khandelwal · ‎04-28-2023

In my application, I want to switch between models based on certain conditions. Keeping all of the models loaded is very inefficient in terms of memory, so I want to free the memory belonging to a model while it is not in use.

My sample code:

class MY_MODEL
{

public:
    //code here

private:
    //code here
    ov::InferRequest infer_request;

    int g_batchSize = 1;
};

void MY_MODEL::build()
{
    char model_xml[128];

    sprintf(model_xml,"my_model.xml");

    ov::Core core;

    std::string FLAGS_d="CPU";
    std::shared_ptr<ov::Model> model = core.read_model(model_xml);
    
    INPUT_BLOB_NAME = model->input().get_any_name();
    OUTPUT_BLOB_NAME = model->output().get_any_name();

    ov::preprocess::PrePostProcessor ppp = ov::preprocess::PrePostProcessor(model);

    ppp.input().model().set_layout("NCHW");

    // 1) Select input with 'input_tensor_name' tensor name 
    ov::preprocess::InputInfo& input_info = ppp.input(INPUT_BLOB_NAME);
    
    // 2) Set input type 
    input_info.tensor().
    set_element_type(ov::element::u8).
    set_layout("NHWC");
        
    // 3) Set output type
    ov::preprocess::OutputInfo& output_info = ppp.output(OUTPUT_BLOB_NAME);
    output_info.tensor().
    set_element_type(ov::element::f32).
    set_layout("?NHW");
    
    // 4) Build the model with the pre/post-processing steps embedded
    model = ppp.build();
    ov::set_batch(model, g_batchSize);

    ov::CompiledModel compiled_model = core.compile_model(model, "CPU");
    infer_request = compiled_model.create_infer_request();
}

void MY_MODEL::destroy()
{
    
}

I tried various methods to free the memory when the model is not in use, but none have worked so far. When I unload the model, not all of the used memory is freed, and when I go back to the model that should have been unloaded, new memory is allocated (which indicates a memory leak). Methods I've tried:

1. Calling the destructor of `infer_request`.
2. Converting `infer_request` into a pointer and using `new`/`delete`.
3. Assigning a new object without `delete`: `infer_request = new InferRequest();`
4. Converting `core`, `compiled_model`, and `infer_request` to pointers and using `new`/`delete` to create/destroy the objects.

Code corresponding to point 4 above:

class MY_MODEL
{

public:


private:

    ov::InferRequest *infer_request;
    ov::CompiledModel *compiled_model;
    std::shared_ptr<ov::Model> model;
    ov::Core *core;

    int g_batchSize = 1;
};

void MY_MODEL::build()
{
    char model_xml[128];

    sprintf(model_xml,"my_model.xml");

    core = new ov::Core();

    std::string FLAGS_d="CPU";
    model = core->read_model(model_xml);
    
    INPUT_BLOB_NAME = model->input().get_any_name();
    OUTPUT_BLOB_NAME = model->output().get_any_name();

    ov::preprocess::PrePostProcessor ppp = ov::preprocess::PrePostProcessor(model);

    ppp.input().model().set_layout("NCHW");

    // 1) Select input with 'input_tensor_name' tensor name 
    ov::preprocess::InputInfo& input_info = ppp.input(INPUT_BLOB_NAME);
    
    // 2) Set input type 
    input_info.tensor().
    set_element_type(ov::element::u8).
    set_layout("NHWC");
        
    // 3) Set output type
    ov::preprocess::OutputInfo& output_info = ppp.output(OUTPUT_BLOB_NAME);
    output_info.tensor().
    set_element_type(ov::element::f32).
    set_layout("?NHW");
    
    // 4) Build the model with the pre/post-processing steps embedded
    model = ppp.build();
    ov::set_batch(model, g_batchSize);

    compiled_model = new ov::CompiledModel();
    *compiled_model = core->compile_model(model, "CPU");

    infer_request = new ov::InferRequest();
    *infer_request = compiled_model->create_infer_request();
}

void MY_MODEL::destroy()
{
    delete infer_request;
    delete compiled_model;
    //model->~Model();
    model = NULL;
    delete core;
}

Please suggest the correct method to free the used memory. Thanks in advance!
OpenVINO version: 2022

IntelSupport · ‎04-30-2023

Hi Piyush_Khandelwal,

Thanks for reaching out.

Could you share how you are detecting the memory leak and also your hardware type? Have you tried to test your model with OpenVINO Official Samples before?

Regards,

Aznie

Piyush_Khandelwal · ‎05-02-2023

Thanks for the reply.
I checked for the memory leak using 'htop' while repeatedly calling the build() and destroy() functions shown above. The total memory used increases every time I call build(). I went through the samples, but I didn't see a function that I could call to free the memory.
My system is an Intel NUC 11 i5.
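
A note on measuring this: the memory figure htop shows is the resident set size of the process, and a C/C++ allocator may keep freed pages for reuse instead of returning them to the OS, so a higher reading after destroy() is not by itself proof of a leak. Below is a minimal sketch for logging the resident set size after each load/unload round. It assumes Linux (it reads /proc/self/status) and, for the optional trim step, glibc's malloc_trim. The helper names current_rss_kib, trim_heap and rss_after_each_round are hypothetical and not part of OpenVINO.

```cpp
#include <cassert>
#include <cstdlib>
#include <fstream>
#include <sstream>
#include <string>
#include <vector>
#if defined(__GLIBC__)
#include <malloc.h>  // malloc_trim (glibc extension)
#endif

// Resident set size of this process in KiB, read from /proc/self/status.
// Returns -1 if the file or the VmRSS field is unavailable (e.g. non-Linux).
long current_rss_kib() {
    std::ifstream status("/proc/self/status");
    std::string line;
    while (std::getline(status, line)) {
        if (line.compare(0, 6, "VmRSS:") == 0) {
            std::istringstream fields(line.substr(6));
            long kib = -1;
            if (fields >> kib) {
                return kib;
            }
            return -1;
        }
    }
    return -1;
}

// Ask the allocator to return free heap pages to the OS.
// Does nothing when not built against glibc.
void trim_heap() {
#if defined(__GLIBC__)
    malloc_trim(0);
#endif
}

// Run one load/unload round trip `rounds` times and record the
// resident set size after each round (after trimming the heap).
template <typename Cycle>
std::vector<long> rss_after_each_round(Cycle cycle, int rounds) {
    std::vector<long> samples;
    for (int i = 0; i < rounds; ++i) {
        cycle();
        trim_heap();
        samples.push_back(current_rss_kib());
    }
    return samples;
}
```

The cycle passed in would be a callable that runs build() followed by destroy() on the model object. If the recorded values level off after the first few rounds, the growth is more likely allocator or runtime caching; if they keep rising round after round, that points more strongly to a real leak.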

IntelSupport · ‎05-02-2023

Hi Piyush_Khandelwal,

Are you using the latest OpenVINO version (2022.3)? Please try to run your model with any of the Official OpenVINO Sample Apps and check if there is a memory leak.

Is it possible for you to share your models and any relevant files so that we can check further on our side?

You can share it here or privately to my email: noor.aznie.syaarriehaahx.binti.baharuddin@intel.com

Regards,

Aznie

Piyush_Khandelwal · ‎05-03-2023

Hi Aznie,

Thanks for the reply.

As I mentioned earlier, the leak is observed not during inference but when loading and unloading engines iteratively.

Since the official samples don't have any framework to unload the engines, I can't reproduce the issue with them.

The methods I used to free the memory are listed in the first post. Can you please let me know if I am missing something there, or suggest a proper way to do this?

Thanks and regards,

Piyush

IntelSupport · ‎05-04-2023

Hi Piyush,

It is important to make sure the model can be inferred and is compatible with OpenVINO. If the model itself is incompatible with OpenVINO, issues are expected no matter what networks/layers are used. Since I understand that you prefer not to share the model with us, our ability to help with the specific issues you are facing is limited.

Our recommendation is to use the Model Cache feature to unload the model and free up memory once you no longer need it. You can destroy the object by assigning an empty one, as below.

ov::CompiledModel model = ...;

model = {};
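
To illustrate why resetting a handle this way can release memory, and why every handle that shares the same state has to be reset, here is a minimal sketch using stand-in types only. It assumes the runtime objects behave like shared-ownership handles, which is an assumption about OpenVINO's internals and not something confirmed in this thread. Impl, CompiledHandle, RequestHandle, ModelSlot and live_impl_count are hypothetical names and do not exist in OpenVINO.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Number of Impl objects currently alive (for observation only).
static int g_live_impls = 0;

int live_impl_count() { return g_live_impls; }

// Hypothetical stand-in for the runtime's internal state (weights, buffers).
struct Impl {
    std::vector<float> buffer;
    Impl() : buffer(1 << 20) { ++g_live_impls; }
    ~Impl() { --g_live_impls; }
    Impl(const Impl&) = delete;
    Impl& operator=(const Impl&) = delete;
};

// Hypothetical handles that share ownership of the same state, loosely
// mimicking a compiled model and an infer request created from it.
struct CompiledHandle {
    std::shared_ptr<Impl> impl;
};

struct RequestHandle {
    std::shared_ptr<Impl> impl;  // also keeps the compiled state alive
};

class ModelSlot {
public:
    void build() {
        compiled_ = CompiledHandle{std::make_shared<Impl>()};
        request_ = RequestHandle{compiled_.impl};
    }

    void destroy() {
        // Reset every handle. The shared state is freed only when the
        // last handle referring to it has been reset or destroyed.
        request_ = {};
        compiled_ = {};
    }

private:
    CompiledHandle compiled_;
    RequestHandle request_;
};
```

In the class from the first post, this might correspond to keeping the compiled model and the infer request as plain members (no new/delete) and assigning an empty object to each of them in destroy(). Resetting only one of the two would leave the shared state alive in this sketch.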

Can you please confirm if you have validated your model with any OpenVINO Sample App - available here <Paste Reference> to check if there is a memory leak?

Regards,

Aznie

IntelSupport · ‎05-15-2023

Hi Piyush,

Thank you for your question. If you need any additional information from Intel, please submit a new question as this thread is no longer being monitored.

Regards,

Aznie

ekurniaw · ‎04-12-2025

For C++ on OpenVINO 2024, please try the release_memory function.

https://docs.openvino.ai/2024/api/c_cpp_api/classov_1_1_compiled_model.html#_CPPv4N2ov13CompiledModel14release_memoryEv

compiled_model.release_memory();

For Python, please try the release_memory function as well.

https://docs.openvino.ai/2024/api/ie_python_api/_autosummary/openvino.CompiledModel.html#openvino.CompiledModel.release_memory

Alternatively, delete the variables and run the garbage collector with gc.collect().

import gc
del det_model
del det_compiled_model
del det_ov_model
del core
gc.collect()

Best Regards,

Eka A. Kurniawan