Inference Engine Initialization

Catastrophe · 03-12-2020

Hi! I noticed that the "Loading of the model to device" step takes a lot of time. I need to perform a series of inferences, so I want to obtain the "ExecutableNetwork network" only once. Is there a way to do this?

JAIVIN_J_Intel · 03-17-2020

Hi,

You only have to load the model once for inference. Please refer to object_detection_demo_ssd_async, which shows how to load the model to the device, create an infer request, and run inference on a video file.
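The load-once structure described above can be sketched roughly as follows. This is a minimal illustration, not the demo's code: `load_network` and `infer` are hypothetical stand-ins for the expensive load step and the per-frame inference call, `"model.xml"` is a placeholder path, and the frames are dummy numbers rather than a video file.

```python
# Hypothetical stand-ins: load_network() plays the role of the slow
# "load model to device" step, infer() the role of one inference request.
load_calls = 0  # only here to show how often the expensive step runs


def load_network(model_path, device):
    """Pretend to load a model and count how often this is called."""
    global load_calls
    load_calls += 1
    return {"model": model_path, "device": device}


def infer(network, frame):
    """Pretend to run inference on one frame with the loaded network."""
    return (network["device"], frame * 2)


def process_frames(frames):
    # Expensive step: done once, before the loop.
    network = load_network("model.xml", "CPU")
    # Cheap step: repeated per frame, always reusing the same network.
    return [infer(network, frame) for frame in frames]


results = process_frames([1, 2, 3])
```

The point of the sketch is only the placement of the load call outside the loop; within a single run, the cost of loading is then paid once regardless of the number of frames.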

Which model are you using? On which target device are you trying to load the model?

Different hardware platforms use different optimization semantics and different plugins, so the time it takes to load the model also differs.

For more information on improving performance, please refer to the Device-Specific Optimization Guide.

Regards,

Jaivin

Catastrophe · 03-18-2020

Yes, I already tried running that example. In my case, however, I am trying to integrate this code into separate C# code. For now I call the code from C# through a DLL, but the problem is that every time I run the C# code, the model is loaded onto the device again, which takes too much time. I wonder if there is a way to do the loading once so that subsequent runs execute only the inference part.
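One possible way to avoid reloading on every call, provided the process that hosts the library stays alive between calls, is to cache the loaded network behind the entry point. The sketch below is an illustration in Python under stated assumptions, not OpenVINO or C# code: `_load_network`, `get_network`, `detect`, and the `"yolo.xml"` path are hypothetical, and `functools.lru_cache` stands in for whatever static handle a DLL might hold.

```python
from functools import lru_cache

load_count = 0  # only here to show how often the expensive step runs


def _load_network(model_path, device):
    """Hypothetical stand-in for the slow model-loading step."""
    global load_count
    load_count += 1
    return {"model": model_path, "device": device}


@lru_cache(maxsize=None)
def get_network(model_path, device):
    # First call per (model_path, device) loads; later calls hit the cache.
    return _load_network(model_path, device)


def detect(image, model_path="yolo.xml", device="CPU"):
    """Hypothetical entry point that a caller invokes once per image."""
    network = get_network(model_path, device)
    return f"{network['device']}:{image}"


outputs = [detect(name) for name in ("a.png", "b.png", "c.png")]
```

The cache lives only as long as the process does. If the calling program exits and is started again, the load happens again on the first call.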

I am using a YOLO model and CPU as the target device.

JAIVIN_J_Intel · 03-20-2020

Hi,

OpenVINO currently supports only C++ and Python. It doesn't provide any C# API wrappers. It seems that the DLL you are using is the reason it takes more time to load the model.

Also, the model has to be loaded into the Inference Engine (IE) once in each run before the inference part can execute.

Regards,

Jaivin

Catastrophe · 03-22-2020

I already compared calling the DLL with running the code directly in C++, and the DLL does not add any time to loading the model.

So, right now, is there no way to load the Inference Engine just once and then perform a series of inferences? I am wondering whether the initialization part of the Inference Engine can be kept in memory, or something like that.

I badly need to implement this kind of method in order to reduce the processing time.
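Keeping the initialized engine in memory between requests generally means keeping alive a process that owns it. As a rough in-process sketch of that idea (every name here is hypothetical, the "inference" is a placeholder string operation, and a thread with two queues stands in for a separate resident service that a client program would talk to):

```python
import queue
import threading

load_count = 0  # shows that the expensive step runs only once


def load_network():
    """Hypothetical stand-in for the slow Inference Engine initialization."""
    global load_count
    load_count += 1
    return "network"


def worker(jobs, replies):
    network = load_network()      # done once, when the worker starts
    while True:
        item = jobs.get()
        if item is None:          # sentinel: shut the worker down
            break
        # Placeholder "inference": upper-case the input string.
        replies.put((network, item.upper()))


jobs, replies = queue.Queue(), queue.Queue()
thread = threading.Thread(target=worker, args=(jobs, replies))
thread.start()

for image in ("cat", "dog"):
    jobs.put(image)               # clients only send work, never reload
jobs.put(None)
thread.join()

results = [replies.get() for _ in range(2)]
```

In a real setup the worker would be its own long-running process, and the short-lived client would reach it over some form of inter-process communication; which mechanism fits is outside what this thread establishes.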