Intel® Distribution of OpenVINO™ Toolkit
Community assistance for the Intel® Distribution of OpenVINO™ toolkit, OpenCV, and all aspects of computer vision on Intel® platforms.

Inference Engine Initialization

Catastrophe
Novice
748 Views

Hi! I noticed that the "Loading of the model to device" step takes a lot of time. I need to perform a series of inferences, so I want to obtain the "ExecutableNetwork network" only once. Is there a way to do this?

4 Replies
JAIVIN_J_Intel
Employee

Hi,

You only have to load the model once for inference. Please refer to the object_detection_demo_ssd_async demo to see how to load the model to the device, create an infer request, and run inference on a video file.
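
As a rough sketch of that flow (the model path, loop condition, and blob handling below are placeholders you will need to adapt to your own model), the expensive LoadNetwork call is made only once and the same infer request is then reused for every frame:

#include <string>
#include <inference_engine.hpp>

using namespace InferenceEngine;

int main() {
    Core ie;

    // Read the IR once; LoadNetwork (compiling the model for the device) is the slow step.
    CNNNetwork network = ie.ReadNetwork("model.xml");               // placeholder path
    ExecutableNetwork execNetwork = ie.LoadNetwork(network, "CPU");
    InferRequest request = execNetwork.CreateInferRequest();

    const std::string inputName  = network.getInputsInfo().begin()->first;
    const std::string outputName = network.getOutputsInfo().begin()->first;

    bool framesRemain = true;                                       // placeholder loop condition
    while (framesRemain) {
        Blob::Ptr input = request.GetBlob(inputName);
        // ... copy the current frame into the input blob ...

        request.Infer();                                             // synchronous inference

        Blob::Ptr output = request.GetBlob(outputName);
        // ... parse the detections ...

        framesRemain = false;                                        // stop after one iteration in this sketch
    }
    return 0;
}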

Which model are you using? On which target device are you trying to load the model?

Different hardware platforms use different optimization semantics and different plugins, so the time it takes to load the model also differs.

For more information on improving performance, please refer to the Device-Specific Optimization Guide.

Regards,

Jaivin

Catastrophe
Novice

Yes, I have already tried running that example. In my case, however, I am trying to integrate this code into a separate C# application. For now I call the code from C# through a DLL, but the problem is that every time I run the C# code, the model is loaded onto the device again, which takes too much time. I wonder if there is a way to execute the loading only once, so that subsequent runs execute only the inference part.

I am using a YOLO model with the CPU as the target device.
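
To make the setup clearer, this is roughly the pattern I am aiming for inside the DLL (the exported names InitEngine/RunInference and their parameters are only illustrative, not my actual code): keep the ExecutableNetwork in static storage so that LoadNetwork runs only once per process, and let the C# side call only the inference entry point afterwards.

// dll_inference.cpp - sketch of a DLL that loads the model once per process
#include <cstddef>
#include <memory>
#include <string>
#include <inference_engine.hpp>

using namespace InferenceEngine;

namespace {
    // Static state lives as long as the process that loaded the DLL,
    // so the expensive LoadNetwork call happens only once per process.
    std::unique_ptr<Core>              g_ie;
    std::unique_ptr<ExecutableNetwork> g_execNetwork;
    std::unique_ptr<InferRequest>      g_request;
    std::string g_inputName, g_outputName;
}

extern "C" __declspec(dllexport) int InitEngine(const char* modelXml) {
    if (g_execNetwork) return 0;                                    // already initialized
    g_ie = std::make_unique<Core>();
    CNNNetwork network = g_ie->ReadNetwork(modelXml);
    g_inputName  = network.getInputsInfo().begin()->first;
    g_outputName = network.getOutputsInfo().begin()->first;
    g_execNetwork = std::make_unique<ExecutableNetwork>(g_ie->LoadNetwork(network, "CPU"));
    g_request = std::make_unique<InferRequest>(g_execNetwork->CreateInferRequest());
    return 0;
}

extern "C" __declspec(dllexport) int RunInference(const float* frameData, size_t frameSize) {
    // Reuses the already-loaded network: no model loading happens here.
    Blob::Ptr input = g_request->GetBlob(g_inputName);
    // ... copy frameData into the input blob ...
    (void)frameData; (void)frameSize;
    g_request->Infer();
    Blob::Ptr output = g_request->GetBlob(g_outputName);
    // ... copy the detections back to the caller ...
    return 0;
}

As far as I understand, this only avoids reloading while the process hosting the DLL stays alive; every new run of the C# application would still pay the LoadNetwork cost once, which is exactly the part I want to get rid of.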

JAIVIN_J_Intel
Employee

Hi,

OpenVINO currently supports only C++ and Python APIs; it doesn't provide any C# wrappers. It seems that the DLL you are using is the reason the model takes more time to load.

Also, the model has to be loaded into the Inference Engine once in each run before inference can be performed.

Regards,

Jaivin

Catastrophe
Novice

I have already compared calling the DLL against running the code directly in C++, and the DLL does not add any extra time to model loading.

So, as it stands, there is no way to load the Inference Engine only once and then perform a series of inferences? I am wondering whether the initialization part of the Inference Engine can be kept in memory, or something along those lines.

I really need to implement something like this in order to reduce the processing time.
