Intel® Distribution of OpenVINO™ Toolkit

Loading DNN models onto the GPU is too slow

Byung-Hun__Park
Beginner

Hi, I am developing an application using the OpenVINO toolkit and several private DNN models of my own.

I have a problem related to the Inference Engine: loading the DNN models onto the GPU takes a very long time.

Currently I have to wait tens of seconds for 5 DNN models to load on the GPU, but only a few seconds on the CPU.

Is there any way to reduce this loading latency on the GPU?

My system environment is as follows:

CPU: Intel Core i3-8100

Memory: 4 GB

OS: Windows 10 64-bit

Graphics driver: Windows 10 DCH driver, version 26.20.100.7000

DNN models: 5 in total, about 100 MB combined

Shubha_R_Intel
Employee

Dear Byung-Hun, Park,

While your models are being loaded into the GPU plugin (clDNN), the OpenCL kernels are compiled and optimized, and that is where the time goes. It should be a one-time hit, though. Optimization simply takes longer for the GPU than for the CPU. They are two different hardware platforms with entirely different optimization semantics and different plugins, so it is difficult to compare the two.

Hope this answers your question. As for what can be done: aside from modifying your DNN to be "easier to optimize", you can build the Inference Engine in Debug configuration by following the DLDT GitHub README and then step through InferenceEngine::Core::LoadNetwork(). With the open-source version of OpenVINO (dldt) you have the full Inference Engine source code, so you can at least narrow down which layer is taking so long to optimize. Unfortunately, the clDNN plugin is not open-sourced, so you can't step through the clDNN source code.
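Before stepping through LoadNetwork() in a debugger, it may help to time each model's load separately and see whether one of them dominates. The following is a minimal sketch, not an OpenVINO API: `time_model_loads`, `slowest_first`, and the `load_fn(model_path, device)` callback are hypothetical names, and `load_fn` is assumed to wrap whatever code you already use to load a network.

```python
import time
from typing import Callable, Dict, Iterable, List, Tuple


def time_model_loads(
    model_paths: Iterable[str],
    device: str,
    load_fn: Callable[[str, str], object],
) -> Dict[str, float]:
    """Return the wall-clock seconds spent loading each model on `device`.

    `load_fn` is a hypothetical wrapper around your own load call
    (for example, code that ends up in Core::LoadNetwork()).
    """
    timings: Dict[str, float] = {}
    for path in model_paths:
        start = time.perf_counter()
        load_fn(path, device)  # the call being measured
        timings[path] = time.perf_counter() - start
    return timings


def slowest_first(timings: Dict[str, float]) -> List[Tuple[str, float]]:
    """Sort (model, seconds) pairs so the slowest load comes first."""
    return sorted(timings.items(), key=lambda item: item[1], reverse=True)
```

Running it once with "CPU" and once with "GPU" as the device should show whether a single model accounts for most of the delay or whether the cost is spread across all of them.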

The GPU checklist in the Optimization Guide (Optimization_guide_GPU_Checklist) may also help you.

2019 R2 is now in the dldt GitHub repository, so I encourage you to give it a shot.

Thanks,

Shubha

nikos1
Valued Contributor I

> Isn't there any way to reduce this loading latency on GPU?

Can you try the cl_cache feature?

The first load will still be slow, but subsequent loads should be much faster.

Please check https://software.intel.com/en-us/forums/computer-vision/topic/802122
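A minimal sketch of switching the cache on, assuming the commonly described mechanism in which the Intel OpenCL driver stores compiled kernel binaries whenever a directory named `cl_cache` exists in the application's working directory. That behavior is an assumption here and may differ between driver versions, so check the linked thread for your setup. The helper names `ensure_cl_cache_dir` and `cached_file_count` are hypothetical.

```python
from pathlib import Path
from typing import Optional


def ensure_cl_cache_dir(working_dir: Optional[str] = None) -> Path:
    """Create a `cl_cache` directory in the working directory if it is missing.

    Assumption: the GPU driver looks for this directory where the
    application runs and stores compiled kernels there.
    """
    base = Path(working_dir) if working_dir is not None else Path.cwd()
    cache_dir = base / "cl_cache"
    cache_dir.mkdir(parents=True, exist_ok=True)
    return cache_dir


def cached_file_count(cache_dir: Path) -> int:
    """Count the files currently stored in the cache directory."""
    return sum(1 for entry in cache_dir.iterdir() if entry.is_file())
```

Call `ensure_cl_cache_dir()` before the first model is loaded. If the file count is still zero after a run on GPU, the driver is probably not using the directory.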

Cheers,

nikos

 

nikos1
Valued Contributor I

> Unfortunately the clDNN plugin is not open-sourced though so you can't step through clDNN source code.

I am confused: https://github.com/intel/clDNN

 

Shubha_R_Intel
Employee

Dear nikos,

I understand why the statement that the clDNN code is not open-sourced can be confusing. What I meant is that it's not fully open-sourced in the way our VPU plugin is. Yes, some code is available, but some key parts are kept hidden. The only OpenVINO plugin that is 100% open-sourced is the VPU plugin.

Hope it helps,

Shubha

 

Shubha_R_Intel
Employee

Dear nikos,

I've been informed of the following:

We have our own fork of clDNN, and the version at https://github.com/intel/clDNN is most likely no longer being developed.

The current version lives only in the Inference Engine tree. In the open-source release, for example, it is at: https://github.com/opencv/dldt/tree/2019/inference-engine/thirdparty/clDNN

And the code is nearly 100% open-sourced.

The code base for public releases is fully open-sourced; only features that are still under development or that target unreleased hardware are hidden.

So Byung-Hun, Park should definitely be able to deep-dive and figure out why his models take so long to load on the GPU.

Hope it helps,

Thanks,

Shubha

yamaton
Beginner

Hi, I have the same problem.

I tried the 'interactive-face-detection' demo that is included in OpenVINO, and it takes about 3 minutes to start on GPU.
When I used CPU or MYRIAD instead, it started immediately.

Has anyone tried this?
https://docs.openvinotoolkit.org/latest/openvino_docs_IE_DG_GPU_Kernels_Tuning.html

This document says that we can export tuned data for GPU and use it.
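For reference, the workflow on that page has two steps: run once in a mode that creates a tuning file, then point later runs at that file. Below is a sketch of the plugin configuration for each step. It assumes the key and value strings `TUNING_MODE`, `TUNING_FILE`, `TUNING_CREATE`, and `TUNING_USE_EXISTING` match your OpenVINO release, so verify them against the page. The helper name `gpu_tuning_config` and the file name `tuning.json` are hypothetical.

```python
def gpu_tuning_config(tuning_file: str, create: bool) -> dict:
    """Build a GPU plugin config dict for kernel tuning.

    Assumption: the key/value strings follow the GPU Kernels Tuning page;
    confirm them for your release before relying on this.
    """
    mode = "TUNING_CREATE" if create else "TUNING_USE_EXISTING"
    return {"TUNING_MODE": mode, "TUNING_FILE": tuning_file}


# First run: generate the tuning file.
create_cfg = gpu_tuning_config("tuning.json", create=True)
# Later runs: reuse the file produced by the first run.
reuse_cfg = gpu_tuning_config("tuning.json", create=False)

# With the Inference Engine Python API, the dict would be passed roughly as
#   ie.set_config(reuse_cfg, "GPU")
# before the network is loaded (not executed here; it requires OpenVINO).
```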

I tried this, but startup still takes about 3 minutes.
