
Running first inference call is significantly slower

Hi,

I am facing a peculiar behavior that I cannot solve. I am using OpenCV with the Inference Engine backend for inference. I have trained my own MobileNet V2, converted it to IR, and inference works; however, the first call to net.forward() is always significantly slower than the subsequent calls. For example, in milliseconds:

6190 (first inference)
23
22
22
23
22
23
24
24
24
24
25

 

I have initialised the network beforehand and am only measuring the time taken for inference, e.g.:

    Net net = readNetFromModelOptimizer(xml, bin);
    net.setPreferableBackend(DNN_BACKEND_INFERENCE_ENGINE);
    net.setPreferableTarget(DNN_TARGET_MYRIAD);

 

    // Loop over the vector of images to run inference on:
    for (const Mat& inputImg : images)
    {
        Mat blob = blobFromImage(inputImg);
        Mat prob;

        auto start = chrono::steady_clock::now();
        net.setInput(blob);
        prob = net.forward();
        float probF = prob.at<float>(0);
        auto ending = chrono::steady_clock::now();

        cout << chrono::duration_cast<chrono::milliseconds>(ending - start).count() << endl;
    }

 

Could anyone help me figure out what I'm doing wrong? Or is there a way to properly initialise the network so that it can run at around 20+ ms from the very first inference?
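One common workaround (not shown in the thread, and a sketch rather than a confirmed fix) is to run a single throw-away forward pass right after loading the network, so the one-time cost of compiling the graph and uploading it to the Myriad device is paid before timing starts. This fragment assumes the `net` object from the snippet above; the 224×224×3 dummy input size is an assumption based on typical MobileNet V2 inputs and should match your model's actual input shape:

```cpp
// Warm-up: one discarded inference on a dummy image, so the slow
// first net.forward() call (graph compilation / device upload)
// happens here instead of during the measured loop.
Mat dummy(224, 224, CV_8UC3, Scalar(0, 0, 0));  // assumed input size
Mat warmupBlob = blobFromImage(dummy);
net.setInput(warmupBlob);
net.forward();  // result intentionally ignored
```

After this warm-up, the timed loop should start at roughly the steady-state latency rather than the multi-second first-call time.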

 

Thanks.

2 Replies
Employee

Hi, Ru Sern,

Please refer to the "Getting Credible Performance Numbers" section of the OpenVINO optimization guide:

https://docs.openvinotoolkit.org/latest/_docs_optimization_guide_dldt_optimization_guide.html

Getting Credible Performance Numbers

You need to build your performance conclusions on reproducible data. Do the performance measurements with a large number of invocations of the same routine. Since the first iteration is almost always significantly slower than the subsequent ones, you can use an aggregated value for the execution time for final projections:

If the warm-up run does not help, or the execution time still varies, you can run a large number of iterations and then average the results.

For time values that vary widely, use the geometric mean (geomean).

Refer to the Inference Engine Samples for code examples of performance measurement. Almost every sample, except the interactive demos, has a -ni option to specify the number of iterations.

 

Beginner

the problem
