
Running first inference call is significantly slower

Hi,

I am facing a peculiar behavior that I cannot solve. I am using OpenCV with the Inference Engine backend for inference. I have trained my own MobileNet V2, converted it to IR, and inference works; however, the first call to net.forward() is always significantly slower than the subsequent calls. For example, in milliseconds:

6190 (first inference)
23
22
22
23
22
23
24
24
24
24
25

 

I have initialised the network beforehand and am only measuring the time taken for inference, e.g.:

    Net net = readNetFromModelOptimizer(xml, bin);
    net.setPreferableBackend(DNN_BACKEND_INFERENCE_ENGINE);
    net.setPreferableTarget(DNN_TARGET_MYRIAD);

 

    // Loop over the vector of images to run inference on:
    for (const Mat& inputImg : images)
    {
        Mat blob = blobFromImage(inputImg);
        Mat prob;

        auto start = chrono::steady_clock::now();
        net.setInput(blob);
        prob = net.forward();
        float probF = prob.at<float>(0);
        auto ending = chrono::steady_clock::now();

        cout << chrono::duration_cast<chrono::milliseconds>(ending - start).count() << endl;
    }

 

Could anyone help me figure out what I'm doing wrong? Or is there a way to properly initialise the network so that it can run at around 20+ ms from the very first inference?
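One common workaround (not shown in the thread, and a sketch rather than a confirmed fix) is to run a single throw-away forward pass right after loading the network, so the one-time cost of compiling the graph and uploading it to the Myriad device is paid before timing starts. This fragment assumes the `net` object from the snippet above; the 224×224×3 dummy input size is an assumption based on typical MobileNet V2 inputs and should match your model's actual input shape:

```cpp
// Warm-up: one discarded inference on a dummy image, so the slow
// first net.forward() call (graph compilation / device upload)
// happens here instead of during the measured loop.
Mat dummy(224, 224, CV_8UC3, Scalar(0, 0, 0));  // assumed input size
Mat warmupBlob = blobFromImage(dummy);
net.setInput(warmupBlob);
net.forward();  // result intentionally ignored
```

After this warm-up, the timed loop should start at roughly the steady-state latency rather than the multi-second first-call time.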

 

Thanks.

2 Replies
Employee

Hi, Ru Sern,

Please refer to the "Getting Credible Performance Numbers" section of the OpenVINO optimization guide:

https://docs.openvinotoolkit.org/latest/_docs_optimization_guide_dldt_optimization_guide.html

Getting Credible Performance Numbers

You need to build your performance conclusions on reproducible data. Do the performance measurements with a large number of invocations of the same routine. Since the first iteration is almost always significantly slower than the subsequent ones, you can use an aggregated value for the execution time for final projections:

If the warm-up run does not help, or the execution time still varies, you can run a large number of iterations and then average the results.

For time values that vary widely, use the geometric mean (geomean).

Refer to the Inference Engine Samples for code examples of performance measurement. Almost every sample, except the interactive demos, has a -ni option to specify the number of iterations.

 

Beginner

the problem
