Intel® Distribution of OpenVINO™ Toolkit
Community assistance for the Intel® Distribution of OpenVINO™ toolkit, OpenCV, and all aspects of computer vision on Intel® platforms.

Running first inference call is significantly slower

Yeoh__Ru_Sern
Novice
2,110 Views

Hi,

I am facing a peculiar behavior that I cannot solve. I am using OpenCV with the Inference Engine backend for inference. I have trained my own MobileNet V2, converted it to IR, and inference works; however, the first call to net.forward() is always significantly slower than the subsequent calls. For example, in milliseconds:

6190 (first inference)
23
22
22
23
22
23
24
24
24
24
25


I have initialised the network beforehand and am only measuring the time for inference, e.g.:

    Net net = readNetFromModelOptimizer(xml, bin);
    net.setPreferableBackend(DNN_BACKEND_INFERENCE_ENGINE);
    net.setPreferableTarget(DNN_TARGET_MYRIAD);


    // loop across the vector of images to run inference on
    for (const Mat& inputImg : images)
    {
        Mat blob = blobFromImage(inputImg);
        auto start = chrono::steady_clock::now();

        net.setInput(blob);
        Mat prob = net.forward();
        float probF = prob.at<float>(0);

        auto ending = chrono::steady_clock::now();
        cout << chrono::duration_cast<chrono::milliseconds>(ending - start).count() << endl;
    }


Could anyone help me figure out what I'm doing wrong? Or is there a way to properly initialise the network so that even the first inference completes in around 20+ ms?


Thanks.

2 Replies
Cary_P_Intel1
Employee
2,110 Views

Hi, Ru Sern,

Please refer to the "Getting Credible Performance Numbers" section in the OpenVINO optimization guide:

https://docs.openvinotoolkit.org/latest/_docs_optimization_guide_dldt_optimization_guide.html

Getting Credible Performance Numbers

You need to build your performance conclusions on reproducible data. Do the performance measurements with a large number of invocations of the same routine. Since the first iteration is almost always significantly slower than the subsequent ones, you can use an aggregated value for the execution time for final projections:

If the warm-up run does not help or execution time still varies, you can try running a large number of iterations and then averaging the results.

For time values that range too much, use geomean.
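As an illustration (this helper is not from the guide, just a sketch of the suggested aggregation), a geometric mean of the per-iteration times can be computed as:

```cpp
#include <cmath>
#include <vector>

// Geometric mean of a set of timings. Summing logarithms instead of
// multiplying the values keeps the intermediate result from overflowing
// on long measurement runs.
double geomean(const std::vector<double>& times_ms) {
    double log_sum = 0.0;
    for (double t : times_ms) log_sum += std::log(t);
    return std::exp(log_sum / static_cast<double>(times_ms.size()));
}
```

For the steady-state timings listed in the question (23, 22, 22, ...), this yields roughly 23 ms.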

Refer to the Inference Engine Samples for code examples for the performance measurements. Almost every sample, except interactive demos, has a -ni option to specify the number of iterations.
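Applied to the loop in the question, that advice amounts to issuing one untimed forward pass before the measured ones. A minimal sketch of the pattern, with a placeholder infer() standing in for the actual net.setInput()/net.forward() calls (the OpenCV API itself is unchanged):

```cpp
#include <chrono>
#include <thread>
#include <vector>

// Placeholder for the real work (net.setInput(blob); net.forward();),
// kept separate so the timing pattern is visible on its own.
void infer() { std::this_thread::sleep_for(std::chrono::milliseconds(1)); }

// Run one untimed warm-up inference, then time `iterations` real ones.
// The warm-up call absorbs the one-time cost that dominates the first
// iteration, so only steady-state times are recorded.
std::vector<long long> timed_runs(int iterations) {
    infer();  // warm-up run: deliberately not measured

    std::vector<long long> times_ms;
    for (int i = 0; i < iterations; ++i) {
        auto start = std::chrono::steady_clock::now();
        infer();
        auto end = std::chrono::steady_clock::now();
        times_ms.push_back(
            std::chrono::duration_cast<std::chrono::milliseconds>(end - start)
                .count());
    }
    return times_ms;
}
```

With this structure the ~6-second first call still happens, but it lands in the warm-up and no longer skews the reported numbers.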


0 Kudos
default
Beginner
2,097 Views

the problem

0 Kudos
Reply