Intel® Distribution of OpenVINO™ Toolkit
Community support and discussions about the Intel® Distribution of OpenVINO™ toolkit, OpenCV, and all things computer vision-related on Intel® platforms.

I find that the inference time fluctuates a lot; is this result normal?

xiaoxiongli
Beginner


I repeat the inference in a for loop 1000 times on the same test image and find that the inference time fluctuates a lot (most runs are about 15 ms, but some are about 30 ms). Is this result normal? How can I fix it?

from datetime import datetime

for i in range(1000):
    start2 = datetime.now()
    res = exec_net.infer(inputs={input_blob: lr})
    end2 = datetime.now()
    # .total_seconds() is safer than .microseconds, which wraps at 1 second
    print("the inference time of openvino is %6f ms" % ((end2 - start2).total_seconds() * 1000), flush=True)


the inference time of openvino is 15.610000 ms
the inference time of openvino is 31.257000 ms
the inference time of openvino is 15.624000 ms
the inference time of openvino is 15.626000 ms
the inference time of openvino is 15.627000 ms
the inference time of openvino is 16.885000 ms
the inference time of openvino is 19.045000 ms
the inference time of openvino is 9.246000 ms
the inference time of openvino is 15.664000 ms
the inference time of openvino is 15.650000 ms
the inference time of openvino is 15.622000 ms
the inference time of openvino is 15.609000 ms
the inference time of openvino is 15.626000 ms
the inference time of openvino is 15.623000 ms
the inference time of openvino is 31.250000 ms
the inference time of openvino is 15.627000 ms
the inference time of openvino is 17.522000 ms
the inference time of openvino is 10.673000 ms
the inference time of openvino is 15.634000 ms
the inference time of openvino is 16.840000 ms
the inference time of openvino is 15.657000 ms
the inference time of openvino is 15.597000 ms
the inference time of openvino is 15.631000 ms
the inference time of openvino is 15.671000 ms
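As a side note, `time.perf_counter` is generally a steadier clock for latency measurement than `datetime.now()`. A minimal sketch of a timing harness with warm-up runs and aggregate statistics (the helper name and the warm-up count are illustrative, not OpenVINO API):

```python
import time
import statistics

def time_inference(infer_fn, n_iter=1000, n_warmup=10):
    """Time a callable over many iterations, discarding warm-up runs."""
    for _ in range(n_warmup):
        infer_fn()
    samples = []
    for _ in range(n_iter):
        t0 = time.perf_counter()
        infer_fn()
        samples.append((time.perf_counter() - t0) * 1000.0)  # milliseconds
    samples.sort()
    return {
        "median_ms": statistics.median(samples),
        "mean_ms": statistics.fmean(samples),
        "p95_ms": samples[int(0.95 * (len(samples) - 1))],
    }

# hypothetical usage with the objects from the post:
# stats = time_inference(lambda: exec_net.infer(inputs={input_blob: lr}))
```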

 

My test environment is as follows:

- OpenVINO => 2021.3
- Operating System / Platform => Windows 64 Bit
- Compiler => Visual Studio 2017
- Problem classification: Performance
- Model name: fsrcnn.xml
- Device: Intel(R) UHD Graphics 630 (GPU)

Iffa_Intel
Moderator

Greetings,


The fluctuation is expected and has actually been recorded before.


Previously, a customer reported that FPS fluctuates over multiple runs, and the issue was replicated using the configurations below:

 

a) Setup: benchmark_app and squeezenet

b) Number of inference requests: 1, 2, 4, 8, 16, 32, 64, 128

c) Number of runs for each: 32 (e.g., you run the benchmark_app 32 times with 32 inference requests and collect the logs)


The results show that fluctuations are to be expected for 8 to 24 concurrent inference requests. Below 8, there is always a completely free device to run inference on. Above 24, there are always 3 requests per device: each MYX (Myriad X) is loaded with 2 in flight and 1 on data transfer. In between, some tasks will get a free device and some will get a device where something is already running, so performance will fluctuate depending on minor timing differences between runs.

Once again this is not a bug but a property of the platform.



Sincerely,

Iffa

 




xiaoxiongli
Beginner

Dear Iffa, 

          Thank you very much for your reply! 

           You said "b) Number of inference requests: 1, 2, 4, 8, 16, 32, 64, 128". Does this mean you run several executables/inferences simultaneously on one computer? Actually, I only run one executable/inference at a time, so I think the computer's resources (CPU/GPU/memory) are free.

           You said "Before 8 there is always a completely free device to run inference," so a single application/inference should also get a free device. I am confused: since the device is free, why does the time fluctuate so much (most runs are about 15 ms, but some are about 30 ms)? Is this a Windows 10 issue?

           Because my application is a conferencing application, I need to process every frame in real time, and the fluctuation hurts a lot. Is there some low-level (e.g., OS) method to fix this?
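One generic way to tolerate occasional latency spikes in a real-time frame pipeline (not an OpenVINO-specific fix) is to run inference in a background thread behind a small bounded queue, so a single slow inference does not stall frame capture. A sketch, with a stubbed `infer` callable standing in for the real `exec_net.infer` call:

```python
import queue
import threading

def run_pipeline(frames, infer, maxsize=2):
    """Decouple frame capture from inference with a bounded queue.

    `infer` stands in for the real inference call (e.g. exec_net.infer);
    the bounded queue absorbs occasional slow inferences so capture only
    blocks if the worker falls more than `maxsize` frames behind.
    """
    q = queue.Queue(maxsize=maxsize)
    results = []

    def worker():
        while True:
            frame = q.get()
            if frame is None:      # sentinel: no more frames
                break
            results.append(infer(frame))

    t = threading.Thread(target=worker)
    t.start()
    for frame in frames:
        q.put(frame)
    q.put(None)
    t.join()
    return results
```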

           Please help me..., thank you!

 

Sincerely,

xiaoxiongli

 

IntelSupport
Community Manager

Hi Xiaoxiongli,

Sorry for the delay in replying to you. The performance fluctuation might happen on other operating systems too, since this is expected between runs. What we can suggest is to free up sufficient resources when running the inference. Sometimes, many background services can cause performance delays. Meanwhile, we are still trying to find the best possible solution for you, but you could try out our suggestion first.

 

Regards,

Aznie


xiaoxiongli
Beginner

Dear Iffa, 

          Thank you very much for your reply! 

          I will try that. If you find a better solution, please let me know. Thank you!

 

Sincerely,

xiaoxiongli

IntelSupport
Community Manager

Hi Xiaoxiongli,

Based on the OpenVINO Optimization Guide documentation, you need to build your performance conclusions on reproducible data. Do the performance measurements with a large number of invocations of the same routine. Since the first iteration is almost always significantly slower than the subsequent ones, you can use an aggregated value of the execution time for final projections:

  • If the warm-up run does not help or the execution time still varies, run a large number of iterations and average the results.
  • For time values that vary too much, use the geometric mean (geomean).
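The aggregation rules above can be sketched as follows (the helper name is illustrative; `statistics.geometric_mean` requires Python 3.8+):

```python
import statistics

def summarize(times_ms):
    """Aggregate per-iteration latencies, dropping the slow first run."""
    steady = times_ms[1:]  # first iteration usually includes one-time setup
    return {
        "mean_ms": statistics.fmean(steady),
        "median_ms": statistics.median(steady),
        # geomean is less distorted by a few large outliers
        "geomean_ms": statistics.geometric_mean(steady),
    }
```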

 

Also, refer to the Benchmark App for code examples of performance measurement. Almost every sample, except interactive demos, has the -ni option to specify the number of iterations.

 

Regards,

Aznie


IntelSupport
Community Manager

Hi Xiaoxiongli,

This thread will no longer be monitored since we have provided a solution. If you need any additional information from Intel, please submit a new question.


Regards,

Aznie

