Host: Ubuntu 20.04.1 AMD64
Device: Intel Neural Compute Stick 2
OpenVINO version: 2022.1.0
Build: 2022.1.0-7019-cdb9bec7210-releases/2022/1
First, I'm sorry that I cannot provide the models; they are proprietary.
The only information I can share is:
- The input and output size of the models is 1x3x1080x1920, in half precision (FP16).
- Inference results (all numbers are in milliseconds):
input (difference before and after set_input_tensor(), invoked once before the iterations): 0.0060
total (total inference time over 100 iterations): 14051.5560
infer (difference before and after infer(), per iteration): min: 123.6710, max: 159.7280, avg: 139.0808
real (sum of real_time over the ov::ProfilingInfo entries returned by get_profiling_info(), per iteration): min: 36.5470, max: 37.0310, avg: 36.7428
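For reference, the timings above can be reproduced with a measurement loop like the following. This is a minimal sketch, not the poster's actual code: the `run_infer` callable stands in for `InferRequest.infer()` on the NCS2 and is assumed to return the device-reported time (the sum of `real_time` over the `get_profiling_info()` entries).

```python
import time

def measure(run_infer, iterations=100):
    """Wall-clock timing around each (blocking) inference call.

    run_infer: stand-in for InferRequest.infer(); assumed to return
    the device-reported time in ms (sum of real_time over the
    ov::ProfilingInfo entries from get_profiling_info()).
    """
    wall_ms, device_ms = [], []
    total_start = time.perf_counter()
    for _ in range(iterations):
        t0 = time.perf_counter()
        reported = run_infer()  # blocks until inference completes
        wall_ms.append((time.perf_counter() - t0) * 1000.0)
        device_ms.append(reported)
    total = (time.perf_counter() - total_start) * 1000.0
    return total, wall_ms, device_ms
```

The per-iteration gap between `wall_ms` and `device_ms` is the ~100 ms difference described above: host-side overhead (tensor transfer over USB plus runtime bookkeeping) that the device-side profiling counters do not see.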
It may differ per model, but the inference time (the difference before and after infer()) takes about 100 ms longer than the real_time reported by the profiling API.
I checked several times that the NCS2 is plugged into a USB 3.0 port, and I cross-checked the results on other computers.
Does this time difference mean data transfer time? If so, can I reduce the transfer time without reducing the model input size?
Thanks for reaching out to us.
Besides the data transfer time, the time used to load the model onto the MYRIAD device (NCS2) should also be taken into consideration.
Since you are already using a USB 3.0 port, which speeds up data transfer compared to USB 2.0, you can further reduce the model load time by loading a Blob file instead of the Intermediate Representation (IR).
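For concreteness, a minimal sketch of the Blob path in the 2022.1 Python API. The `core` object and the blob path are assumptions here; the blob would have been produced beforehand with `CompiledModel.export_model()` or the compile_tool utility, whereas the IR path would instead call `core.compile_model(core.read_model("model.xml"), "MYRIAD")`.

```python
def import_blob(core, blob_path, device="MYRIAD"):
    """Import a pre-compiled blob, skipping the IR read-and-compile step.

    `core` is assumed to be an openvino.runtime.Core instance
    (OpenVINO 2022.1 API).
    """
    with open(blob_path, "rb") as f:
        # import_model() accepts the exported blob as a byte stream
        return core.import_model(f.read(), device)
```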
You may refer to this article for more details.
Thank you for answering my question.
I have a few more questions after reading the answer.
You said that the time used to load the model should be taken into consideration. Does MYRIAD load the same model again every time Infer() or StartAsync() is called? Also, I already use a blob model (sorry for not mentioning that). If so, does the model load time significantly affect the total time used to transfer data?
I might have misunderstood your inference results. My previous suggestion (loading a Blob file instead of IR) can help reduce the time of the whole inference process (model load plus inference).
In this case, could you elaborate further on your inference results?
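On the reload question above: as far as I understand the API, the model is pushed to the device once, when the compiled model is created (whether from IR or from a blob); each subsequent Infer()/StartAsync() only transfers the input and output tensors. A toy sketch of that split, with a hypothetical `DeviceStub` standing in for the MYRIAD compiled model:

```python
class DeviceStub:
    """Hypothetical stand-in for a compiled model on MYRIAD."""
    def __init__(self):
        self.loads = 0
        self.infers = 0

    def load_model(self):
        # One-time cost: IR compile or blob import pushes the network
        # to the device.
        self.loads += 1

    def infer(self, tensor):
        # Per-call cost: tensor transfer over USB plus device execution.
        self.infers += 1
        return tensor

device = DeviceStub()
device.load_model()          # paid once, before the timing loop
for frame in range(100):
    device.infer(frame)      # repeated cost: data transfer only
```

So the blob only shortens the one-time load, not the per-iteration gap between wall-clock infer time and the profiling API's real_time.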