I built a small MNIST test network to get my head around the Movidius NCS. Everything went fine and the kit is promising.
However, I noticed a small thing regarding the running time: the profiler estimated the inference time on the Movidius to be 4.3 ms. However, when I measure the actual time from when I finish loading the tensor to when GetResult() stops blocking, I get 8 ms (about twice what the profiler predicted). Is there a reason for this? And how should I think about this when scaling up to other applications I am considering where running time is a constraint?
best,
Maen
@maen Please refer to my comment @ https://ncsforum.movidius.com/discussion/709/what-is-the-different-between-mvncprofile-inference-time-and-python-time-time
@Tome_at_Intel Thanks for the reply. I saw your comment, but as I mentioned, I measure the time after the transmission is done, that is:
graph.LoadTensor(img, 'user object')
ts = time.time()
graph.GetResult()
tf = time.time()
print(tf - ts)
I thought that would give me the inference time only, but it still differs from the inference time reported by the profiler. Is there any pre-processing happening on the Movidius before the inference that is not accounted for by the profiler?
@maen The correct way to time inference is to time LoadTensor() and GetResult() together, not just GetResult(). As I mentioned in my post here, mvNCProfile's inference time is strictly the time spent processing on the chip and does not account for any overhead such as USB transfer time. If you measure both LoadTensor() and GetResult(), the result will usually be longer than what mvNCProfile reports, because it includes USB transfer time and a small amount of wait time due to GetResult() being a blocking call. LoadTensor() queues the image for inference, and GetResult() is a blocking call that waits for that inference to complete.
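Following that advice, here is a minimal timing sketch. To keep it runnable without a device, the NCSDK v1 pair `graph.LoadTensor(...)` / `graph.GetResult()` is hidden behind a stand-in `run_inference` helper (a hypothetical name, not part of the NCSDK); averaging over several runs also helps smooth out USB-transfer jitter:

```python
import time

def run_inference(graph, img):
    # Stand-in for the NCSDK v1 call pair:
    #   graph.LoadTensor(img, 'user object')
    #   output, userobj = graph.GetResult()
    # Replace this body with those two calls when running on a real NCS.
    time.sleep(0.008)  # simulate ~8 ms end-to-end latency for this sketch
    return None

def time_inference(graph, img, runs=10):
    """Average the full request time (tensor upload + blocking result
    fetch) over several runs, using a monotonic high-resolution clock."""
    start = time.perf_counter()
    for _ in range(runs):
        run_inference(graph, img)
    return (time.perf_counter() - start) / runs

avg_ms = time_inference(None, None) * 1000
print(f"average end-to-end latency: {avg_ms:.1f} ms")
```

The key point is that the timer spans both calls, so the measurement matches what an application actually waits for per frame, not just the on-chip compute that mvNCProfile reports.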
