Intel® Distribution of OpenVINO™ Toolkit
Community support and discussions about the Intel® Distribution of OpenVINO™ toolkit, OpenCV, and all things computer vision-related on Intel® platforms.

Getting inference time of NCS2 in Async mode

BuuVo
New Contributor I

Hi everyone, I want to measure the time of each inference in async mode so that I can capture the moment when throttling happens. Unfortunately, I have run into some trouble with this.

In sync mode, the inference time is easily computed with:

begin = time.time()
outputs = exec_net.infer(inputs = {input_blob: image})
print("Inference time: {}".format(time.time() - begin))

The measured inference time is 0.3678 s.
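A single `time.time()` reading can be noisy, so the synchronous baseline is often taken as an average over several runs after a warm-up call. The following is only a sketch of that idea: `time_sync_inference` and `fake_infer` are hypothetical names, and `fake_infer` is a sleep-based stand-in for a blocking call such as `exec_net.infer(...)`, not a real device call.

```python
import time

def time_sync_inference(infer_fn, warmup=1, runs=10):
    """Return (mean, worst) latency in seconds of a blocking callable."""
    for _ in range(warmup):
        infer_fn()  # the first call may include one-time setup cost
    latencies = []
    for _ in range(runs):
        start = time.perf_counter()  # monotonic, high-resolution clock
        infer_fn()
        latencies.append(time.perf_counter() - start)
    return sum(latencies) / len(latencies), max(latencies)

if __name__ == "__main__":
    # Hypothetical stand-in for:
    #   lambda: exec_net.infer(inputs={input_blob: image})
    fake_infer = lambda: time.sleep(0.05)
    mean_s, worst_s = time_sync_inference(fake_infer, runs=5)
    print("mean {:.4f} s, worst {:.4f} s".format(mean_s, worst_s))
```

Passing the real blocking call as `infer_fn` would give a baseline to compare later measurements against.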

In async mode, I use 4 NCS2 devices in my program. This is the code I use to compute the inference time:

# Record the start time for the device this worker drives
if self.device == 'MYRIAD.1.1.1.1-ma2480':
    pre_time1 = time.time()
elif self.device == 'MYRIAD.1.1.1.2-ma2480':
    pre_time2 = time.time()
elif self.device == 'MYRIAD.1.1.1.3-ma2480':
    pre_time3 = time.time()
else:
    pre_time4 = time.time()
self.exec_net.start_async(request_id=self.next_request_id, inputs={input_blob: image})
# Wait for the request in the current slot to finish
if self.exec_net.requests[self.current_request_id].wait(-1) == 0:
    res = self.exec_net.requests[self.current_request_id].outputs
    if self.device == 'MYRIAD.1.1.1.1-ma2480':
        infertime1 = time.time() - pre_time1
    elif self.device == 'MYRIAD.1.1.1.2-ma2480':
        infertime2 = time.time() - pre_time2
    elif self.device == 'MYRIAD.1.1.1.3-ma2480':
        infertime3 = time.time() - pre_time3
    else:
        infertime4 = time.time() - pre_time4
# Swap the request IDs
self.current_request_id, self.next_request_id = self.next_request_id, self.current_request_id

When I run this code, the reported inference time is always between 0.01 and 0.04 seconds. I have no idea why it is so far from the synchronous result.

How can I fix this so that it reports the correct inference time?

Thank you,

Buu.

5 Replies
IntelSupport
Community Manager

Hi Buu Vo,

What is your expected inference time? I would suggest running the inference with the benchmark app to compare the inference times. Also, check out this documentation on how to work with the Asynchronous Inference Request. My suggestion is to use the infer() or start_async() function in your script. Check these functions here.

 

Regards,

Aznie


BuuVo
New Contributor I

Dear Aznie,

I want to use 4 NCS2 devices simultaneously to run 4 inferences at the same moment in 4 threads. Because infer() blocks execution, only one inference runs at a time, so I use asynchronous mode to let the 4 NCS2 devices run inference simultaneously. However, I have a problem calculating the inference time. My steps are: start the timer -> call start_async -> wait for the inference to finish -> compute the elapsed time. The inference time for one image is 0.37 s in synchronous mode but 0.04 s in asynchronous mode, so I think something is wrong in my timing script; a single prediction cannot take only 0.04 s.
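For the per-device bookkeeping, one option is to keep the measurements in a dictionary keyed by device name instead of in four separate variables, and to flag a sample that is much slower than an early baseline. This is only a sketch: `LatencyTracker`, its parameters, and the 1.5x threshold are illustrative assumptions, not part of OpenVINO, and it does not by itself make the timings correct.

```python
import statistics
from collections import defaultdict

class LatencyTracker:
    """Per-device latency log with a simple slowdown flag (illustrative only)."""

    def __init__(self, baseline_samples=5, slowdown_factor=1.5):
        self.baseline_samples = baseline_samples  # samples used as the baseline
        self.slowdown_factor = slowdown_factor    # arbitrary threshold, tune as needed
        self.samples = defaultdict(list)          # device name -> list of seconds

    def record(self, device, seconds):
        """Store one latency; return True if it looks like a slowdown."""
        history = self.samples[device]
        history.append(seconds)
        if len(history) <= self.baseline_samples:
            return False  # still collecting the baseline
        baseline = statistics.median(history[:self.baseline_samples])
        return seconds > self.slowdown_factor * baseline

if __name__ == "__main__":
    tracker = LatencyTracker(baseline_samples=3)
    device = "MYRIAD.1.1.1.1-ma2480"
    for seconds in (0.37, 0.36, 0.38, 0.39, 0.80):
        slow = tracker.record(device, seconds)
        print("{:.2f} s slow={}".format(seconds, slow))
```

Each thread would call `record(self.device, elapsed)` once per completed request; a shared tracker used from several threads would additionally need a lock.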

IntelSupport
Community Manager

Hi BuuVo,

We are investigating this and will get back to you with an update as soon as possible.


Regards,

Aznie



IntelSupport
Community Manager

 

Hi Buu Vo,

According to the documentation for the Object Detection SSD Python* Demo, Async API Performance Showcase, it is not possible to measure the execution time of an infer request that is running asynchronously unless you call Wait immediately after StartAsync and time that call. However, this essentially means serialization and synchronous execution. This is what the demo does in the default "SYNC" mode and reports as the "Detection time/FPS" message on the screen. In the truly asynchronous ("ASYNC") mode, the host continues execution in the main thread, in parallel with the infer request. If the request completes before Wait is called in the main thread (i.e. before OpenCV has decoded a new frame), reporting the time between StartAsync and Wait would obviously be incorrect. That is why the inference speed is not reported in "ASYNC" mode; it is reported in "SYNC" mode only. To get the inference time for "ASYNC", I would suggest running the inference with the benchmark app and comparing the results.
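The effect can be reproduced without any device. The sketch below is a pure-Python simulation, not OpenVINO code: `fake_inference`, `run_pipeline`, and the two timing constants are made-up stand-ins, and a thread pool plays the role of the two request slots. It starts request N+1 and then waits on request N, as the posted script does, and compares the time taken that way with the time from the start of request N to the return of its own wait. Even the second figure is only an upper bound, because the wait may be called after the request has already finished.

```python
import time
from concurrent.futures import ThreadPoolExecutor

INFER_SECONDS = 0.10  # assumed device time per request (made-up value)
HOST_SECONDS = 0.08   # assumed host time per frame, e.g. decoding (made-up value)

def fake_inference():
    """Hypothetical stand-in for the work the device does for one request."""
    time.sleep(INFER_SECONDS)

def run_pipeline(num_frames=4):
    """Two-slot pipeline: start request N+1, then wait for request N."""
    apparent, per_request = [], []
    with ThreadPoolExecutor(max_workers=2) as pool:
        begin = time.perf_counter()
        t_prev = begin
        previous = pool.submit(fake_inference)   # plays the role of start_async
        for _ in range(num_frames):
            time.sleep(HOST_SECONDS)             # host prepares the next frame
            t_next = time.perf_counter()
            upcoming = pool.submit(fake_inference)
            previous.result()                    # plays the role of wait(-1)
            done = time.perf_counter()
            apparent.append(done - t_next)       # start of N+1 -> completion of N
            per_request.append(done - t_prev)    # start of N -> completion of N
            previous, t_prev = upcoming, t_next
        previous.result()                        # drain the last request
        elapsed = time.perf_counter() - begin
    throughput = (num_frames + 1) / elapsed      # completed requests per second
    return apparent, per_request, throughput

if __name__ == "__main__":
    apparent, per_request, throughput = run_pipeline()
    print("apparent   :", ["{:.3f}".format(x) for x in apparent])
    print("per request:", ["{:.3f}".format(x) for x in per_request])
    print("throughput : {:.1f} requests/s".format(throughput))
```

In this simulation the "apparent" values come out far below the simulated inference time, while the per-request values never fall below it. In a pipelined setup, throughput (completed requests divided by total elapsed time) is usually the more meaningful figure to track.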

 

Regards,

Aznie


IntelSupport
Community Manager

Hello Buu Vo,

This thread will no longer be monitored since we have provided a solution. If you need any additional information from Intel, please submit a new question.


Regards,

Aznie

