Intel® Distribution of OpenVINO™ Toolkit
Community assistance with the Intel® Distribution of OpenVINO™ toolkit, OpenCV, and all aspects of computer vision on Intel® platforms.

NCS performance

idata
Employee

Hi all,

I am writing to ask for a clarification, split into two questions:

  1. According to https://software.intel.com/en-us/articles/mobilenets-on-intel-movidius-neural-compute-stick-and-raspberry-pi-3, MobileNet_v1_0.25_128 can (unofficially) reach 70 fps when running inference on the Movidius NCS. Does this mean that, instead of 70 frames per second on a single image stream, it is possible to process 2 streams of 128x128 images at 35 fps each, or 4 streams at about 17 fps each?

  2. The Movidius NCS has a nominal performance of over 80 GFLOPS (see https://en.wikipedia.org/wiki/Movidius, for instance). MobileNet_v1_0.50_160 (see https://research.googleblog.com/2017/06/mobilenets-open-source-models-for.html) requires 77 million MACs per inference, which, assuming 1 MAC = 2 FLOPs, is 154 million FLOPs. 80 billion / 154 million is more than 500, so I would expect to be able to process about 500 frames per second at 160x160 resolution with MobileNet_v1_0.50_160 on the NCS, whereas (see the link in question 1) the actual performance is about 50 fps. So, what is wrong with my reasoning?
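
The arithmetic in question 2 can be reproduced with a short Python sketch. It uses only the figures quoted above (80 GFLOPS nominal, 77 million MACs per frame, about 50 fps measured) and the stated convention of 1 MAC = 2 FLOPs; the function names are illustrative and not part of any NCS tooling.

```python
def peak_fps(peak_flops, macs_per_frame, flops_per_mac=2):
    """Upper bound on frames per second if every nominal FLOP were usable."""
    return peak_flops / (macs_per_frame * flops_per_mac)


def implied_utilization(measured_fps, peak_flops, macs_per_frame, flops_per_mac=2):
    """Fraction of the nominal FLOP rate that the measured frame rate accounts for."""
    return measured_fps * macs_per_frame * flops_per_mac / peak_flops


# Figures quoted in the question (nominal and unofficial, not measured here).
NOMINAL_FLOPS = 80e9   # "over 80 GFLOPS" nominal
MACS_PER_FRAME = 77e6  # MobileNet_v1_0.50_160
MEASURED_FPS = 50      # approximate reported frame rate

bound = peak_fps(NOMINAL_FLOPS, MACS_PER_FRAME)
util = implied_utilization(MEASURED_FPS, NOMINAL_FLOPS, MACS_PER_FRAME)

print(f"theoretical upper bound: {bound:.0f} fps")
print(f"implied utilization at {MEASURED_FPS} fps: {util:.1%}")
```

With these inputs the bound comes out at roughly 519 fps, and 50 fps corresponds to about 9.6% of the nominal figure. The sketch only quantifies the gap between the nominal peak and the reported rate; it does not say where the bottleneck is.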

 

Thank you in advance for your replies!

 

Kind regards,

 

Andrea
idata
Employee

@andreascaggiante

 

 

According to https://software.intel.com/en-us/articles/mobilenets-on-intel-movidius-neural-compute-stick-and-raspberry-pi-3, MobileNet_v1_0.25_128 can (unofficially) reach 70 fps when running inference on the Movidius NCS. Does this mean that, instead of 70 frames per second on a single image stream, it is possible to process 2 streams of 128x128 images at 35 fps each, or 4 streams at about 17 fps each?

 

 

Thanks for your interest in the NCS. Regarding your first question, you are correct with your calculations.
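
The split confirmed here is a plain division of the aggregate frame rate. A minimal sketch, assuming the roughly 70 fps budget is shared evenly across streams with no per-stream switching or transfer overhead (an idealization, not something measured in this thread); `per_stream_fps` is a hypothetical helper name:

```python
def per_stream_fps(total_fps, num_streams):
    """Evenly divide an aggregate inference rate across several image streams."""
    if num_streams < 1:
        raise ValueError("num_streams must be at least 1")
    return total_fps / num_streams


TOTAL_FPS = 70  # unofficial figure for MobileNet_v1_0.25_128 quoted above

for streams in (1, 2, 4):
    rate = per_stream_fps(TOTAL_FPS, streams)
    print(f"{streams} stream(s): {rate:.1f} fps each")
```

For 4 streams this gives 17.5 fps each, which matches the "17 fps" in the question once rounded down.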

idata
Employee

Thank you Tome@Intel for your reply!

 

Any comment on my second question (sorry for the broken indentation)? Why is my second computation wrong, and where is the actual bottleneck?

 

Thank you everybody
