idata
Community Manager
981 Views

RaspberryPi3 + OpenVINO + NCS2 + Single Thread + MobileNet-SSD, implemented.

It runs at about 10 FPS.

 

https://github.com/PINTO0309/MobileNet-SSD-RealSense/blob/master/SingleStickSSDwithUSBCamera_OpenVIN...

 

On a Core i7 it was 15 FPS.

 

Next, I will add a MultiProcess + MultiStick implementation.
idata
Community Manager
426 Views

Core i7 + NCS2 [21 FPS]

 

https://youtu.be/1ogge90EuqI
idata
Community Manager
426 Views

Hello PINTO,

 

I tried to run vehicle-detection-adas-0002 on a Raspberry Pi with NCS2 and its performance was pretty poor (~4 FPS). Then I ran the Python script on an i7 machine and the performance was also poor (6.46 FPS). Then I used NCS with OpenVINO and got 7.44 FPS, which I don't understand. How is that possible?

I tested this on a video captured with a GoPro. I haven't tested with a webcam yet, but I still don't understand how the NCS can perform better than the NCS2.

 

Did you encounter any specific issues that you had to solve in order to achieve such performance? I would appreciate any suggestions in order to achieve better FPS rate.

idata
Community Manager
426 Views

@nikogamulin

 

 

Then I tried to run the python script on i7 machine and the performance was also poor (6.46 FPS).

 

 

Our benchmarks show that NCS2 is slower than Atom / Core i7 / Core i5 / Core m3.

When OpenVINO runs on the CPU, the MKLDNN plugin seems to greatly optimize internal processing.

NCS2 only outperforms Celeron CPUs.

 

 

Then I tried to use NCS with OpenVINO and the speed was 7.44 FPS, which I don't understand, how is that possible.

 

 

Here are the configurations in order of performance, fastest first.

  1. OpenVINO + GPU (FP16)
  2. OpenVINO + Intel's CPU (Core i7 or Core i5 or Atom) (FP32)
  3. OpenVINO + Intel's CPU (Core i7 or Core i5 or Atom) + NCS2 (FP16)
  4. OpenVINO + armv7l + NCS2 (FP16)
  5. OpenVINO + armv7l + NCS (FP16)
  6. OpenVINO + armv7l

 

The key points are as follows.

  1. NCS2 is slower than Intel's CPU.
  2. NCS / NCS2 shows its strength when combined with a low-performance CPU such as the one in RaspberryPi.
  3. When combined with a high-performance CPU, the performance of NCS / NCS2 is very poor.
  4. When an Intel CPU is used, inference seems to be parallelized within the CPU across the number of cores times the number of threads.

NCS and NCS2 are pointless unless the environment they are used in is chosen carefully.
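To check where a given configuration spends its time, it can help to time each stage of the loop separately instead of looking only at the final FPS. The following is a minimal sketch, not code from the linked repository: `measure_stage_times` and the three `fake_*` stages are hypothetical stand-ins that sleep instead of calling OpenVINO, so the numbers it prints say nothing about real hardware.

```python
import time


def measure_stage_times(frames, preprocess, infer, postprocess):
    """Run every frame through three stages and time each stage separately."""
    totals = {"preprocess": 0.0, "infer": 0.0, "postprocess": 0.0}
    count = 0
    for frame in frames:
        t0 = time.perf_counter()
        blob = preprocess(frame)
        t1 = time.perf_counter()
        raw = infer(blob)
        t2 = time.perf_counter()
        postprocess(raw)
        t3 = time.perf_counter()
        totals["preprocess"] += t1 - t0
        totals["infer"] += t2 - t1
        totals["postprocess"] += t3 - t2
        count += 1
    total_time = sum(totals.values())
    fps = count / total_time if total_time > 0 else 0.0
    return fps, totals


# Hypothetical stand-in stages: they sleep instead of doing real work.
def fake_preprocess(frame):
    time.sleep(0.001)
    return frame


def fake_infer(blob):
    time.sleep(0.002)
    return blob


def fake_postprocess(raw):
    time.sleep(0.001)
    return raw


fps, totals = measure_stage_times(range(20), fake_preprocess, fake_infer, fake_postprocess)
print(f"{fps:.1f} FPS")
for name, seconds in totals.items():
    print(f"{name}: {seconds / 20 * 1000:.2f} ms per frame")
```

To use it on a real setup, replace the three stand-ins with your own resize/convert step, your inference call, and your box-decoding step.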

idata
Community Manager
426 Views

None of this makes any sense. How is the stick useful then?

idata
Community Manager
426 Views

PINTO, have you tried 3 Movidius NCS2 sticks?

idata
Community Manager
426 Views

@chicagobob123

 

No, I have not tried it yet.

 

I just started implementing MultiStick yesterday.

 

However, it is not a difficult task, so I intend to commit to Github within a few days.
idata
Community Manager
426 Views

After the vehicle example using the NCS2 performed so poorly compared to an old i5, I was confused. I'm going to try to set up an Atom CPU this week.

idata
Community Manager
426 Views

@chicagobob123

 

 

PINTO, have you tried 3 Movidius NCS2 sticks?

 

 

RaspberryPi3 + NCS2.

 

NCS2 x2 ---> 15 FPS

 

NCS2 x3 ---> 20 FPS

 

NCS2 x4 ---> 24 FPS

 

The OpenVINO API is inconvenient.

MultiProcess cannot be used efficiently.
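A quick way to read the multi-stick numbers is to compute speedup and per-stick efficiency. This is only a sketch of the arithmetic: the 1-stick baseline is the "about 10 FPS" single-thread figure from the opening post, which is approximate and may not come from the same implementation as the multi-stick runs.

```python
# FPS figures quoted in this thread for RaspberryPi3 + NCS2 (MobileNet-SSD).
# The 1-stick value is the approximate "about 10 FPS" from the opening post
# (an assumption as a baseline; it was a single-thread implementation).
measured_fps = {1: 10.0, 2: 15.0, 3: 20.0, 4: 24.0}


def scaling_table(fps_by_sticks):
    """Return (sticks, fps, speedup, efficiency) rows relative to one stick."""
    base = fps_by_sticks[1]
    rows = []
    for n in sorted(fps_by_sticks):
        fps = fps_by_sticks[n]
        speedup = fps / base          # how many times faster than one stick
        efficiency = speedup / n      # fraction of ideal linear scaling
        rows.append((n, fps, speedup, efficiency))
    return rows


for n, fps, speedup, eff in scaling_table(measured_fps):
    print(f"{n} stick(s): {fps:5.1f} FPS, speedup {speedup:.2f}x, efficiency {eff:.0%}")
```

Under that baseline, four sticks give roughly 2.4 times the throughput of one, so each added stick contributes less than the previous one.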
idata
Community Manager
426 Views

With about $300 of NCS2 sticks, there is something very wrong. The ROI is gone and the results are subpar. I don't think the sticks are useful, do you? You can buy a LattePanda at $299 and probably do better.

idata
Community Manager
426 Views

@chicagobob123

 

I deliberately ran a meaningless verification.

Actually, I knew that an ARM processor could not maximize the performance of NCS.

And I realized that the ROI was the worst shortly after purchasing NCS2.

As you say, I understand that it is better to use LattePanda Delta / Alpha.

I deliberately showed the worst benchmark so that engineers around the world will not make the wrong choice.

I borrowed 3 of the 4 NCS2 sticks I used for the test, so the loss is small.

I am not making products; I am just a stupid hobby programmer.
idata
Community Manager
426 Views

You're anything but stupid. The results you have shown have made me move in a different direction. I don't see the hardware Movidius provides as useful when you want optimal performance. The sticks are low power, which may be useful, and if you are not concerned with FPS, as in a doorbell video sensor, I think you're OK. But if you want to use one on a drone or anything that moves quickly, I am not sure it's useful.

Thinking about this, do you think Movidius can speed up their chip to make it a useful coprocessor? I guess I just don't understand what the bottleneck is. I like the idea of a low-cost coprocessor that can handle the inference.

idata
Community Manager
426 Views

@chicagobob123

 

 

as in a doorbell video sensor, I think you're OK. But if you want to use one on a drone or anything that moves quickly, I am not sure it's useful.

 

 

I think the same thing.

 

 

do you think Movidius can speed up their chip to make it a useful coprocessor? I guess I just don't understand what the bottleneck is. I like the idea of a low-cost coprocessor that can handle the inference.

 

 

I believe that proper performance will not be obtained unless MyriadX is integrated into an SoC.

By the way, I am now interested in the following devices.

 

https://aiyprojects.withgoogle.com/edge-tpu

 

https://www.arrow.com/en/products/eic-ms-vision-500/einfochips-limited

 

https://www.intrinsyc.com/open-q-605-single-board-computer/
idata
Community Manager
426 Views

The first one, the Edge TPU, at least has specs. It works with:

  • MobileNet V1/V2: 224x224 max input size; 1.0 max depth multiplier
  • MobileNet SSD V1/V2: 320x320 max input size; 1.0 max depth multiplier
  • Inception V1/V2: 224x224 fixed input size
  • Inception V3/V4: 299x299 fixed input size

Any 640x480 camera can supply those input sizes. HD is not needed.

The third item is kind of pricey; $429 seems like a lot.

idata
Community Manager
426 Views

@chicagobob123

 

Thank you, bob.

 

I think I will try the following. The price is affordable and the performance is high.

About $78

  • 16.8 TOPS @ 700 mW
  • 24 TOPS/Watt
  • 16.8 TOPS @ 300 MHz
  • There is a USB-type development kit
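The quoted figures are consistent with each other, which a few lines of arithmetic can confirm. This sketch only recomputes numbers from the list above; the operations-per-cycle value is derived here and is not a figure stated by the vendor in this thread.

```python
def tops_per_watt(tops, milliwatts):
    """Efficiency in TOPS per watt, given throughput in TOPS and power in mW."""
    return tops / (milliwatts / 1000.0)


def ops_per_cycle(tops, clock_mhz):
    """Operations per clock cycle implied by a TOPS figure at a given clock."""
    return (tops * 1e12) / (clock_mhz * 1e6)


# Figures quoted in the post: 16.8 TOPS at 700 mW and at 300 MHz.
print(f"{tops_per_watt(16.8, 700):.1f} TOPS/W")
print(f"{ops_per_cycle(16.8, 300):.0f} operations per cycle")
```

The first result matches the advertised 24 TOPS/Watt. These are peak datasheet numbers, so they say nothing about end-to-end FPS on a real model.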

 

https://www.gyrfalcontech.ai/solutions/2801s/

 

https://ja.aliexpress.com/store/product/Orange-Pi-AI-Stick-2801-Neural-Network-Computing-Stick-Artif...
idata
Community Manager
426 Views

Hi @PINTO ,

 

A few days ago I got an NCS2, and when I ran a sample image-classification demo on RaspberryPi + NCS2, I got unexpectedly bad performance. Then I found your GitHub and forum discussions.

Do you know how it is possible that the official NCS2 page reports a much larger number? Can you run any project to confirm the Movidius benchmark results?
idata
Community Manager
426 Views

@hamzeah

 

 

Do you know how it is possible that on official NCS2 page, a large and different number is reported?

 

 

Yes. I know.

 

Intel's benchmark results were obviously not obtained on an ARM processor + USB 2.0.

As long as RaspberryPi3 is used, the 8x performance will definitely not be obtained, because the load of preprocessing and post-processing is high.

To maximize performance, it is better to use an SBC with an Intel processor than an SBC with an ARM processor.

OpenVINO + NCS2 is optimized for Intel processors.

 

 

can you run any project to confirm Movidius benchmark results?

 

 

I have never seen such a benchmark.

 

However, with some careful design of the logic, 24 FPS can be obtained with a single NCS2, even with MobileNet-SSD + RaspberryPi3.
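The thread does not say what logic is meant here. One plausible reading, offered only as an assumption, is overlapping the host-side stages with inference on the stick. In a single-threaded loop the stages run back to back, so frame time is their sum; if they overlap, throughput is bounded by the slowest stage. The stage timings below are hypothetical placeholders, not measurements.

```python
def serial_fps(stage_seconds):
    """FPS when stages run one after another for each frame."""
    return 1.0 / sum(stage_seconds)


def pipelined_fps(stage_seconds):
    """Upper bound on FPS when stages overlap: limited by the slowest stage."""
    return 1.0 / max(stage_seconds)


# Hypothetical per-frame stage times in seconds (illustrative only).
stages = {"preprocess": 0.030, "inference": 0.040, "postprocess": 0.030}
times = list(stages.values())

print(f"serial:    {serial_fps(times):.1f} FPS")
print(f"pipelined: {pipelined_fps(times):.1f} FPS (upper bound)")
```

The pipelined figure is an ideal bound. It ignores queueing overhead, USB transfer time, and contention for the Raspberry Pi's CPU cores.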