Intel® Distribution of OpenVINO™ Toolkit
Community assistance for the Intel® Distribution of OpenVINO™ toolkit, OpenCV, and all aspects of computer vision on Intel® platforms.

LattePanda Alpha + OpenVINO + "CPU (Core m3) vs NCS1 vs NCS2", Performance comparison

idata
Employee
5,761 Views

Hello everyone.

 

The "UNet" model did not work in NCSDK, but it worked in OpenVINO.

 

"UNet" is a semantic segmentation model.

 

https://github.com/PINTO0309/TensorflowLite-UNet/raw/master/model/semanticsegmentation_frozen_person_32.pb

 

Interestingly, the CPU had better performance than both the Neural Compute Stick and the Neural Compute Stick 2.

 

For the moment, I do not see much practical value in the NCS2.

 

 

◆Japanese Article

 

Introducing Ubuntu 16.04 + OpenVINO on the LattePanda Alpha 864 (OS not included) and enjoying Semantic Segmentation with the Neural Compute Stick and Neural Compute Stick 2
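The CPU-vs-NCS-vs-NCS2 comparison above can be reproduced with a small benchmark loop. The sketch below is a hypothetical example against the (2019-era) OpenVINO Inference Engine Python API (`IECore`); the model paths and device names are assumptions, not taken from this thread.

```python
import time


def fps(frames, seconds):
    """Frames per second, guarding against a zero elapsed time."""
    return frames / seconds if seconds > 0 else 0.0


def benchmark(model_xml, model_bin, device="CPU", frames=100):
    # Deferred imports so the fps() helper stays usable without OpenVINO.
    import numpy as np
    from openvino.inference_engine import IECore

    ie = IECore()
    net = ie.read_network(model=model_xml, weights=model_bin)
    exec_net = ie.load_network(network=net, device_name=device)
    input_name = next(iter(net.input_info))
    shape = net.input_info[input_name].input_data.shape
    dummy = np.zeros(shape, dtype=np.float32)  # synthetic input frame
    start = time.time()
    for _ in range(frames):
        exec_net.infer(inputs={input_name: dummy})
    return fps(frames, time.time() - start)


# Hypothetical usage: run the same IR on each device and compare.
# for dev in ("CPU", "MYRIAD"):
#     print(dev, benchmark("unet.xml", "unet.bin", device=dev))
```

Running the same IR file on each device isolates the hardware difference, since the Model Optimizer output is device-independent.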
34 Replies

idata
Employee

I tried implementing semantic segmentation with "OpenVINO + DeeplabV3 + Core m3".

 

I got about 4-5 FPS.

 

However, the implementation is not beautiful...

 

https://github.com/PINTO0309/OpenVINO-DeeplabV3.git

 

https://youtu.be/CxxDwK7vBAo

 

https://youtu.be/-pXB3dDj-rQ

 

https://youtu.be/1NLCr5XnVX8

 

I referred to the following article.

 

https://medium.com/@oleksandrsavsunenko/optimizing-neural-networks-for-production-with-intels-openvino-a7ee3a6883d
idata
Employee

Intel(R) Core(TM) i7-8750H CPU @ 2.20GHz

 

CPU Only mode.

 

https://youtu.be/TjiH2dMltl4

idata
Employee

Hi, thank you all for the very detailed information. I am working on an image segmentation project and tried Mask R-CNN first. However, the inference time is quite long; I get about 0.5 FPS.

 

Today I came across @PINTO's post using DeepLab and was very impressed by its speed. I now get about 10 times the speed (5 FPS). However, you said that the implementation is not beautiful. Does that mean the performance drops a lot when the model is optimized with OpenVINO? How about other models like UNet? I saw you said it runs at about 1 FPS, but do you have any updates on it?

 

Thank you in advance,

idata
Employee

@dhoa

 

 

However, you said that the implementation is not beautiful. Does that mean the performance drops a lot when the model is optimized with OpenVINO?

 

 

No.

 

What I mean by "not beautiful" is the following.

 

     

  1. I replaced layers unsupported by OpenVINO with pure TensorFlow calls.

  2. I call TensorFlow a total of two times, for pre-processing and post-processing.

  3. Since the OpenVINO tutorial "Offloading Computations to TensorFlow" does not work properly, some layers run inference on the CPU without going through OpenVINO.

     https://software.intel.com/en-us/articles/OpenVINO-ModelOptimizer#offloading-computations-tensorflow

 

I just want to process all of the layers with OpenVINO functions.

 

I just wanted to say that the program is cumbersome and not concise.
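One way to find out up front which layers would have to fall back to TensorFlow is to query the device before loading the network. This is a minimal sketch assuming the `IECore.query_network` API; the model paths and layer names are hypothetical.

```python
def unsupported_layers(layer_names, supported_map):
    """Return the layers a device did not claim support for.

    `supported_map` mirrors the dict returned by IECore.query_network(),
    which maps each supported layer name to the device that can run it.
    """
    return [name for name in layer_names if name not in supported_map]


def check_device_support(model_xml, model_bin, device="MYRIAD"):
    # Deferred import: the pure helper above runs without OpenVINO.
    from openvino.inference_engine import IECore

    ie = IECore()
    net = ie.read_network(model=model_xml, weights=model_bin)
    supported = ie.query_network(network=net, device_name=device)
    return unsupported_layers(list(net.layers.keys()), supported)
```

If the returned list is empty, the whole graph can run through OpenVINO on that device with no TensorFlow fallback.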

 

 

How about other models like UNet?

 

 

Because UNet runs entirely with OpenVINO functions, it is very simple.

 

 

I saw you said it runs at about 1 FPS, but do you have any updates on it?

 

 

No. Since Intel does not publish the "caffemodel" and "solver", I would like to customize it, but I cannot.

idata
Employee

Thanks for your response @PINTO. Anyway, at the moment DeepLab is the best choice for speed, right? Is its performance advantage over other models like UNet or Mask R-CNN due to its light architecture or to the OpenVINO functions? In my case with Mask R-CNN, I find that OpenVINO doesn't improve the speed much.

 

Regards,

 

Hoa
idata
Employee

@dhoa

 

It is a difference in the performance of the models.

 

Accuracy: DeepLab < Mask R-CNN

 

A Japanese article comparing DeepLab vs. Mask R-CNN:

 

https://jyuko49.hatenablog.com/entry/2018/11/17/145904

 

However, the performance improvement of CPU inference with OpenVINO is spectacular (with MKL-DNN).

 

It seems that performance improves as the number of CPU cores and threads increases.

 

I think performance comparisons between different models have little meaning.

 

Accuracy and speed are always a trade-off.
idata
Employee

@PINTO thank you so much for your explanation. I have just one more question: at this stage, what about using a framework other than TensorFlow with OpenVINO? I am actually more familiar with PyTorch (fastai), and I see that OpenVINO does not yet support PyTorch directly; to use it, one needs to convert to ONNX. I tried once with SSD and could not run the Model Optimizer. What is your opinion on using PyTorch with OpenVINO? Sorry if this is not closely related to this thread, but I'm quite new to this, and advice from an expert is really valuable to me.

idata
Employee

@dhoa

 

 

I am actually more familiar with PyTorch (fastai), and I see that OpenVINO does not yet support PyTorch directly; to use it, one needs to convert to ONNX.

 

What is your opinion on using PyTorch with OpenVINO?

 

 

You do not have to stick to ONNX.

 

You can try converting to TensorFlow or Caffe by referring to the following.

 

Since the layers supported by OpenVINO vary depending on the source framework of the conversion, I recommend trying multiple conversion routes.

 

PyTorch -> Tensorflow -> OpenVINO

 

PyTorch -> Caffe -> OpenVINO

 

https://github.com/PINTO0309/Keras-OneClassAnomalyDetection#13-model-convert
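For completeness, the ONNX route mentioned above can also work. This is a minimal sketch of the PyTorch -> ONNX export plus the Model Optimizer invocation; the file names and the `mo.py` script path are assumptions for illustration, not confirmed by this thread.

```python
def export_to_onnx(model, input_shape, path="model.onnx"):
    """Export a PyTorch model to ONNX via tracing with a dummy input."""
    # Deferred import so mo_command() below runs without PyTorch.
    import torch

    model.eval()
    dummy = torch.randn(*input_shape)
    torch.onnx.export(model, dummy, path)
    return path


def mo_command(onnx_path, output_dir="."):
    """Argument list for invoking the Model Optimizer on the ONNX file
    (script name is a hypothetical placeholder for your install)."""
    return ["python3", "mo.py", "--input_model", onnx_path,
            "--output_dir", output_dir]
```

If the Model Optimizer rejects the ONNX graph (as reported above for SSD), the TensorFlow or Caffe routes may cover layers that the ONNX importer does not.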
idata
Employee

Thank you so much @PINTO

idata
Employee

Thank you for sharing the performance results. I wanted to follow up with another question regarding performance. From the results above and my own tests, I understand that an i7-7700 CPU is expected to run much faster than a single NCS. Has anyone tried running multiple NCS sticks with an i7 CPU and observed improved inference speed? If not, what are some possible ways to further improve the inference speeds observed on an i7 CPU with OpenVINO installed?

 

I am trying to run inference with Mobilenet/Inception-SSD.

idata
Employee

@sri

 

Due to thread-switching overhead, performance will not double when using two NCS sticks.

 

However, you will get about 1.5 times the performance.

 

Although I did not record videos, I have actually used two or more NCS2 sticks and confirmed the performance improvement.

 

MobileNet-SSD

 

Corei7 + NCS2 x1 (48 FPS) 320x240

 

https://github.com/PINTO0309/MobileNet-SSD-RealSense#usb-camera-mode-ncs2-x-1-stick--core-i7asynchronous-screen-drawing--multistickssdwithrealsense_openvino_ncs2py

 

https://www.youtube.com/watch?v=Nx_rVDgT8uY

 

The following is a video of full-size YOLO v3 boosted with NCS2 (x4).

 

Corei7 + NCS2 x1 (4 FPS) -> Corei7 + NCS2 x4 (13 FPS)

 

I do not really recommend using the NCS2 if the size of the model exceeds 100 MB.

 

https://www.youtube.com/watch?v=AT75LBIOAck

 

When using multiple sticks, inference becomes very fast, but the image transfer rate may be slow and become a bottleneck.

 

Depending on the resolution, if you are using a USB 2.0 interface, for example, you will not get performance above a certain FPS.
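The multi-stick setup described above can be sketched as loading one executable network per stick and spreading frames across them round-robin. This is a simplified synchronous sketch with assumed file paths; it relies on each `load_network("MYRIAD")` call claiming a free stick, which is an assumption based on how multiple NCS2 devices enumerate.

```python
import itertools


def round_robin(n_sticks):
    """Cycle stick indices 0..n-1 so frames spread across devices."""
    return itertools.cycle(range(n_sticks))


def load_on_sticks(model_xml, model_bin, n_sticks):
    # Deferred import: round_robin() above is testable without OpenVINO.
    from openvino.inference_engine import IECore

    ie = IECore()
    net = ie.read_network(model=model_xml, weights=model_bin)
    # One ExecutableNetwork per physical stick.
    return [ie.load_network(network=net, device_name="MYRIAD")
            for _ in range(n_sticks)]


def infer_stream(exec_nets, frames, input_name):
    """Dispatch frames across executable networks in round-robin order."""
    order = round_robin(len(exec_nets))
    return [exec_nets[next(order)].infer(inputs={input_name: f})
            for f in frames]
```

A real pipeline would use `start_async` requests instead of blocking `infer` calls so that camera capture, USB transfer, and inference overlap; as noted above, USB 2.0 transfer can still cap the total FPS.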
idata
Employee

Thanks for replying @PINTO. Have you tried using the NCS instead of the NCS2? I tried doing so and saw a very low FPS when comparing the CPU alone vs. the CPU with the NCS. It just seems that it is better not to use the NCS at all. An i7-7700 CPU without the NCS gives me a much higher inference rate than running with MYRIAD. I don't seem to see any performance improvement by running the NCS alongside the CPU. Is that something you have observed as well?

idata
Employee

@sri

 

 

It just seems that it is better not to use the NCS at all.

 

 

For the NCS (first generation), you are right.

 

 

I don't seem to see any performance improvement by running the NCS alongside the CPU. Is that something you have observed as well?

 

 

Yes. That's right.

 

The performance of the Myriad 2 (NCS) is simply too low.

 

Instead, CPU optimization with OpenVINO demonstrates very good performance.
idata
Employee

@chicagobob123 @PINTO Correct me if I'm wrong, but I thought you'd get no performance gain on the Z8350 because it uses Gen8-LP. Am I right?
