Vineis__Chris
Beginner
79 Views

OpenVINO specify # of SHAVEs for NCS2/MYRIAD VPU

Hi,

When using the NCSDK to program the original NCS (MYRIAD 2), the mvNCCompile command had a -s option to specify the number of SHAVEs (up to 12). I can't find anything similar in the OpenVINO documentation. Does it exist? If not, does OpenVINO automatically use the maximum number of SHAVEs (16 for NCS2)?

Thanks.

5 Replies
Shubha_R_Intel
Employee

Dear Chris, 

Great question. Perhaps this post will help in the meantime while I research an answer:

https://software.intel.com/en-us/forums/computer-vision/topic/806247

Thanks for using OpenVINO!

Shubha

 

Vineis__Chris
Beginner

Hi Shubha,

Thanks for the link. I have been using the various performance checker options, which are helpful for gaining insight into each layer in the network. I haven't seen any of them report anything about the number of SHAVEs, though.

Thanks,

Chris

Shubha_R_Intel
Employee

Chris, this fine-grained control (i.e. the number of SHAVEs) is no longer available. OpenVINO selects the optimal configuration by default.

Hope it helps, and thank you for using OpenVINO!

Shubha

Vineis__Chris
Beginner

OK, thanks for looking into this.

-Chris

Vineis__Chris
Beginner

Hi Shubha,

As a follow-up, I'm wondering whether the speedups I've seen for Myriad X (MX, in the NCS2) vs. Myriad 2 (M2, in the original NCS) sound reasonable. I was expecting ~6-8x for MX vs M2, but I see more like 2-4x.

 

HP laptop with Intel Core i5-4310U CPU, 2.00 GHz x4

8 GB RAM

Ubuntu 16.04 LTS

 

Test #1: Sample code provided in the OpenVINO kit

  • Python / “classification_sample.py”
  • Downloaded googlenet-v3 (TensorFlow) and used the Model Optimizer to convert it to FP32 for CPU and FP16 for Myriad
  • Tested on sample/car_1.bmp (provided in OpenVINO)
  • CPU: 167 ms per inference
  • M2 (NCS): 334 ms per inference
  • MX (NCS2): 86 ms per inference
  • MX vs M2: 3.9x boost
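
For anyone wanting to reproduce per-inference times like the ones above, here is a minimal timing sketch. It is not the code used for these measurements: `time_inference` and its `infer` argument are hypothetical names, and `infer` is assumed to be any zero-argument callable that runs exactly one inference on the device under test. A few warm-up calls are discarded so one-time setup cost does not skew the result.

```python
import statistics
import time


def time_inference(infer, warmup=5, runs=50):
    """Return the median latency in milliseconds of a zero-argument callable.

    `infer` is a placeholder for whatever runs a single inference;
    it is an assumption of this sketch, not an OpenVINO API.
    """
    # Warm-up calls are executed but not timed.
    for _ in range(warmup):
        infer()

    samples_ms = []
    for _ in range(runs):
        start = time.perf_counter()
        infer()
        samples_ms.append((time.perf_counter() - start) * 1000.0)

    # The median is less sensitive to occasional outliers than the mean.
    return statistics.median(samples_ms)


if __name__ == "__main__":
    # Stand-in workload so the sketch runs without any device attached.
    def dummy_infer():
        return sum(range(10000))

    print(f"{time_inference(dummy_infer):.3f} ms per call")
```

Using the same harness for CPU, NCS, and NCS2 keeps the comparison consistent, since all three numbers then include the same overheads.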

 

Test #2: custom code (U-net CNN with conv2d, max_pool, and deconvolution/upsample):

  • Tested on 256x256x4 (R/Gr/Gb/B) image
  • CPU: 216 ms
  • M2 (NCS): 374 ms
  • MX (NCS2): 209 ms
  • MX vs M2: 1.8x boost
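
The "boost" figures in both tests are just ratios of the measured latencies. A small sketch of that arithmetic, using only the numbers posted above (the `speedup` and `summarize` helper names are made up for illustration):

```python
# Latencies in milliseconds, copied from the two tests in this post.
latencies_ms = {
    "Test #1 (googlenet-v3)": {"CPU": 167, "M2": 334, "MX": 86},
    "Test #2 (U-net)": {"CPU": 216, "M2": 374, "MX": 209},
}


def speedup(baseline_ms, candidate_ms):
    """How many times faster the candidate is than the baseline."""
    return baseline_ms / candidate_ms


def summarize(results):
    """Compute MX-vs-M2 and MX-vs-CPU speedups, rounded to one decimal."""
    summary = {}
    for name, t in results.items():
        summary[name] = {
            "MX vs M2": round(speedup(t["M2"], t["MX"]), 1),
            "MX vs CPU": round(speedup(t["CPU"], t["MX"]), 1),
        }
    return summary


if __name__ == "__main__":
    for name, ratios in summarize(latencies_ms).items():
        print(name, ratios)
```

This reproduces the 3.9x and 1.8x figures for MX vs M2, and also shows that in Test #2 the NCS2 is roughly on par with the CPU.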