Intel® Distribution of OpenVINO™ Toolkit
Community support and discussions about the Intel® Distribution of OpenVINO™ toolkit, OpenCV, and all things computer vision-related on Intel® platforms.

How to optimize models for Myriad VPU



I have a question.
I created a neural network model for the OAK-D (Myriad VPU).
I tried to run a model optimization (UINT8) in DL Workbench to speed up this model.
However, I received a message from DL Workbench that the optimization could not be performed on this device (when converting the optimized model to a blob file):

Unsupported layer type : FakeQuantize

I investigated, and it turned out that the OAK-D (VPU) does not support models optimized to the UINT8 type.

Is it possible to use DL Workbench to optimize models for Myriad VPU?
Also, is there any way to optimize the model other than DL Workbench?


@Tatsu thank you for your question.


Your investigation has led you to the right conclusion: INT8 models cannot currently be executed on VPU devices.
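As a quick way to confirm whether a given IR still contains the quantization layers behind the "Unsupported layer type" message, here is a minimal sketch. It assumes the IR .xml lists each operation as a `<layer>` element with a `type` attribute; the sample XML and its layer names are made up for illustration and do not come from a real model.

```python
import xml.etree.ElementTree as ET

# Tiny stand-in for an IR .xml file (hypothetical layer names, not a real model).
SAMPLE_IR = """<net name="toy" version="10">
  <layers>
    <layer id="0" name="input" type="Parameter" version="opset1"/>
    <layer id="1" name="conv1/fq_weights" type="FakeQuantize" version="opset1"/>
    <layer id="2" name="conv1" type="Convolution" version="opset1"/>
  </layers>
</net>"""


def find_layers_of_type(ir_xml_text, layer_type="FakeQuantize"):
    """Return the names of all <layer> elements whose type matches layer_type."""
    root = ET.fromstring(ir_xml_text)
    return [
        layer.get("name", "")
        for layer in root.iter("layer")
        if layer.get("type") == layer_type
    ]


if __name__ == "__main__":
    # An empty list would suggest the IR was not INT8-calibrated.
    print(find_layers_of_type(SAMPLE_IR))
```

For a real model, you would read the .xml file's text and pass it in; a non-empty result suggests the model was calibrated and will likely hit the same error on the VPU.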


However, that does not mean there are no other techniques you can apply to get maximum inference performance from your model.

You should definitely try searching for the optimal inference configuration for your model on your VPU device in DL Workbench. You can do this by trying various combinations of batch and stream values. You can find more details on the official website. Also, you can try reducing the input shape of your model during the import stage; that might lead to better performance at the cost of reduced accuracy. However, this is possible only if your model is fully reshapeable.
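The batch and stream search described above can be sketched outside DL Workbench as a simple grid sweep. This is only an illustration of the idea, not DL Workbench's implementation: `run_once` is a hypothetical callable that you would write yourself to run inference with a given batch size and stream count on your device and return the number of frames it processed, and `clock` is injectable only so the sweep can be checked without real hardware.

```python
import time
from itertools import product


def sweep_configs(run_once, batches, streams, repeats=3, clock=time.perf_counter):
    """Measure throughput (frames per second) for every batch/stream pair.

    run_once(batch, n_streams) is a user-supplied (hypothetical) function that
    performs inference with that configuration and returns the number of
    frames processed. Returns the best pair and the full results table.
    """
    results = {}
    for batch, n_streams in product(batches, streams):
        start = clock()
        frames = 0
        for _ in range(repeats):
            frames += run_once(batch, n_streams)
        elapsed = clock() - start
        # Guard against a zero reading from a coarse clock.
        results[(batch, n_streams)] = frames / elapsed if elapsed > 0 else float("inf")
    best = max(results, key=results.get)
    return best, results
```

The sweep leaves the model file untouched; the winning pair is what you would then use when configuring inference in your application.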


Can you tell us more about your use case? What model are you using? What hardware are you using: a single Neural Compute Stick 2 or the Intel® Vision Accelerator Design with Intel® Movidius™? Is a VPU a strong requirement for you?



Alexander Demidovskij


Thank you for your response and research.

Is "Group Inference" a tool for changing stream and batch size combinations and performing performance evaluations?
Is it possible to get the optimized model file by using "Group Inference" in the same way as when optimizing the model?

As for the use case, we use the following model to estimate the position and orientation of an object from an RGBD camera.
The model takes point cloud information (1000 points) and an RGB image as input.

The device is an OAK-D (VPU).
I want to run inference as fast as possible on the OAK-D without using an external GPU.


@Tatsu finding the optimal configuration does not change your model, whereas INT8 calibration produces a new, optimized version of the model. With Group Inference you can find the optimal execution configuration, which you then apply in your application to the way you run inference on your model. Would you try that? Do you need additional help? If so, reach out via GitHub or the DL Workbench Gitter, or we can stay here and continue discussing any relevant topics.


BTW, where can I find the trained model? What is the input image size?



Demidovskij Alexander

Community Manager

Hi Tatsu,

Thank you for your question. If you need any additional information from Intel, please submit a new question as this thread is no longer being monitored.