Intel® Distribution of OpenVINO™ Toolkit
Community support and discussions about the Intel® Distribution of OpenVINO™ toolkit, OpenCV, and all things computer vision-related on Intel® platforms.

OpenVINO POT model runs slower on GPU

HesamSH
New Contributor I

I have two models: vehicle-detection-0200 and a custom plate-finder model converted to IR format. When I quantize these models with the POT tool and set the target device to "GPU", the quantized model runs slower on GPU!

Also, when I set the target device to "MYRIAD", I get this error:

RuntimeError: [ GENERAL_ERROR ]
/home/jenkins/agent/workspace/private-ci/ie/build-linux-ubuntu18/b/repos/openvino/inference-engine/src/vpu/graph_transformer/src/frontend/frontend.cpp:441 Failed to compile layer "StatefulPartitionedCall/model_1/conv2d_1/Conv2D/fq_input_0": unsupported layer type "FakeQuantize"

 

However, when I run the POT model on CPU, it does run faster.

What is the cause of these problems?

I have the POT config file attached.

3 Replies
Peh_Intel
Moderator

Hi Hesam,


Thanks for reaching out to us and sharing your POT config file.


According to the Supported Layers documentation, the FakeQuantize layer is supported only by the CPU plugin, so failing to run the optimized model on the MYRIAD plugin is expected. However, I was able to run the optimized model on the GPU plugin, although it ran slower than the FP16 model.
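To check whether a given IR still contains FakeQuantize layers before loading it on a particular plugin, something like the following may help. It is only a minimal sketch: it assumes the usual IR .xml layout in which each `<layer>` element carries `name` and `type` attributes, and the sample XML and the helper name `find_fake_quantize_layers` are made up for illustration.

```python
import xml.etree.ElementTree as ET


def find_fake_quantize_layers(ir_xml_text):
    """Return the names of all layers whose type is FakeQuantize."""
    root = ET.fromstring(ir_xml_text)
    return [
        layer.get("name")
        for layer in root.iter("layer")
        if layer.get("type") == "FakeQuantize"
    ]


# Toy IR fragment (hypothetical); a real model would be read from its .xml file.
SAMPLE_IR = """<net name="toy" version="10">
  <layers>
    <layer id="0" name="input" type="Parameter"/>
    <layer id="1" name="conv2d_1/fq_input_0" type="FakeQuantize"/>
    <layer id="2" name="conv2d_1" type="Convolution"/>
  </layers>
</net>"""

fq_layers = find_fake_quantize_layers(SAMPLE_IR)
print(fq_layers)
```

For a real model, pass the contents of the quantized .xml file instead of `SAMPLE_IR`. A non-empty result means the model includes layers of the same type that the MYRIAD error message complains about.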


In addition, I optimized an FP16 vehicle-detection-0200 model using the config file with the target device set to “ANY”, “CPU”, and “GPU” in turn, and measured the performance of the three resulting INT8 models with Benchmark App. Surprisingly, these models show no significant difference in performance when running inference on the CPU and GPU plugins.
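A comparison like this can be scripted. The sketch below only builds the Benchmark App command lines for every model/device pair and does not run them. It assumes `benchmark_app` is available on the PATH and uses its `-m` (model), `-d` (device), and `-t` (duration in seconds) options; the model paths and the helper name `benchmark_commands` are placeholders, not the files used above.

```python
from itertools import product


def benchmark_commands(model_paths, devices, seconds=30):
    """Build one benchmark_app command per (model, device) combination."""
    return [
        ["benchmark_app", "-m", model, "-d", device, "-t", str(seconds)]
        for model, device in product(model_paths, devices)
    ]


# Hypothetical paths to the original FP16 IR and the POT-quantized INT8 IR.
models = [
    "FP16/vehicle-detection-0200.xml",
    "INT8/vehicle-detection-0200.xml",
]
commands = benchmark_commands(models, ["CPU", "GPU"])

for cmd in commands:
    print(" ".join(cmd))
```

On a machine with OpenVINO installed, each command could be passed to `subprocess.run` and the reported throughput compared across the four runs.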


As such, I will highlight this issue with our development team and get back to you at the earliest.



Regards,

Peh


Peh_Intel
Moderator

Hi Hesam,


Our development team has confirmed that this hardware (GPU) isn't optimized for quantized models, so these results are expected.

 

Currently, POT models are tested and optimized for CPU plugin only.


For GPU, we recommend you use FP16 models for better performance.

 

 

Regards,

Peh


Peh_Intel
Moderator

Hi Hesam,


This thread will no longer be monitored since we have provided answers and suggestions. If you need any additional information from Intel, please submit a new question.



Regards,

Peh

