Intel® Distribution of OpenVINO™ Toolkit
Community assistance about the Intel® Distribution of OpenVINO™ toolkit, OpenCV, and all things computer vision-related on Intel® platforms.

Using the Inference Engine with a mixed-precision model

svutran
New Contributor I

Hi,

 

We have quantized our trained model such that it has both INT8 and FP16 weights.

 

At inference time, we are using OpenVINO's Inference Engine to load the model from memory (not from a file path) using the following method:

CNNNetwork ReadNetwork(const std::string& model, const Blob::CPtr& weights) const;

 

Now, this method requires creating a tensor descriptor (InferenceEngine::TensorDesc) that describes our weights, which we then pass to the InferenceEngine::make_shared_blob method. The problem here is that make_shared_blob does not support InferenceEngine::Precision::MIXED.

Here is the error message that is thrown when executing the program:

Cannot make shared blob! The blob type cannot be used to store objects of current precision

 

So, how should we proceed in order to read a network where the weights are mixed-precision?

 

Here is a code snippet of how we are loading the model. Note that poBin refers to the binary content of the weights in memory.

        // Describe the in-memory .bin content; Precision::MIXED here is what triggers the error.
        InferenceEngine::TensorDesc oTensor(InferenceEngine::Precision::MIXED, oWeightsContentSize, InferenceEngine::Layout::ANY);
        auto oWeightBlob = InferenceEngine::make_shared_blob(oTensor, poBin[i], uiBinSize[i]);

        // Read the IR from memory: the XML text plus the weights blob.
        InferenceEngine::CNNNetwork* poNetwork = new InferenceEngine::CNNNetwork();
        *poNetwork = oCore.ReadNetwork(std::string(reinterpret_cast<const char *>(poXml[i]),
                reinterpret_cast<const char *>(poXml[i] + uiXmlSize[i])), oWeightBlob);
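
One alternative we are considering is to describe the .bin content as a flat array of bytes rather than as MIXED. The following is only a minimal sketch, assuming ReadNetwork treats the weights blob as a raw byte buffer and does not need the per-tensor precisions in the descriptor; the helper name ReadNetworkFromMemory and its parameters are illustrative, not from our code base.

    #include <inference_engine.hpp>
    #include <cstddef>
    #include <cstdint>
    #include <string>

    // Hypothetical helper (illustrative names): wraps an in-memory IR into a CNNNetwork.
    // Assumption: the weights blob passed to ReadNetwork() can be described as a flat
    // array of bytes, regardless of the precisions stored inside the .bin content.
    InferenceEngine::CNNNetwork ReadNetworkFromMemory(InferenceEngine::Core& oCore,
                                                      const char* pXml, std::size_t uiXmlSize,
                                                      std::uint8_t* pBin, std::size_t uiBinSize)
    {
        // Describe the .bin content as a one-dimensional U8 tensor of uiBinSize bytes.
        InferenceEngine::TensorDesc oBinDesc(InferenceEngine::Precision::U8,
                                             {uiBinSize},
                                             InferenceEngine::Layout::C);

        // Wrap the existing buffer without copying it; the buffer must outlive the blob.
        auto oWeightBlob = InferenceEngine::make_shared_blob<std::uint8_t>(oBinDesc, pBin, uiBinSize);

        return oCore.ReadNetwork(std::string(pXml, pXml + uiXmlSize), oWeightBlob);
    }

The idea would be that the .bin file is just a byte stream whose layout the XML already describes, so the blob's precision would only need to reflect the storage, not the individual tensors.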
 

Thank you in advance

Zulkifli_Intel
Moderator

Hello Svutran.

Thank you for reaching out to us.

 

Please share your mixed-precision model, your script, and any relevant information with us for further investigation.

 

Also, which OpenVINO version did you use to run the model?

 

Sincerely,

Zulkifli  


svutran
New Contributor I

Sorry for the late reply, but putting my code here would be a little complicated.

 

Actually, my question is: do we need to set the input and output precisions with a call like the following if we are working in FP16, or in a mixed precision that includes FP16?

InferenceEngine::InputsDataMap inputs_info = network.getInputsInfo();

for (auto& input : inputs_info)
    input.second->setPrecision(InferenceEngine::Precision::FP16);

 

Thanks

Zulkifli_Intel
Moderator

Hello Svutran,

 

If you are working with FP16 or mixed precision, in particular an INT8-FP16 mixed-precision network, the input precision can be set on each entry of the inputs map as below:

for (auto& input : network.getInputsInfo())
    input.second->setPrecision(InferenceEngine::Precision::FP16);
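
Since you also asked about outputs, here is a minimal sketch of the complete pattern, assuming the network was read with Core::ReadNetwork; the device name "GPU" is only an example, and FP16 I/O support depends on the target plugin.

    // Set FP16 on every input of the network.
    InferenceEngine::InputsDataMap inputs_info = network.getInputsInfo();
    for (auto& input : inputs_info)
        input.second->setPrecision(InferenceEngine::Precision::FP16);

    // Set FP16 on every output of the network.
    InferenceEngine::OutputsDataMap outputs_info = network.getOutputsInfo();
    for (auto& output : outputs_info)
        output.second->setPrecision(InferenceEngine::Precision::FP16);

    // Device name is only an example; choose the plugin that matches your target.
    InferenceEngine::ExecutableNetwork exec_network = core.LoadNetwork(network, "GPU");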

 

Sincerely,

Zulkifli 


Zulkifli_Intel
Moderator

Hello Svutran,


Thank you for your question. If you need any additional information from Intel, please submit a new question as this thread is no longer being monitored.


Sincerely,

Zulkifli

